We catch up with passionate rugby league analyst ‘CTO’ to talk all things rugby league and data! His unique approach to analysing rugby league is an engaging story for tech-savvy fans and purists alike! Check out his site for more detailed analytical and tactical rugby league content.
Tell us about your background
I have a slightly different background to a lot of people who end up in sports analysis. I am a Pharmacist who graduated with a Bachelor of Pharmacy from James Cook University in 2014, living and working in Townsville. I started out as a Community Pharmacist but have since branched into the managerial side of the Pharmacy business. While health science and sport are structurally unrelated, I think it has been advantageous to have a different background and to approach the NRL with methods from my field, like epidemiological models, that are not often adapted to sport.
How and when did you get into data analysis in sport?
Like a lot of people on the non-professional side, it was through the realm of gambling. As a sport-loving college student with basically no money, the idea of being able to make some extra drinking money through sports betting was always attractive. After learning statistics and epidemiology at university I immediately tried to adapt and apply it to gambling in 2014, with limited success. Funnily enough, I never really made any inroads until I was working and didn’t really need the money. I think it took that long to get a proper grasp of what I was dealing with and to find methods and models that were successful. In 2017 I launched into statistical analysis specifically with the NRL, and for the last 3 years I have diversified and continued to quantify teams and players with betting applications.
We’ve finally got some live sport to get excited about. What are some of your data driven predictions for the first few rounds?
Parramatta will regress, likely significantly, from their current standing. They have two key numbers which are unsustainable: they’re currently sitting at 60% possession across their first 3 games and have run for nearly 2000 metres per game. Neither number will continue for a full season.
The Knights will play finals. They have sustainable numbers across their first three rounds and have produced some fantastic results. They’re currently 3rd in Net Rating at 12.66 points per game, and their opponents so far (NZW, WST, PEN) have all shown themselves to be adequate to quality teams in the other games they’ve played. Factoring in strength of schedule, Newcastle are currently first in Schedule Adjusted Rating across the first 3 rounds. Definitely a team to watch moving forward.
Reading your tactical preview of the NRL Grand Final in 2019 you raise some really interesting points. What are some key metrics you think are underrated in the coverage of rugby league?
Defence. It’s always fascinated me how we view the NRL (and other sports) through an attacking lens first and foremost. When you think about the best players in the game your mind always drifts towards the more enigmatic attacking players like James Tedesco, Cameron Smith, Damien Cook, Kalyn Ponga etc. and not those who succeed defensively. This manifests in statistical analysis as well; coaches, analysts and commentators always highlight attacking statistics like run metres or try assists. For the most part, defensive commentary is rarely statistically based outside of antiquated numbers like tackles made and tackles missed.
Quantifying defence is more complex, but it also has the most value. It’s no secret that from a success standpoint defence is arguably more important. Since 2006, 12 of the 14 premiers have had a top-2 defence, and since 1998 the average defensive rank of the eventual premiers is 2.7 compared to 3.1 for attacking rank. Some of the more successful metrics I use are primarily defensively based, combining under-utilised statistics like one-on-one tackles and penalties conceded to help provide a complete view of a player’s defensive value.
What tools and modelling/analysis methods do you use?
As I mentioned before, most of my models are grounded in epidemiology or the statistical probability analysis often used in health. Most of what I do isn’t complicated, and I don’t have a background in big data analysis or complex coding like some others. But I do have experience in using small population samples to identify patterns and variables, as you would when studying risk factors for a disease or assessing the potential efficacy and safety of a medication from a clinical trial.
In sport you are doing the same. You’re not dealing with big datasets; you’re actually dealing with incredibly small ones and using that incomplete data to project forwards. Take this season for example: we’ve only had 3 games of sampling so far, and for most teams only 24 games of sampling last year, which is itself flawed data given the changes teams may have undergone from last year to this year. It’s an incredibly small sample, so a lot of what I do is based around sustainability and regression: identifying the outlying numbers (both good and bad) that will be present in such small samples, and their expected regression as the sample widens.
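To make the idea of regression in small samples concrete, here is a minimal sketch in Python. It is not CTO’s actual model: it assumes a simple shrinkage toward the league average, and the function name and the `prior_games` weight of 12 are hypothetical choices for illustration. The only real inputs are the 60% possession over 3 games mentioned earlier and the fact that league-wide possession averages 50%.

```python
def regressed_estimate(observed_mean, games_played, league_mean, prior_games=12):
    """Shrink an observed per-game figure toward the league average.

    prior_games is a hypothetical tuning value: the number of games at
    which the observed figure and the league average get equal weight.
    """
    weight = games_played / (games_played + prior_games)
    return weight * observed_mean + (1 - weight) * league_mean


# 60% possession across 3 games, against a league average of 50%
estimate = regressed_estimate(60.0, 3, 50.0)
print(round(estimate, 1))
```

With only 3 games played, the observed figure gets a small weight, so the estimate sits much closer to 50% than to 60%. As more games are played, the weight shifts toward what the team has actually done.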
Something else I’ve found is often a flaw in sports analysis is the over-reliance on mean (average) numbers, which are unreliable in such small sets. Means can be heavily influenced by outlying numbers and then have little value; what we really want to know is a team’s or player’s expected performance. Using simple regression techniques and normal distribution plots can help identify a team’s or player’s expected performance week to week.
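A quick way to see how one outlier distorts a small sample is to compare the mean with the median. The sketch below uses made-up points margins for a hypothetical team (four close games and one blowout); the `summarise` helper is illustrative only and is not one of CTO’s metrics.

```python
from statistics import mean, median


def summarise(values):
    """Return the mean and median of a small sample."""
    return {"mean": mean(values), "median": median(values)}


# Hypothetical points margins: four close games and one 40-point blowout
margins = [4.0, 6.0, -2.0, 8.0, 40.0]
summary = summarise(margins)
print(summary["mean"], summary["median"])
```

The single blowout drags the mean well above what the team did in its other four games, while the median stays close to a typical result.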
What advice do you have for the couch fan who wants to start playing around with some data?
Test your ideas. If you already know how to use Excel or a spreadsheet you’re most of the way there. That’s the cool thing about data: it gives you the ability to test any theories you have about the game, and if you keep going you’ll stumble into some gems.
Who do you support and how do you think they’ll go in 2020?
I support the Cowboys. It was pretty grim for the first 20 years of my life but it’s had some moments lately. They’re in a better position than they have been the last couple of years. They have a strong mobile middle which is better suited to the current game, and while they’re integrating new players in key positions (again) they’ve got some potential. With their squad they should be in finals contention, but their attack needs to veer away from the rigorous block-plays which are the hallmark of Paul Green’s coaching.
If you had to pick one sporting code or team and solve a problem for them, who would it be and what would the problem be?
The Gold Coast Titans, and the problem is basically everything. If they were a house, they’d be a renovator’s dream. I think the appeal of that market is really enticing if they were able to achieve success and given their low expectations you could experiment and think differently to a lot of other teams – functionally you can build them from scratch.
Ideally I would try and use my understanding and quantification of defence to recruit undervalued defensive players and see if the team can build a defensive foundation moving forward to allow them to compete. Easier said than done, but I’d love to try.