Data analysis in Australian sport is dominated by the AFL. Liam, who goes by @pythagoNRL on Twitter and his site, is one of the leaders in rugby league analytics, looking at the NRL through a data-driven lens. It was great getting to know more about his journey. Hope you enjoy the read! If you enjoy this interview, you’ll likely also like our chat with carlosthedwarf.
Tell us about your background
I’m an Electrical Engineer at my day job. It’s not particularly interesting but it pays the bills and has given me a basic skill set to work from. When there are lulls, I can cover the fact that I’m not working by punching my way through a spreadsheet of football stats.
I don’t have a formal background in statistics or much experience in sports. I’ve followed rugby league most of my life, with a big gap in the middle, and try to make time to watch baseball, motor sport and cycling.
How did you get into rugby league analytics?
I was scratching around for a project that involved some writing in late 2016 and saw The Arc, where they were doing Elo ratings for AFL. I checked to see if anything similar existed for rugby league and found there wasn’t anything great. So I started my own thing. Since then, my own curiosity has gotten the better of me and led to me expanding into all sorts of areas that I had no idea about.
Which sports do you analyse and what types of problems have you looked into?
I stick to rugby league because I like it the most and it’s the most underdone. The most popular pieces I’ve written looked at whether to take the two or go for a try when down by eight in the closing stages of a match, the Broncos’ deficiencies in 2019, and the performance of clubs plotted against the tenure of their head coach.
What are your favourite resources to tap into for help with rugby league analytics?
I started by copying Matt Cowgill. I spent a lot of time trying to get to grips with the frameworks for baseball analytics, and I borrow occasional stuff from NFL, AFL and soccer. Tom Tango (baseball, hockey), FanGraphs (baseball), Michael Lopez (US football) and HPN (AFL) are good to follow in this space. Jon Bois’ (SB Nation) out-there approach is also useful if you need to get creative. Or just check out ‘Moneyball’.
What tools and modelling or analysis methods do you use for sports data analysis?
I spend a lot of time in Excel because Google Sheets isn’t up to what I’m doing and I don’t have the skills to use anything more efficient to plough through the volume of data I’ve been working with. Usually plotting two things against each other or calculating the coefficient of correlation is enough.
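For readers who want to try the correlation step outside a spreadsheet, here is a minimal Python sketch. It is not Liam's workflow, which he describes as Excel-based. The function name pearson_r and the numbers are made up for illustration and are not real NRL data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the means (unnormalised covariance)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical season totals for five clubs (illustrative only)
points_diff = [120, 45, -10, -80, 200]
wins = [16, 13, 11, 8, 19]
r = pearson_r(points_diff, wins)
print(f"correlation between points difference and wins: {r:.3f}")
```

A value near 1 or -1 suggests a strong linear relationship, while a value near 0 suggests little linear relationship. It says nothing about cause.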
Thoughts on the Australian Sport-Tech environment? Where do you think we could grow or learn from other markets?
I’m not too deep into that space, so it’s hard to say, but my gut tells me there’s a long way to go with accepting that maths can be useful to jocks. There’s also a real lack of cohesion across the community that prevents us from building on, or critiquing, each other’s approaches. Part of that is that a lot of the work is proprietary and part of it is poor communication. Or it could just be me? I’ve been trying to publish my numbers so others can build on them if they so desire.
What advice do you have for the couch fan who wants to start playing around with some data?
The tricky bit is getting the data in the first place. You can do a lot just with score lines, if you have enough of them, but getting into player data is laborious and getting play-by-play is nearly impossible for normal people.
After that, automate as much as possible, think critically about what the data says and how you’re using it, and do what interests you. It’s likely that others will be interested too.
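As one illustration of what score lines alone can support, here is a minimal sketch of an Elo-style rating update in Python, the kind of system mentioned earlier in the interview. The starting rating of 1500, the K-factor of 20, the team names and the scores are all assumptions for illustration. This is the textbook Elo formula, not necessarily the one used on Liam's site.

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expected result for team A against team B (0 to 1)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(ratings, home, away, home_pts, away_pts, k=20.0, start=1500.0):
    """Update ratings in place from one score line and return the dict.

    k and start are assumed values; real systems tune them and often add
    home-ground advantage or margin-of-victory adjustments.
    """
    r_home = ratings.get(home, start)
    r_away = ratings.get(away, start)
    exp_home = expected_score(r_home, r_away)
    # Actual result from the home side's view: win = 1, draw = 0.5, loss = 0
    if home_pts > away_pts:
        actual = 1.0
    elif home_pts < away_pts:
        actual = 0.0
    else:
        actual = 0.5
    delta = k * (actual - exp_home)
    ratings[home] = r_home + delta
    ratings[away] = r_away - delta
    return ratings


# Hypothetical score lines: (home, away, home points, away points)
results = [
    ("Team A", "Team B", 24, 12),
    ("Team B", "Team C", 18, 18),
    ("Team C", "Team A", 10, 30),
]
ratings = {}
for home, away, home_pts, away_pts in results:
    update_elo(ratings, home, away, home_pts, away_pts)

for team, rating in sorted(ratings.items(), key=lambda kv: -kv[1]):
    print(f"{team}: {rating:.1f}")
```

Each update moves the two teams by equal and opposite amounts, so the total rating across the competition stays constant.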
If you had to pick one sporting code or team and solve a problem for them through data analysis, who would it be and what would the problem be?
My focus before the shutdown was rugby league player ratings. I’ve done a first pass on a Wins Above Replacement-style metric for rugby league, and I’ve been trying to project future performances based on past data and work out how that affects teams. The next step would be to look at how those projections work across competitions as players move between first grade, reserve grade and under 20s. I would love to have more data and more time to really thrash that out.