Post Season Wrap – 2018

I haven’t posted much here during the AFL season, mainly because, with the updated systems bedded down, things ran as normal and there wasn’t much to discuss about the back-end. But now that the season has wrapped up and the dust has settled, it is of course time to reappraise and reset priorities.

The Finals

Before I get stuck into the nerdage, it’s worth looking at how the actual footy turned out through the finals. The lead-up to the grand final was a bit anticlimactic, with both preliminary finals virtually sealed by half-time; Collingwood’s win was possibly the more shocking, as they were seen as slight underdogs going up against Richmond. That landed us with a grand final between West Coast and Collingwood, two clubs that both have, how shall we say, polarising fandoms.

Without my own team in the fight, I usually lean towards the team that has gone longer without a flag, although with West Coast last saluting in 2006 and Collingwood in 2010, there wasn’t much of a gap. So basically my view was that, with the Melbourne fairytale snuffed out with napalm after a fairly lacklustre finals series, the best we could hope for was a good close grand final. It delivered on that in spades.

It didn’t look like being close early in the first quarter, of course, with the Pies kicking the first five goals in a performance very reminiscent of those prelims, but West Coast eked out two before the first change, and after that it was a contest as the Eagles ground away at Collingwood’s early advantage. Sheed threading that final goal from the flank, with Pies supporters hooting at him, will go down as one of those great clutch acts that decides a premiership.

Nerdage

Model-wise, GRAFT had an OK season compared with the other models on the Squiggle Models Leaderboard. It was doing pretty well early in the season (particularly in BITS) but fell adrift later on, finishing with 142 tips, a significant gap to the leaders on 147 and 146. With most models hovering around 70%, it was a more predictable season than 2017, yet there were still plenty of interesting results, with up to twelve teams in contention for finals spots until the last round or two of the home-and-away season.

While GRAFT has remained essentially unchanged in principle, even back in its single-factor days as RAFT, and I want to keep its simplicity, certain events do come along that make me think about spinning off a hybrid system to deal with particular instances. That’s right, I’m looking at Geelong bullying away in their last two home-and-away games, stealing top spot in the GRAFT ratings, and then punking out in the elimination final.

Looking at the ladder it’s easy to figure out what happened – Geelong only got 13 wins, which in the 18-team era is just sufficient to be considered a fringe finalist (as they were), but they did so with a fairly healthy percentage of 131%. That marks a Pythagorean anomaly (a what?) of -3 wins. Not as large as Brisbane’s -3.8 (that is, Brisbane won 5 games but on percentage should have won about 9), but nevertheless.
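For reference, the anomaly figure is just actual wins minus Pythagorean expected wins. A minimal sketch, assuming the common points-based Pythagorean formula with an exponent of 4 – the exponent is my own ballpark, not anything GRAFT uses:

def pythagorean_wins(points_for, points_against, games=22, k=4.0):
    # expected wins implied by the for/against ratio (the ladder percentage)
    share = points_for ** k / (points_for ** k + points_against ** k)
    return games * share

# Geelong 2018: 13 actual wins on a percentage of 131
print(round(13 - pythagorean_wins(131.0, 100.0), 1))  # ~ -3.4 with k=4

A slightly softer exponent lands right on the -3 quoted above; the exact figure moves with the exponent choice.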

All of which draws attention to GRAFT’s main weakness: it doesn’t give any credence to wins and losses. It only cares about scores, and when a team runs the score up, as Geelong did against Fremantle (by 133) and Gold Coast (by 102), what is the actual difference between thrashing a team by twenty goals instead of ten? Anyway. That will be part of my homework for the off-season – not really an off-season, as I am about to detail.

Offseason Training

As far as the AFL-specific work goes, while the gamma probability model worked really well, there are computational issues with working out the margin likelihoods (and therefore the win probabilities). The model is based on two curves for each team’s potential score, and while the equations for those curves are well-defined, the difference of the two curves is not. For each game I have to resort to brute force: for instance, to predict the likelihood of a team winning by 30 points, I have to sum the probabilities of 90-60, 91-61, 92-62, and so on.
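To make that concrete, here’s roughly what one brute-force pass looks like – a sketch using the gamma-from-par mapping from the “Towards A New Model” post below, treating the pdf at one-point spacing as a score probability:

import numpy as np
import scipy.stats

def curve(rating):
    # gamma score curve from a GRAFT par (shape from the par, scale 7.5)
    return scipy.stats.gamma(rating * .1 + 3, 0, 7.5)

def margin_prob(par_a, par_b, margin, max_score=250):
    # P(team A wins by `margin`): sum P(A = s) * P(B = s - margin) over s
    scores = np.arange(0, max_score + 1)
    pa = curve(par_a).pdf(scores)
    pb = curve(par_b).pdf(scores - margin)  # pdf is zero below a score of 0
    return float((pa * pb).sum())

print(margin_prob(90.8, 58.8, 30))  # likelihood of the favourite winning by 30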

It works fine once it’s on the front end, but it seems to me that I should be able to figure out an actual equation for the difference curve and refer directly to that for the margin probabilities, thereby saving the computer a lot of crank when I update the tables. So basically: getting out my old calculus and statistics texts and trying to relearn everything I didn’t pay sufficient attention to. (Or just getting Wolfram Alpha to do it, although I still have to figure out the principles first.)
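I haven’t cracked the closed form (the difference of two gammas doesn’t reduce to anything friendly as far as I can tell), but in the meantime one way to save most of the crank is to discretise both curves once and convolve them, which yields every margin probability in a single pass rather than one brute-force sum per margin. A sketch:

import numpy as np
import scipy.stats

def curve(rating):
    return scipy.stats.gamma(rating * .1 + 3, 0, 7.5)

def margin_distribution(par_a, par_b, max_score=250):
    scores = np.arange(0, max_score + 1)
    pa = curve(par_a).pdf(scores)
    pb = curve(par_b).pdf(scores)
    # convolving A with reversed B gives the distribution of A - B;
    # entry i of the result corresponds to a margin of i - max_score
    margins = np.arange(-max_score, max_score + 1)
    return margins, np.convolve(pa, pb[::-1])

margins, probs = margin_distribution(90.8, 58.8)
print(probs[margins == 30][0])  # matches the brute-force sum above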

Along with all of that, showing the actual probability curves on the site is on the list of things to do – a lot of concepts for how these will look have been sketched out, but I am not satisfied with them just yet, particularly as the risk of misinterpretation is real. The current match tables are basically a soup of numbers, and I will do my usual overhaul of the website to try and make them more comprehensible.

Also, I don’t have any historical tables up here apart from the archives of previous seasons’ sites, so that is also on the agenda. I do think it would be good to make that data available, with historical graphs and records so you can compare clubs across seasons. We’ll see how that goes. I am thinking I will probably use 1987 as the starting point for the public tables, as that was when West Coast and the Brisbane Bears entered the competition.

Meanwhile, there are also the summer sports to look at. I am in the second year of basic Elo tracking of the A-League and intend to do the same for the W-League. I am cutting it fine with the NBL, as its season starts this week; it might actually be a good place to start developing that hybrid system, although for now I am just going to stick with Elo tracking there as well, and obviously for the WNBL too.

As far as cricket goes, specifically the Big Bash League, that is something that could happen. I have a month or so to put something in place, which will have to include an analysis of previous results and all that, so it’s maybe a 50/50 chance at this point.

This will necessarily involve further modularising the code, and if that falls into place, in the new year I can look at certain other competitions that have up to this point escaped notice. News on those further developments to come.

Towards A New Model – Part 3

In previous posts on this matter, I settled on the gamma distribution as the model, as it is a continuous analogue of the Poisson distribution and provides a curve that reflects actual real-world scores.

The next little challenge was to calibrate it against the “par” scores that GRAFT generates. The par scores typically range from 50 to 120, a little narrower than the distribution of actual scores. There were a few more quirks that came in when I was trying to make it all fit. Basically, I took 6-point-wide slices of each prediction set and tried to fit a gamma curve to the actual scores that fell within each slice. Fortunately, at least for the 60–120 par range, there was a reasonably consistent look to the curves, as I will show. I confined the set of historical scores to the seasons from 1987 to 2017.

(The black curve is the fit against all scores)

A little bit haywire at the extremities, but not too bad for the main spread. I looked at the parameters of each fitted distribution, particularly the shape parameter a, to come up with something plausible.
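The slicing and fitting itself is nothing exotic. Roughly the idea, as a sketch – the array names here are illustrative placeholders, not my actual pipeline:

import scipy.stats

def fit_slices(pars, scores, width=6, lo=60, hi=120):
    # pars: the GRAFT par for each team-game; scores: the actual score that resulted
    fits = {}
    for left in range(lo, hi, width):
        mask = (pars >= left) & (pars < left + width)
        if mask.sum() < 30:
            continue  # too few games in this slice to fit anything sensible
        # pin the location at zero so the curve starts at a score of 0
        a, loc, scale = scipy.stats.gamma.fit(scores[mask], floc=0)
        fits[left] = (a, scale)
    return fits

Plotting the fitted shape parameter a against the midpoint of each slice is what suggested the simple linear function below.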

After a bit of eyeballing to find a sensible function for a, I’ve settled on this magic function, with 7.5 as the scale:

import scipy.stats

def curve(rating):
    # shape grows linearly with the GRAFT par; location 0, scale 7.5
    a = rating * .1 + 3
    return scipy.stats.gamma(a, 0, 7.5)

(This is Python code of course; you’ll want the scipy library to play with it. If you use R, I presume you’ll be able to figure out how to implement it.)

So in the end we have this bad boy:

And overlaid on the actual result fits…

So that’s not too bad a fit, at least for our purposes.

A few quirks, though: the mean and median of each curve are actually regressed towards the global mean compared with the GRAFT par that was spat out. I have a feeling this reflects the spread of scores at each GRAFT tick, where a lot of scores come in slightly under average, balanced out by the long tail when teams go on the march and score 20 goals or more.

Well, let’s just say that I haven’t put too much rigour into it before this season. There are still a few things to mess around with; for instance, I set up the curves for each team’s score completely independently, so there isn’t any covariance taken into account at this stage. What happens when two defence-oriented teams face off, versus two attacking teams? Well, that’s already partly accounted for at the par-setting stage (team.attack – opposition.defense + league_par).

I’ve gotten as far as cooking up some rough examples for the opener on Thursday night:

From the main site, Richmond’s par is 90.8, Carlton’s par is 58.8.
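Feeding those pars through curve() above and sampling gives the headline numbers – a quick Monte Carlo sketch (the sample size is arbitrary):

import numpy as np
import scipy.stats

def curve(rating):
    # as defined earlier in the post
    return scipy.stats.gamma(rating * .1 + 3, 0, 7.5)

rng = np.random.default_rng(1)
richmond = curve(90.8).rvs(size=1_000_000, random_state=rng)
carlton = curve(58.8).rvs(size=1_000_000, random_state=rng)
margin = richmond - carlton
print((margin > 0).mean())  # ~0.76
print(margin.mean())        # ~24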

Based on all of the above, Richmond’s likelihood of winning is 76%, with a mean winning margin of 24. Well, there’s our first bit of weirdness: GRAFT states that Richmond are better by 32 points, so what’s happening there?

The same sort of thing is happening with the base scores. Again, the idea of GRAFT is that each par score represents what each team should score against the other, given past performances, with the weekly adjustments carried out by a factor calibrated to maximise the number of correct tips. It’s a little fuzzier on margins, even though margins are at the core of the result.

Richmond’s par (rounded off) is 90.8, but the mean of their curve is 90.6 (OK, not too crazy) and the median is 88.1.

As for Carlton, with a par of 58.8, their mean is 66.6 and their median is 64.1.
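Both sets of figures drop straight out of the frozen distributions, for anyone playing along at home:

from scipy.stats import gamma

for par in (90.8, 58.8):
    c = gamma(par * .1 + 3, 0, 7.5)  # curve() from above
    print(par, round(c.mean(), 1), round(c.median(), 1))
# 90.8 -> 90.6, 88.1
# 58.8 -> 66.6, 64.1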

We’re fine with the median being a little skewiff – it’s not a symmetrical distribution, and the elongated tail is consistent with real scores. (That’s why I picked the gamma distribution, after all: it was such a good fit for the historical record.)

 

Very early drafts of how the graphs might look on the site. I have coded these at a low level in Pillow because I am a masochist. Also because matplotlib plots like the ones earlier in the post look like matplotlib plots.

So obviously there’s regression to the global mean happening each week. Thing is, the weekly ratings are updated according to the actual scores of each week’s results, via factors which boil down to *Hulk voice* “big better than small”. While I’ve tweaked how the ratings are presented this year (dividing them by 10 and making them floating-point numbers), that part has essentially not changed.

What I was dissatisfied with was how the probabilities of individual games and the season as a whole were calculated. The old practice used a normal distribution, which would not tail off nicely at zero, requiring a horrible fudge of jacking up both teams’ scores. I am glad to be done with that nonsense.

With this new algorithm, I am at the point where I am happy to publish the probabilities, intending to roll them out by Wednesday (boy this blog entry is going to become dated fast), but at the same time they probably need a little more rigour applied, so caution is advised if you’re going to take heed of them for certain activities. Entertainment purposes only!

Another thing that will happen on the site is that I will be producing CSV files of the tips (with lines and likelihoods) and ratings for people to scrape if they like – a little more detail on the formats and such in the next post.

Offseason Training

What’s been happening lately at GRAFT?

Of course, I’ve set up the A-League site and have been incrementally adding new things, most recently putting together an SVG graph of the weekly fluctuations in the Elo ratings.

As well, I’ve made some steps towards a decent prediction model. As you’d imagine, “football” modelling is very mature and there is a great deal of literature on it, but you know, it doesn’t hurt to devise a system from first principles, even if you find out you’ve only reinvented the wheel at the end of it.

Elo is particularly good at giving you win/loss probabilities out of the box, but of course there’s the whole issue of draws to account for. On the whole draws seem to eventuate 25% of the time, which is a nice round figure.

This is one of the things I hope to deal with as I want to incorporate attacking/defensive ratings into the mix, in an attempt to create more plausible projections.

My theory is that a contest between two teams with attacking tendencies is less likely to result in a draw than one between two more defensive teams. The reasoning should be fairly intuitive: if neither club gives a shit about stopping balls flying into nets, it’s less likely in such a shootout that the teams will finish on the same score.

As well, two evenly matched teams would be more likely to play out a draw than in a match where one team is of a much higher quality than the other, even when taking bus parking arrangements into consideration.
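Both intuitions are easy to put numbers on under a simple independent-Poisson view of goal scoring – my working assumption here, with nothing fitted to real data yet:

from scipy.stats import poisson

def p_draw(lam_home, lam_away, max_goals=15):
    # P(draw) = sum over k of P(home scores k) * P(away scores k)
    return sum(poisson.pmf(k, lam_home) * poisson.pmf(k, lam_away)
               for k in range(max_goals + 1))

print(round(p_draw(1.0, 1.0), 2))  # even and defensive: ~0.31
print(round(p_draw(2.0, 2.0), 2))  # even but attacking: ~0.21
print(round(p_draw(2.5, 1.0), 2))  # mismatched: ~0.17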

Anyway, that’ll take a bit of nutting out, but in the end I want to be able to show a par score for each team prior to each game (much like I do with the AFL), although it’d look something like MVC 1.2 v SYD 2.1, maybe with the result probabilities alongside. The numbers would correspond to the Poisson distributions of how many goals each team might actually score. Once you get to that point, doing Monte Carlo predictions on that basis becomes pretty simple.
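To illustrate with that made-up MVC 1.2 v SYD 2.1 line, the full result probabilities fall out of a sweep over the same score grid:

from scipy.stats import poisson

lam_mvc, lam_syd = 1.2, 2.1
home, draw, away = 0.0, 0.0, 0.0
for h in range(13):
    for a in range(13):
        p = poisson.pmf(h, lam_mvc) * poisson.pmf(a, lam_syd)
        if h > a:
            home += p
        elif h == a:
            draw += p
        else:
            away += p
print(round(home, 2), round(draw, 2), round(away, 2))  # ~0.21 0.21 0.58

From there, a Monte Carlo season projection is just a matter of sampling Poisson variates with each fixture’s pars and re-tallying the table.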

Among other moves for next year:

Revamp the site for the AFL/AFLW seasons in the new year. I mean, yes, it worked out for 2017, but I’m kind of bad about leaving things alone. Besides the whole “rating footy teams” part, I’m into this thing as much for developing new visualisations and designs. Basically I’m experimenting on all that in public. Some people have weird hobbies and this is mine.

Something else I will try to do is provide CSV output of my tips and ratings in a set format, so it’s easier for others to scrape and utilise, and so I don’t break everything whenever I mess around with the “look and feel”.

I consider the basic GRAFT system to be pretty much settled, for all its faults and limitations. That’s tied up with the basic philosophy that it takes into account only results and venues, however. It is what it is.

Which is not to say that I won’t have new things on the go, including other systems based off it or on completely different principles, but at the end of several years of development and refinement, the core is done. I do want to publish the basic algorithm (it really is absurdly simple) and some associated tools for others to examine and rip to shreds, but I am a horrible coder so there’s a bit of cleaning up to do before that happens.

Having said that, the main thrust of development is the projection system. I intend to overhaul the system that I use to work out my probabilities and eliminate some of the more egregious fudges. I’ll have more detail here during analysis and development, but the first aim is to move on from the “close enough is good enough” normal distribution that I have used up to this point.

As far as AFL projections go, I will probably stick to a continuous distribution. There’s a few that might fit the bill, but that’s yet to be figured out. Maybe gamma, maybe logistic. There’s a lot of number crunching to be done for that but I’m going for something that starts at zero and follows the historical curve, has some amount of covariance between the two teams, and of course it’ll need to correlate nicely with the GRAFT ratings.

Another objective with the AFL section is to flesh out history sections for the league in general and also under each club page. I’m not quite sure how to present all that; I probably won’t go all AFL Tables on you because, well, we already have AFL Tables, but putting the ratings into a historical context would be interesting. Again, that’s a thing that will develop over time.

Aside from all the AFL stuff, there’s also the intention to branch into that other winter game, since I do want the site to take a more general view. There are a few things to work out in how to tackle NRL ratings, and I think some kind of hybrid approach will be needed there. Historical data is a little harder to find and organise, so that’ll be the first thing to sort out. Of course this means I might actually have to get enthusiastic about League again, which has been a struggle since it tried to go Super.

Of course, in taking a more general approach across the sports, I have to sort out this shiny new site so people don’t get lost around here. Getting into the web dev side is pretty interesting; I’m using Bootstrap 4 as the basis for now, since it makes it reasonably easy to set things up so nothing looks too broken on small screens.

Aside from the framework, when it comes to generating the website pages, I’m committed to using all sorts of spaghetti code that I have trouble comprehending when I look at it again after a break. Well, as I said, horrible coder. I probably do all sorts of things that would make seasoned pros scream. “What’s unit testing, Precious?”

Fortunately it’s not my day gig, and I’m not looking for one.

Anyway, that’s what’s on the whiteboard for the next few months. In the next week or two I will have a poke at the 2018 AFL fixture and see how it stacks up, and then announce my usual anodyne opinions about how nothing really matters anyway and thank buggery they haven’t inflicted 17/5 on us just yet. Bet you’re looking forward to that.

Changing of the Guard

So, that’s the winter sports all wrapped up for the year – at least in theory.

We saw Richmond triumphant in a game where they broke down Adelaide’s vaunted attack while keeping up their own end. In particular, Dustin Martin had a fine game, completing his treble by adding the Norm Smith, although like many I thought Bachar Houli might have been a bit stiff to miss out on that, as he and Alex Rance did much to blunt the Crows’ path to goal. It was quite telling that when Martin turned up to the post-match conference he bore only the premiership medallion. Too much bling is gauche.

(As for the conspiracy theory that somehow Adelaide were stiffed by… something to do with umpiring or the higher gravity in Victoria or what, maybe if they hadn’t lost by eight goals, there might be grounds to establish an argument.)

Of course, after Footy Christmas came Footy Boxing Day, with the Melbourne Storm basically crushing the North Qld Cowboys, as seemed to be expected. A bit dull to watch, but a nice send-off for Cronk, Cam Smith and Slater. It also completed the Swan Street double, after the Tigers’ incredible triumph sent that part of Melbourne into raptures – and without any overturned, burned-out cars! Good work!

As far as GRAFT goes, I’ve decided to keep myself busy over summer by keeping track of the A-League, which kicks off this weekend. The GRAFT system I use for AFL doesn’t work so well for association football, so I’ll be running off everyone’s favourite ranking algorithm, as devised by Arpad Elo. I’m unlikely to bring many enhancements beyond the off-the-shelf formula, although I do want to work out a method of tracking attacking/defensive alignment along the way. As such, the ratings will be a work in progress as I tune things, so consider them in alpha phase for this season.
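For the curious, the off-the-shelf formula really is tiny. A minimal sketch of the update I’ll be running – the K factor and home bonus here are placeholder values, not tuned settings:

def elo_update(r_home, r_away, result, k=32.0, home_bonus=50.0):
    # result: 1.0 for a home win, 0.5 for a draw, 0.0 for an away win
    expected = 1.0 / (1.0 + 10.0 ** ((r_away - r_home - home_bonus) / 400.0))
    shift = k * (result - expected)
    return r_home + shift, r_away - shift

print(elo_update(1500, 1600, 0.5))  # a 1500 side drawing at home with a 1600 side gains ~2.3 points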

I also have a bit of homework to do over summer as far as the AFL ratings go, mostly when it comes to projections and the like. As I tweeted a while ago, the random algorithm I use for the Monte Carlo sims needs to reflect actual scoring distributions more closely, so that’s going on the to-do list.

2017 Grand Final Preview and beyond

According to the main GRAFT ratings, we have Richmond rated as 3 points better than Adelaide, but that includes home ground advantage, since Richmond are playing at their home ground against an interstate team. If you instead consider grand finals to be neutral and remove the standard 12-point HGA factor, it’s Adelaide by 9. That seems more or less right to me – the Crows have objectively been the best side for most of the season (albeit with a few strange missteps along the way), but the Tigers have also been consistent throughout, with some excellent performances of late.

Going by the Season Ratings, which are based on the home-and-away record and don’t account for finals results, there’s a bit more of a gap in perceived quality: they show Adelaide at 1186 and Richmond at 995, holding Adelaide as the better side by 7 points (or 19 without HGA).

From a gut-feeling point of view I would probably put Adelaide ahead as well. They have been one of the more attack-oriented sides of recent seasons, so shutting down those multiple avenues to goal is something Richmond will need to do – and they’re quite capable of it. So what I’m hoping for is basically a close game all the way through; fans of either team would probably like their side to work up a decisive advantage early so they can soak up the rest of the game, but that’s not much fun for the rest of us.

As I’ve mentioned a couple of times on the toots, up to this point Richmond and Adelaide had the two longest current grand final droughts in the league: Richmond had not been in a grand final since 1982 (capping off their greatest era) and Adelaide not since completing their successful back-to-back campaign in 1998. Prior to this time last year it was the Western Bulldogs who had that dubious honour, which they dispelled in style. After Saturday, though, the honour will pass jointly to North Melbourne and Carlton, who last competed in 1999 – which doesn’t seem that long ago, actually. Maybe I’m getting old…

In that respect, the fact that 16 clubs have managed to play off in the last 20 years is fairly remarkable, although of course we’ve also seen dynasties from the Brisbane Lions, Geelong and Hawthorn in that time. Only the two newest expansion clubs haven’t made it to the last game of the season, and of course GWS have gotten pretty close the last couple of years, so they should be primed to do so next year.

As for premiership droughts, well, Melbourne and St Kilda have been waiting since the 1960s even though both have played in a number of grand finals since. Richmond’s wait of 37 years isn’t quite up there, but given their vast number of fans the frustration is palpable, as will be the relief if the Tiges actually come through on Saturday. To do that, though, they will have to overcome an Adelaide side that has also suffered its ups and downs, such as other clubs nicking off with its best players, and tragedies like the passing of Phil Walsh.

It’s going to be an interesting contest between two proud and well-supported clubs who, it has to be said, have underachieved through the years.

Oh, and I prepared a bingo card, of course.