Post Season Wrap – 2018

I haven’t posted much here during the AFL season, mainly because, having bedded down the updated systems, I was able to continue things as normal and there wasn’t much to discuss about the back-end. But now that the season has wrapped up and the dust has settled, it is of course time to reappraise and reset priorities.

The Finals

Before I get stuck into the nerdage, it’s worth looking at how the actual footy turned out through the finals. The lead-up to the grand final was a bit anticlimactic, with both preliminary finals virtually sealed by half-time; Collingwood’s win was possibly the more shocking, as they were seen as slight underdogs going up against Richmond. That landed us with a grand final between West Coast and Collingwood, two clubs that both have, how shall we say, polarising fandoms.

Without my own team in the fight, I usually lean towards the team that has gone longer without a flag, although with West Coast last saluting in 2006 and Collingwood in 2010, there wasn’t much of a gap. So basically my view was that, with the Melbourne fairytale snuffed out with napalm and the finals series fairly lacklustre, the best we could hope for was a good close grand final. It delivered on that in spades.

It didn’t look like that early in the first quarter, of course, with the Pies putting on the first five goals in a performance very reminiscent of both of those prelims, but West Coast eked out two before the first change, and after that it was a contest as the Eagles ground away at Collingwood’s early advantage. Sheed threading that final goal from the flank with Pies supporters hooting at him will go down as one of those great clutch acts that decide a premiership.

Nerdage

Model-wise, GRAFT had an OK season compared with the other models on the Squiggle Models Leaderboard. It was doing pretty well early in the season (particularly in BITS) but fell adrift later on, finishing with 142 tips, a significant gap to the leaders on 147 and 146. With most models hovering around 70%, it was a more predictable season than 2017, yet it still produced plenty of interesting results, with up to twelve teams in the hunt for finals spots until the last round or two of the home-and-away season.

While GRAFT has remained essentially unchanged in principle since its single-factor days as RAFT, and I want to keep its simplicity, certain events do come along that make me think about spinning off a hybrid system to deal with particular instances. That’s right, I’m looking at Geelong bullying away in their last two home-and-away games, stealing top spot in the GRAFT ratings, and then punking out in the Elimination final.

Looking at the ladder it’s easy to figure out what happened – Geelong only got 13 wins, which in the 18-team era is just sufficient to be considered a fringe finalist (as they were), but they did so with a fairly healthy percentage of 131%. That marks a pythagorean anomaly (a what?) of -3 wins. Not as large as Brisbane’s -3.8 (that is, Brisbane won 5 games but on percentage should have had 9 wins), but notable nevertheless.
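For the uninitiated: pythagorean expectation converts points for and against into an expected win count, and the anomaly is actual wins minus that. A minimal sketch follows – the exponent is an AFL-ish assumption (figures around 3.9 get quoted) and the totals are purely illustrative, pitched to land on a 131% percentage:

def pythagorean_wins(points_for, points_against, games, exponent=3.9):
    # Expected wins from scoring power alone, ignoring close-game luck.
    ratio = points_for ** exponent
    return games * ratio / (ratio + points_against ** exponent)

expected = pythagorean_wins(2200, 1680, 22)  # ~16.3 expected wins at 131%
anomaly = 13 - expected                      # ~ -3.3 for a 13-win season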

On this point it draws attention to GRAFT’s main weakness: it doesn’t give any credence to wins and losses – it only cares about scores. And when a team runs the score up, as Geelong did against Fremantle (by 133 points) and Gold Coast (by 102), what is the actual difference between thrashing a team by twenty goals instead of ten? Anyway. That will be part of my homework for the off-season – not really an off-season, as I am about to detail.

Offseason Training

As far as the AFL-specific work goes, while the Gamma probability model worked really well, there are computational issues with working out the margin likelihoods (and therefore the win probabilities). The model is based on two curves for each team’s potential score; the equations for those curves are well-defined, but the difference of the two curves is not. So for each game I resort to brute force: to predict the likelihood of a team winning by 30 points, I have to sum the probabilities of 90-60, 91-61, 92-62, and so on.
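In code terms the brute force looks something like this minimal sketch – the one-point grid and the score cap are my own assumptions, and the two frozen gamma curves come from the Gamma model described in earlier posts:

import numpy as np

def margin_prob(home_curve, away_curve, margin, cap=250):
    # P(home - away == margin), brute-forced over a one-point grid:
    # sum P(home = s + margin) * P(away = s) for s = 0..cap,
    # treating the two curves as independent.
    s = np.arange(cap + 1)
    return np.sum(home_curve.pdf(s + margin) * away_curve.pdf(s))

def win_prob(home_curve, away_curve, cap=250):
    # Sum the margin probabilities over every winning margin.
    return sum(margin_prob(home_curve, away_curve, m, cap)
               for m in range(1, cap + 1))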

It works fine once it’s on the front end, but it seems to me that I should be able to figure out an actual equation for the difference curve and refer directly to that for the margin probabilities, thereby saving the computer a lot of crank when I update the tables. So basically: getting out my old calculus and statistics texts and trying to relearn everything I didn’t pay sufficient attention to. (Or just getting Wolfram Alpha to do it, although I still have to figure out the principles first.)

Along with all of that, showing the actual probability curves on the site is on the list of things to do – I have worked up a number of concepts for these, but I am not satisfied with how they look just yet, particularly as the risk of misinterpretation is real. The current match tables are basically a soup of numbers, and I will do my usual overhaul of the website to try and make them more comprehensible.

I also don’t have any historical tables up here apart from the archives of the previous seasons’ sites, so those are on the agenda too. I do think it would be good to make that data available and have historical graphs and records so you can compare clubs across seasons. We’ll see how that goes. I am thinking that I will probably use 1987 as the starting point for the public tables, as this was when West Coast and the Brisbane Bears entered the competition.

Meanwhile, there are also the summer sports to look at. I am in the second year of basic Elo tracking of the A-League and intend to do the same for the W-League. I am cutting it fine with the NBL, whose season starts this week; that might actually be a good place to start developing that hybrid system, although for now I am just going to stick with Elo tracking there as well, and likewise for the WNBL.

As far as the cricket goes, specifically the Big Bash League, this is something that could happen. I have a month or so to put something in place, which will have to include an analysis of previous results and all that, so it’s maybe a 50/50 chance at this point.

This will necessarily involve further modularising of the code, and once that is in place I can look in the new year at certain other competitions that have up to this point escaped notice. News on those further developments to come.

FOO Was Here

OK, after all the tedious talk about statistical distributions and aligning them to real life, I felt like doing a dumb fun post about the TLAs (three-letter acronyms) that I picked out here, with a bit of historicity. Unlike national ISO or IOC codes, there isn’t really any set schema for how the AFL clubs (or indeed NRL clubs) are labelled in this way. This was bothering me a little bit when I prepared the CSV file for aggregating my tips, since whoever was using them would probably have to do some mapping.

Most media outlets tend to go with abbreviations of two, three or four letters, which for the most part works fine. Things get weirder with the AFL’s official hashtag components – I quite often get mixed up between North/Roos or Bombers/Dons myself – and while the AFL would probably make some stupid choices if they moved to a TLA hashtag scheme, at least they’d be easier to boil down.

But particularly when I started working on this stuff in the early 00s, doing very basic text outputs, the TLAs were where it was at for formatting the tables and whatnot. Consequently, I carried these over to the website when I finally set it up.

AFL

Most of the ones I used for the single-word teams are pretty straightforward – CARlton, COLlingwood, ESSendon, etc.

With some teams, though, I’ve made choices that differ from other outlets, so those are probably worth an explanation.

BRL obviously stands for Brisbane Lions, to differentiate them from the Be Right Backs, I mean, the Brisbane Bears. Keeping the L for the Lions is also a bit of a nod to Old Fitzroy, if the business name of the club, the Brisbane Bears Fitzroy Football Club Ltd, wasn’t enough. (With respect to the grassroots VAFA club who can be considered the true Fitzroy Football Club these days.)

STK for St Kilda, NME for North Melbourne and PAD for Port Adelaide make sense for easy recognition although I note other orgs use other distillations.

WBD is not my preferred abbreviation for the Western Bulldogs, because I would rather they went back to FOOtscray (no, the VFL side doesn’t count) and truly recapture the dignity of the heart of Melbourne’s working-class, multicultural west. Sorry, over 20 years and even a flag later I’m not a fan of the marketing term. Still, the Westerns are what they’re branding themselves as for now, so we’ll roll with it.

For the other Western club, GWS is pretty much a no-brainer of course, although it was interesting to hear some talk about them just going with the Giants branding. GIA would be the most likely option for me. Also probably just as well the archaic Cumberland name wasn’t used because oh dear god.

Their silvertail rivals, the Sydney Swans just get SYD. In my historical files I use SME for the South Melbourne era, consistent with North. (If you’re that interested in the flailing fortunes of the University VFL club, they’re just UNI.)

Finally, the two Coast teams: why do the West Coast Eagles get WCE and the Gold Coast Suns GCO? A couple of reasons: West Coast Eagles is the actual name of the club, while the Suns are the Gold Coast Football Club. With the other new team around, GCS would have been a little close to GWS as well. You see GCO, you know what I’m referring to.

NRL

While I’m still developing a system for the Leaguies, I had to start with the abbreviations for those. (Don’t get me started on the jersey banners.)

So, firstly, that Canberra-Canterbury ambiguity. It’s… a pain in the neck. Can’t use truncation, and can’t even take the Bankstown part into consideration, because they might both end up as CBA. Eventually I went with CBR for Canberra (even though those letters appear in that order in CanterBuRy too), as the city has done a fair amount of branding using its IATA airport code. Canterbury get CBY – at least there’s no Y in Canberra.

Anyway, that’s the hard part over.

Brisbane get the IATA code BNE. NQL for the North Queensland Cowboys and GCT for the Gold Coast Titans are also clear enough.

The Senior Men’s XIII of the Cronulla Sutherland District Rugby League Football Club get CRO, which brings to mind cronuts or Crom, the grim-humoured god of Conan’s Hyboria. MANly-Warringah, PARramatta and PENriff get similar treatment as MAN, PAR and PEN. I could possibly do PNR as that scans well too.

St George-Illawarra get SGI; unlike the other double-barrelled monikers, St George and Illawarra are a merger (or joint venture, as some pretend to call it), so a nod to both is warranted rather than erasing the luckless Steelers with STG.

WTI is probably the best way to render down Wests Tigers; again, another joint venture, this time between Balmain and Western Suburbs (whose name originally referred to Ashfield, not Campbelltown).

NWC for Newcastle and NZW for the New Zealand Warriors separates the two clubs nicely.

The Sydney Roosters get SYR rather than SYD because of that weird period in the ’90s when teams were either calling themselves Sydney or dropping the locality altogether. Not sure what I’d call them under their true name, Easts – maybe ESB for Eastern Suburbs.

Calling back to the AFL scheme, I’m going with SSY for Souths, just in case we get the long-lost Crushers back as SQL, as in SQL DROP TEAM Crushers;.

A-League

Fortunately the A-League seems to have canonical TLAs for the team names (although I am not sure whether these are officially demarcated or were determined by the broadcaster), so I just went with those.


The State of Things

OK, so we’re three weeks into the season, and I’ve put the “new model” into play and am publishing the probabilities. While I think the site looks a little cluttered now, I’m pretty happy with how it’s working out, both on a game-by-game basis and as far as the sim projections are concerned. The margin of error looks good so far, but it’s early days, and I’m expecting the clubs to shuffle around enough to throw that out a bit. (Anyway, stasis is boring.)

If you’re new here and you’re curious about how all this is worked out, read on.

Firstly, the name GRAFT is a sort of backronym for “G” Rudimentary Arithmetic Form Tracker. That’s all I’m doing – adding and subtracting, with a bit of dividing and multiplying.

For a good while I called it RAFT while it was just about the margins, until I introduced the attack/defense elements so I could also work out expected scores and each team’s attacking tendencies. To reflect the new version, I added the “G” for no reason because I definitely didn’t name it after myself.

The GRAFT algorithm is the engine of the beast. I compare two sides’ ratings, add the “interstate” factor if warranted, and from that I get a margin and a pair of “par” scores, which sets up where each team is supposedly at compared with each other.

After each game I compare the actual results with the par margins and scores and adjust accordingly by a factor of 10% (this actually means the two competing teams will move closer or further apart by 20%, so if they played next week at the same venue the expected margin would be 20% closer to what the actual scoreline was). I’ve messed around with that factor somewhat but 10% seems to get the best results while being flexible enough to account for teams improving or deteriorating over time.
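As a minimal sketch of that loop – using the par construction given in the Towards A New Model posts (attack minus opposing defense plus league par), with the field names, the interstate handling and the attack/defense bookkeeping all being my own assumptions about the details:

FACTOR = 0.10  # the 10% weekly adjustment

def pars(team, opp, league_par, interstate=0.0):
    # Each side's par: its attack against the other's defense, relative
    # to the league average, plus any interstate allowance.
    return (team.attack - opp.defense + league_par + interstate,
            opp.attack - team.defense + league_par - interstate)

def adjust(team, score_for, score_against, par_for, par_against):
    # Nudge the ratings 10% of the way from par towards the actual result.
    # (How the error is split between attack and defense is my assumption.)
    team.attack += FACTOR * (score_for - par_for)
    team.defense += FACTOR * (par_against - score_against)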

I feel it is an absurdly reductive system in some respects, but it is mine! For what I put into it, its capability of self-adjustment is quite effective. Then again, any power rating system worth its salt is based on self-adjustment to at least some degree.

You should take a grain of that salt with GRAFT, though, because it does not take into account player changes, particularly whether the coach decides to rest half the first-choice side before the finals. The lines are based on three things only: the two clubs, and where they’re playing*. You would still have to do some homework based on the ins-and-outs and other ineffables. Of course there are plenty of other great ratings systems out there so you can sort of take your pick.

* – I’ve modelled against difference in resting days as another possible factor, but it doesn’t seem to matter much. The dead-rubber thing is something I haven’t tested, but since it would affect only a small number of games towards the end of the season, it hasn’t really been a priority of mine – I mean, how can you tell the difference between a team that’s deliberately tanking, a team that doesn’t care about winning anymore, and a team that’s trying its best for its fans but is really actually complete garbage?

We Float

For the 2018 season, I’ve made one major change to the GRAFT system: expressing it on a 1:1 scale of actual scoreboard points. In previous years it was 10 times the scoreboard.

I made this decision for a couple of reasons:

Firstly, I felt scaling the ratings in this way would be more intuitive.

Secondly, for all of RAFT/GRAFT’s life the ratings had been expressed as integers. This was practical in most instances, but then I’d hit cases where there was a 5-point difference in the ratings – would I round up or down? Moving to floating-point numbers means that, while the decimals look a little more daunting in the tables, I can deal with edge cases with fewer of them being “too close to call”.

This came in quite handy last weekend, when the working-out showed the Swans 0.1 points better than the Giants. With the old integer system I probably would’ve called this too close to call; now, while the line is still pretty much a draw, I could at least pick an expected winner, although not with a great deal of confidence. (The fact that Sydney won by 16 points doesn’t really vindicate that at all, because it’s supposed to be close, dammit!)

Despite the change of scale, the GRAFT system is still pretty much the same; it works out expected scores and then nudges them accordingly based on the actual result.

As a rule of thumb, consider teams at around 120 GRAFT points (or +30 over the league mean) as premiership quality, while 100 points (or mean +10) probably gets you into the finals. Teams that go on a tear like Essendon 2000 or Geelong 2008 have gotten as high as 150, while GWS and Gold Coast in their first few seasons were floundering around 40 to 50. Low ratings seem to be more common than similarly high ones, but they’re also easier to recover from – it only takes a few good wins or near-misses for a lowly team to get back to the pack.

The Probability Problem

You may be familiar with Elo-based systems – the fun thing about those is that, properly calibrated, they give you the probability of the result straight out of the box. Then you have to figure out the line margin from that.

GRAFT does that arse-backwards. It works out the expected score and margin, from which one has to derive the probabilities.

For the longest time I used a horrible fudge to work out the probability, based on a uniform bell curve that was not constrained at zero (that is, when I was running the Monte Carlo sims, it could throw out negative scores, which in practice I bumped up so that at least the scores were positive while the margin was preserved).

So it was really quite an awful hack, statistics-wise. It worked OK in practice in some ways but the sliding normal curve just annoyed me and I was not confident about publicising them on the site.

Besides, the normal distribution doesn’t fit (literally, in the statistical sense). The historical score distribution from AFL/VFL games doesn’t conform to the bell curve; it skews, with a bit of a tail towards the high scores – the median score is slightly below the mean.

I won’t get into too much detail here because the Towards A New Model series of articles covered that, but I went with the Gamma distribution because it was a nice fit with the actual results, both overall and under constraints, and with that I could devise a reasonably sane model to determine probabilities for individual matches, as well as for whole seasons using Monte Carlo methods.

It’s worth posting again because I went to a fair amount of trouble working something out:

import scipy.stats

def curve(rating):
    a = rating * .1 + 3                  # shape parameter grows with the par score
    return scipy.stats.gamma(a, 0, 7.5)  # frozen gamma: loc 0, scale 7.5

(This is in Python; “rating” is the expected score according to GRAFT, and the output is a “frozen” distribution model as implemented in scipy.)

Essentially it doesn’t rule out the possibility of a team not expected to do well suddenly cutting loose, but it acknowledges that it is harder than it is for a team on top of the league.

Passive-Regressive

So now that I am using the Gamma model to calculate means and probabilities, it tends to come across a bit more conservative than the margin lines that GRAFT produces – mostly because, overall, the mean of actual results is more conservative. It’s almost as if, because the GRAFT rating is foremost a form tracker that expresses how each team has performed up to that point, when they step onto the field they have to prove themselves worthy of that rating all over again.

When I was doing the prep work over summer, I found that, averaging across each margin band as determined by GRAFT, the actual margins were about 25% closer to the mean than what GRAFT was expecting. That is, I would check the range of results about 20 points above the global mean and find that the average actual result was more like 15 points – sure, some blowouts for the favourite and some major upsets against the grain, but overall a little closer to zero than standard GRAFT had expected.

However, when I decided to tone down GRAFT itself by reducing the weekly adjustment factor, it didn’t actually improve the number of successful tips, and weirdly the actual margin averages were still regressing compared with the expectations.

But decoupling the Gamma line (which took into account that regression) from GRAFT seemed to work much better.

So, the probabilities and lines I am publishing for aggregation through http://graftratings.com/aft/graft_tips.csv are based on the Gamma model; however, the ratings that I use to rank and compare the teams are still based on the usual week-by-week GRAFT mode.

Setting a conservative slant on the Gamma model has also come in handy for the season sims: since it effectively imposes a regression to the mean and brings all the teams closer together (awwww), it gives outliers more credence to show up in the mix.

(Regressing ratings to the mean is a pretty common practice in rating systems, usually applied in the off-season before the competition begins afresh. Weirdly, though, I don’t actually do that; for each new season, I reset the seed ratings based on the home-and-away performances from the previous season, weighting the for-and-against totals by halving the scores from matches where the clubs met twice.)
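A rough sketch of that seeding step as I’ve described it, with the data layout hypothetical:

from collections import Counter

def seed_for_against(games):
    # games: (opponent, points_for, points_against) tuples from one club's
    # home-and-away season. Scores from matches against opponents met twice
    # are halved, so every opposing club weighs equally in the seed.
    meetings = Counter(opp for opp, _, _ in games)
    pf = sum(f / meetings[opp] for opp, f, _ in games)
    pa = sum(a / meetings[opp] for opp, _, a in games)
    return pf, pa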

What’s Next

All of this is still a work in progress. For instance, now that I’ve gotten to this point, I would like to work out and account for covariance – once GRAFT emits the two teams’ expected scores for the Gamma curves, those two curves are independent, which is not quite how footy works – like, you can’t have both teams scoring at the same time. I think that will be quite tricky to work out, but it’s the next logical step as far as improving the model goes.

In the meantime, the current version is set and being put into practice for match-by-match “predictions” and season projections. I’m really at the next stage of this, which is devising visualisations that I hope will be informative and intuitive, without being too misleading.

Of course, at the moment I have spammed a bunch of numbers under each match info box, but it seems like maybe it’s too much? Anyway. This thing is pretty much a hobby for me (the site is low-traffic so my bills are modest) and I get as much fun out of the design side and coming up with weird charts as I do out of the heavy maths.

And just in case you’re wondering, if I’m watching or listening to a game, I don’t even really think about this stuff. I just turn into every other moron yelling baaaaawl (actually I’m probably more “WHAT THE HELL WAS THAT FREE FOR”), have a bit of a laugh if something stupid happens, and then at the end of the game, that’s when I put the numbers in and turn the handle and have a look at what comes out the end.

Towards A New Model – Part 3

In previous posts on this matter, I settled on the gamma distribution as the model, as it is a continuous analogue of the Poisson distribution and provides a curve that reflects actual real-world scores.

The next little challenge was to calibrate it against the “par” scores that GRAFT generates. The par scores typically range from 50 to 120, a little narrower than the distribution of actual scores. There were a few more quirks that came up when I was trying to make it all fit. Basically, I took 6-point-wide slices of each prediction set and set them against the actual scores, against which I would try to fit the gamma curve. Fortunately, at least for the 60-120 par range, there was a reasonably consistent look to the curves, as I will show. I confined the set of historical scores to 1987 through 2017.

(The black curve is the fit against all scores)

A little bit haywire at the extremities, but not too bad for the main spread. I looked at the factors for the distribution, particularly the shape parameter a, to come up with something plausible.
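The per-band fitting amounted to something like this sketch (the band edges, numpy-array inputs and the pinned zero location are my reading of the above):

import numpy as np
import scipy.stats

def fit_bands(pars, scores, width=6):
    # Fit a gamma curve to the actual scores falling within each 6-point
    # band of GRAFT par, pinning the location parameter at zero.
    fits = {}
    for lo in range(60, 120, width):
        band = scores[(pars >= lo) & (pars < lo + width)]
        if band.size:
            a, loc, scale = scipy.stats.gamma.fit(band, floc=0)
            fits[lo] = (a, scale)
    return fits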

After a bit of eyeballing, I settled on a sensible function for a, with 7.5 as the scale:

import scipy.stats

def curve(rating):
    a = rating * .1 + 3                  # shape parameter grows with the par score
    return scipy.stats.gamma(a, 0, 7.5)  # frozen gamma: loc 0, scale 7.5

(This is Python code, of course; you’ll want the scipy library to play with it. If you use R, I presume you’ll be able to figure out the equivalent.)

So in the end we have this bad boy:

And overlaid on the actual result fits…

So that’s not too bad a fit, at least for our purposes.

A few quirks, though: the mean and median of each curve are actually regressed towards the global mean compared with the GRAFT par that was spat out. I have a feeling this is down to the spread of scores at each GRAFT tick, where a lot of scores come in slightly under average, balanced by the long tail when teams go on the march and score 20 goals or more.

Well, let’s just say that I haven’t put too much rigour into it before this season. There are still a few things to mess around with; for instance, I basically set up the curves for each team’s rating completely independently, so there isn’t any covariance taken into account at this stage. What happens when two defense-orientated teams face off, versus two attacking teams? Well, that’s already partly accounted for at the par-setting stage (team.attack – opposition.defense + league_par).

I’ve gotten as far as cooking up some rough examples for the opener on Thursday night:

From the main site, Richmond’s par is 90.8, Carlton’s par is 58.8.

Based on all of the above, Richmond’s likelihood of winning is 76%, with a mean winning margin of 24. Well, that’s our first bit of weirdness: GRAFT states that Richmond are better by 32 points, so what’s happening there?

The same sort of thing is happening with the base scores. Again, the idea of GRAFT is that each par score represents what each team should score against the other, given past performances, with the weekly adjustments carried out by a factor calibrated to maximise the number of correct tips. It’s a little fuzzier on margins, even though margins are at the core of the result.

Richmond’s par (rounded off) is 90.8, but the mean of their curve is 90.6 (OK, not too crazy) and the median is 88.1.

As for Carlton, with a par of 58.8, their mean is 66.6 and their median is 64.1.
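Those figures come straight off the frozen scipy objects, and a quick Monte Carlo draw (treating the two curves as independent, per the caveat above) reproduces the headline numbers:

import numpy as np

rich, carl = curve(90.8), curve(58.8)
print(rich.mean(), rich.median())  # ~90.6, ~88.1
print(carl.mean(), carl.median())  # ~66.6, ~64.1

r, c = rich.rvs(1_000_000), carl.rvs(1_000_000)
print((r > c).mean())  # ~0.76 – Richmond's win likelihood
print((r - c).mean())  # ~24 – the mean margin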

We’re fine with the median being a little skewiff; it’s not a symmetrical distribution, and the elongated tail is consistent with real scores. (That’s why I picked the gamma distribution, after all – it was such a good fit for the historical record.)


Very early drafts of how the graphs might look on the site. I have coded these at a low level in Pillow because I am a masochist. Also because matplotlib plots like the ones earlier in the post look like matplotlib plots.

So obviously there’s regression to the global mean happening each week. Thing is, the weekly ratings are updated according to the actual scores, based on factors which boil down to *Hulk voice* “big better than small”. While I’ve tweaked how the ratings are presented this year (dropping them by a factor of 10 and making them floating-point numbers), that part has essentially not changed.

What I was dissatisfied with was how the probabilities of individual games and the season as a whole were calculated. The old practice used a normal distribution which would not tail off nicely at zero, so it required that horrible fudge of jacking up both teams’ scores. I am glad to be done with that nonsense.

With this new algorithm, I am at the point where I am happy to publish the probabilities, intending to roll them out by Wednesday (boy, this blog entry is going to become dated fast), but at the same time they probably need a little more rigour applied, so caution is advised if you’re going to take heed of them for certain activities. Entertainment purposes only!

Another thing that will happen on the site is that I will be producing CSV files for the tips (with lines and likelihoods) and ratings for people to scrape if they like – a little more detail on the formats and such in the next post.

Rushed, Behind

It has been a couple of months since the last update, and really there’s not that much to say as far as this website is concerned. As anticipated, I’ve been flat-out with other things, in particular moving house, which is always a haunt.

Now I’ve settled in enough that I can actually sit down and get back into this stuff. It’s probably a little late to do a decent run for the AFLW, which is already three weeks into its season – a season I think is too short anyway. Yes, I think the skills would get better if they got to play more football, and that goes for the actual length of the matches too. Anyway, catching up on that is still a priority.

No such plans for the AFLX circus just past; OK, it was probably worth a try just to see what might happen, but having seen what happened, let’s put it back in the box and never speak of it again. It makes you appreciate defensive pressure and the structuring required to get around it.

There also isn’t much time to get the revised GRAFT system into place before the season proper commences, but we’ll see how that goes. Having a live comp in the AFLW gives me a chance to rework the design, but I think we’ll be a few weeks into the blokes’ turn before I can drop the new prediction model. I should still have tips for the aggregators from the get-go, and I will provide CSVs to make the scraping a little easier.

In the meantime, I’ve done a few basic cosmetic things as I get my head back into this; probably most noticeable is the Eagles going back to the royal blue for their home guernseys, not such a bad move, so the icons for the new season will be updated to suit.

Towards A New Model – Part 1

The past couple of weeks have seen small but significant steps in the retooling of my sim model for AFL.

The first thing I had to do was update my historical data from the AFL Tables scrapings.

For that I dragged out my old parsing code, which still works, but I had to deal with the fact that I had stored the goals/behinds with a dot separator. That’s not really a good idea if you’re generating a CSV (comma-separated values) file: load it straight into Excel and the trailing zero may get stripped, so 10 behinds would become 1 behind.

It’s OK for my purposes, since I do most of my stuff in Python, but having decided to make my history file public I should at least eliminate that misunderstanding, so for those fields I’ve changed the sub-separator to the underscore ( _ ).
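To illustrate the trap with a toy example: a goals-and-behinds figure like 13.10 read as a number silently becomes 13.1 – thirteen goals one behind – whereas the underscore form stays textual:

goals, behinds = "13_10".split("_")    # survives the spreadsheet round trip
points = int(goals) * 6 + int(behinds) # 13 goals 10 behinds = 88 points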

After all that cleaning up, it’s at a point where I can make the history file public, so you can now pick it up in the new Resources section, which may get expanded with other files that I decide to release.

With that dataset sorted out, I could get stuck into analysing it.

In previous years I’d used the normal distribution (ye olde bell curve) as the basis for the simulation module. There are a few problems with that, the most annoying to me being that it would generate negative scores.

Anyway, while I was attempting to work up a plausible sim model for “sokkah”, I reasoned that the Poisson distribution was most appropriate there, it being an event-based sport after all.

AFL scoring, too, is a series of events, but with goals and behinds the waters get muddied a bit as far as the quantum of the data goes. I guess I still couldn’t get away from the idea of using a continuous distribution, so I decided to use the continuous equivalent of the Poisson distribution: the gamma curve.

So I applied that to the set of final scores in the AFL/VFL record, and it worked marvellously.

So that’s what we’ll be using as a basis. I’ve also gotten the suggestion that the log-normal curve might also be worthy as it exhibits similar characteristics, so that might get a look in as I fine-tune things.

I’m now at the point where I’m trying to calibrate forecast results (based on the GRAFT system) against actual results, and that’s actually not looking so great. As far as margins go, what I’ve found is that while there is a good correlation in one sense (bigger predicted margins match up with bigger actual margins), the average of the actual margins for each slice is only about 75-76% of the forecast margin. Not that flash. I can generate a pretty good win-probability system out of it, but I want to nail the “line” margins and par scores as well.

In other words, for games where I have “predicted” one team to win by 50 points, they end up winning (on average) by 38 (mean) or 40 (median) points – albeit with a lot of outliers as you’d expect.

There’s a bit of thinking to do here, and I strongly suspect that it’ll lead to a significant reworking of the GRAFT system, to the point where it’ll have to be considered Version 3 – but what that actually entails is still a bit of a mystery. It may be that this becomes a whole new system that moves away from the linear arithmetic model, at least in part.

So that’s where we’re up to at this point. How much of this work I can get done before the new season is a little uncertain, because there’s a few other things on my plate over the next few months. But we’ll see how we go.

Hitting the Track

Starting to look at next year’s fixture. No great insights as yet, just basically cleaning up names of stadia and that sort of thing (search and replace Optus Stadium with Perth Stadium, Goombadome with Kardinia Park, and so on).

Also stripping down the website gen code for next year. I haven’t started on the revamped sim routine as mentioned in the previous post because it’s quite a tricky thing to analyse. Obviously once that’s hammered out I can do those spurious season projections again.

I’m trying out a few things with the A-League section. The SVG graph is coming up well, so it’s pretty likely I’ll introduce that sort of thing to the AFL pages next year too. It should also lend itself to some more interactive features.

Speaking of interactivity, another thing that I said I would implement this year – but didn’t – is making various tables sortable. It should be pretty easy to do; I’d just never got around to it.

Another off-season move is, well, myself sometime in the new year, which will involve a bit of NBN roulette among other things. Let’s just say I’m not optimistic. But I should be all set up by March, so we’ll see how much venting at telcos I end up doing on the Toots.

Offseason Training

What’s been happening lately at GRAFT?

Of course, I’ve set up the A-League site and have been incrementally adding new things, most recently putting together an SVG graph of the weekly fluctuations in the Elo ratings.

As well, I’ve made some steps towards a decent prediction model. As you’d imagine, “football” modelling is very mature and there is a great deal of literature on it, but you know, it doesn’t hurt to devise a system from first principles, even if you find out you’ve only reinvented the wheel at the end of it.

Elo is particularly good at giving you win/loss probabilities out of the box, but of course there’s the whole issue of draws to account for. On the whole, draws seem to eventuate about 25% of the time, which is a nice round figure.

This is one of the things I hope to deal with as I want to incorporate attacking/defensive ratings into the mix, in an attempt to create more plausible projections.

My theory is that a contest between two teams with attacking tendencies is less likely to result in a draw than one between two more defensive teams. The reasoning should be fairly intuitive: if neither club gives a shit about stopping balls flying into nets, it’s less likely in such a shootout that the teams will finish on the same score.

As well, two evenly matched teams would be more likely to play out a draw than in a match where one team is of a much higher quality than the other, even when taking bus parking arrangements into consideration.

Anyway, that’ll take a bit of nutting out, but in the end I want to be able to show a par score for each team prior to each game (much like I do with the AFL), although it’d look something like MVC 1.2 v SYD 2.1, with maybe the result probabilities alongside. The numbers would align to the Poisson distribution of how many goals each team might actually score. Once you get to that point, doing Monte Carlo projections on that basis becomes pretty simple, as sketched below.
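As a sketch of that end point (the goal cap and the function shape are assumptions, and the two curves are treated as independent for now), the result probabilities fall out of a grid of two Poisson distributions:

import numpy as np
from scipy.stats import poisson

def result_probs(home_par, away_par, max_goals=10):
    # Joint grid of scorelines: grid[i, j] = P(home scores i, away scores j).
    goals = np.arange(max_goals + 1)
    grid = np.outer(poisson.pmf(goals, home_par), poisson.pmf(goals, away_par))
    home_win = np.tril(grid, -1).sum()  # cells where home > away
    draw = np.trace(grid)               # the diagonal
    away_win = np.triu(grid, 1).sum()   # cells where away > home
    return home_win, draw, away_win

result_probs(1.2, 2.1)  # the hypothetical MVC v SYD line above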

Among other moves for next year:

Revamp the site for the AFL/AFLW seasons in the new year. I mean, yes, it worked out for 2017, but I’m kind of bad at leaving things alone. Besides the whole “rating footy teams” part, I’m into this thing as much for developing new visualisations and designs. Basically I’m experimenting on all that in public. Some people have weird hobbies and this is mine.

Something else I will try to do is provide CSV output of my tips and ratings in a set format, so it’s easier for others to scrape and utilise, rather than everything breaking whenever I mess around with the “look and feel”.

I consider the basic GRAFT system to be pretty much settled, for all its faults and limitations. That’s tied up with the basic philosophy that it takes into account only results and venues, however. It is what it is.

Which is not to say that I won’t have new things on the go, including other systems based off it or on completely different principles, but at the end of several years of development and refinement, the core is done. I do want to publish the basic algorithm (it really is absurdly simple) and some associated tools for others to examine and rip to shreds, but I am a horrible coder so there’s a bit of cleaning up to do before that happens.

Having said that, the main thrust of development is the projection system. I intend to overhaul the system that I use to work out my probabilities and eliminate some of the more egregious fudges. I’ll have more detail here during analysis and development, but the first aim is to move on from the “close enough is good enough” normal distribution that I have used up to this point.

As far as AFL projections go, I will probably stick to a continuous distribution. There’s a few that might fit the bill, but that’s yet to be figured out. Maybe gamma, maybe logistic. There’s a lot of number crunching to be done for that but I’m going for something that starts at zero and follows the historical curve, has some amount of covariance between the two teams, and of course it’ll need to correlate nicely with the GRAFT ratings.

Another objective for the AFL section is to flesh out history sections for the league in general and under each club page. I’m not quite sure how to present all that – I probably won’t go all AFL Tables on you because, well, we already have AFL Tables – but putting the ratings into a historical context would be interesting. Again, that’s a thing that will develop over time.

Aside from all the AFL stuff, there’s also the intention to branch into that other winter game, since I do want the site to take a more general view. There are a few things to work out in how to tackle the NRL ratings, but I think some kind of hybrid approach will be needed there. Historical data is a little harder to find and organise, so that’ll be the first thing to sort out. Of course, this means I might actually have to get enthusiastic about League again, which has been a struggle since it tried to go Super.

Of course, in taking a more general approach across the sports, I have to sort out this shiny new site so people don’t get lost around here. Getting into the web-dev side is pretty interesting; I’m using Bootstrap 4 as the basis for now, since it’s reasonably easy to set things up so they don’t look too broken on small screens.

Aside from the framework, when it comes to generating the website pages, I’m committed to using all sorts of spaghetti code that I have trouble comprehending when I look at it again after a break. Well, as I said, horrible coder. I probably do all sorts of things that would make seasoned pros scream. “What’s unit testing, Precious?”

Fortunately it’s not my day gig, and I’m not looking for one.

Anyway, that’s what’s on the whiteboard for the next few months. In the next week or two I will have a poke at the 2018 AFL fixture and see how it stacks up, and then announce my usual anodyne opinions about how nothing really matters anyway and thank buggery they haven’t inflicted 17/5 on us just yet. Bet you’re looking forward to that.

Summer Diversions

In past years, I probably would’ve moved my focus to other things not related to sport, such as cursing the sun god that gets angrier and angrier every year, but for some reason I’ve decided to broaden the GRAFT umbrella beyond the AFL into other sports, the A-League to begin with.

I don’t just do this stuff out of my interest in sport, I also like to try out design and visualisation ideas as well. While I’ve carried over some of my personal conventions and ideas from how I present the AFL stuff, it also gives me an opportunity to redesign some things from scratch.

The GRAFT system isn’t as effective in other sports (basketball possibly being an exception), so I’ve naturally returned to the Elo system to measure things. It is essentially unmodified from the implementation used by World Football Elo Ratings. I did a number of runs against the A-League record trying out different settings, but as a starting point I’ve reverted to what are essentially the default Elo values: a K-factor of 40, a Home Ground Advantage (HGA) of 100, and an off-season regression to the mean of 90%.
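For reference, the core of the formula is small enough to sketch here with those settings – leaving aside the goal-difference weighting that the World Football version layers on top:

def expected(home, away, hga=100):
    # Win expectancy for the home side, with the HGA added to its rating.
    return 1 / (1 + 10 ** ((away - (home + hga)) / 400))

def update(home, away, result, k=40, hga=100):
    # result: 1 for a home win, 0.5 for a draw, 0 for an away win.
    delta = k * (result - expected(home, away, hga))
    return home + delta, away - delta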

I don’t follow any particular club in the league (something about the league being established too late for me to latch onto any one in cultish devotion) so from that point of view my impartiality should be preserved. Worth noting though, that coming into this season Sydney FC enjoyed the highest ongoing rating after last year’s performances, so they started 1709 (+209 against the baseline) and it’ll be interesting to see if the other clubs can peg that back.

As far as the design side of things goes, I do enjoy making the club banners as a visual aid – I don’t use official logos, mostly to evade the copyright hammer, fair use considerations in Australia being so cooked – but with Melbourne City choosing to fully adopt the sky blue of their Citeh masters, I did have to think a little about how to differentiate them from Sydney FC. Since City are using the old Heart stripes as an away strip, and Sydney FC are going with navy sleeves this year, I’ve alluded to those cues, and that might make enough of a distinction between them on the fixtures and tables.

I intend to add more features such as:

  • draw odds (currently it’s just based on winner-takes-all, which of course is daft)
  • an attack/defense measure (which may involve adding the GRAFT special sauce to work out that vector)
  • simulation projections
  • club profiles
  • and of course the doovy visualisations that go with all that.

Keeping to the basics to start off with, but it’s a start.

Changing of the Guard

So, that’s the winter sports all wrapped up for the year – at least in theory.

We saw Richmond triumphant in a game that saw them break down Adelaide’s vaunted attack while keeping up their own end. In particular, Dustin Martin had a fine game to sweep the treble and take the Norm Smith, although like many I thought Bachar Houli might have been a bit stiff missing out on that, as he and Alex Rance did much to blunt the Crows’ path to goal. It was quite telling that when Martin turned up to the post-match conference he bore only the premiership medallion. Too much bling is gauche.

(As for the conspiracy theory that somehow Adelaide were stiffed by… something to do with the umpiring, or the higher gravity in Victoria, or whatever – maybe if they hadn’t lost by eight goals, there might be grounds to establish an argument.)

Of course, after Footy Christmas came Footy Boxing Day, with the Melbourne Storm basically crushing the North Qld Cowboys, as seemed to be expected. A bit dull to watch, but a nice send-off for Cronk, Cam Smith and Slater. It also completed the Swan Street double after the Tigers’ incredible triumph sent that part of Melbourne into raptures – and without any overturned, burned-out cars! Good work!

As far as GRAFT goes, I’ve decided to keep myself busy over summer by keeping track of the A-League, which kicks off this weekend. The GRAFT system that I use for AFL doesn’t work so well for association football, so I’ll be running off everyone’s favourite ranking algorithm, as devised by Arpad Elo. I’m unlikely to bring too many enhancements beyond the off-the-shelf formula, although I do want to work out a method to track attacking/defensive alignment along the way. As such, the ratings will be a work in progress as I tune things – consider this season an alpha phase.

I also have a bit of homework to do over summer as far as the AFL ratings go, mostly when it comes to projections and the like. As I tweeted a while ago, the random algorithm I use for the Monte Carlo sims needs to reflect actual scoring distributions more closely, so that’s going on the to-do list.