FOO Was Here

OK, after all the tedious talk about statistical distributions and aligning them to real life, I felt like doing a dumb fun post about the TLAs (three-letter acronyms) that I picked out here, with a bit of historicity. Unlike national ISO or IOC codes, there isn’t really any set schema for how the AFL clubs (or indeed NRL clubs) are labelled in this way. This was bothering me a little bit when I prepared the CSV file for aggregating my tips, since whoever was using them would probably have to do some mapping.

Most media outlets tend to go with abbreviations of two, three or four letters, which for the most part works fine. Things get weirder with the AFL’s official hashtag components – quite often I myself get mixed up between North/Roos or Bombers/Dons – and while the AFL would probably make some stupid choices if they moved to a TLA hashtag scheme, at least those would be easier to boil down.

But particularly when I started working on this stuff in the early 00s and doing very basic text outputs, the TLAs were where it was at for formatting the tables and whatnot. Consequently, I carried these over to the website when I finally set it up.


Most of the ones I used for the single-word teams are pretty straightforward – CARlton, COLlingwood, ESSendon, etc.

For the teams where my choices differ from other outlets, it’s probably worth an explanation.

BRL obviously stands for Brisbane Lions, to distinguish them from the Be Right Backs, I mean, the Brisbane Bears. Keeping the L for the Lions is also a bit of a nod to Old Fitzroy, if the business name of the club, the Brisbane Bears Fitzroy Football Club Ltd, wasn’t enough. (With respect to the grassroots VAFA club who can be considered the true Fitzroy Football Club these days.)

STK for St Kilda, NME for North Melbourne and PAD for Port Adelaide make sense for easy recognition although I note other orgs use other distillations.

WBD is not my preferred abbreviation for the Western Bulldogs, because I would rather they went back to FOOtscray (no, the VFL side doesn’t count) and truly recapture the dignity of the heart of Melbourne’s working-class, multicultural west. Sorry, over 20 years and even a flag later I’m not a fan of the marketing term. Still, the Westerns are what they’re branding themselves as for now, so we’ll roll with it.

For the other Western club, GWS is pretty much a no-brainer of course, although it was interesting to hear some talk about them just going with the Giants branding. GIA would be the most likely option for me. Also probably just as well the archaic Cumberland name wasn’t used because oh dear god.

Their silvertail rivals, the Sydney Swans just get SYD. In my historical files I use SME for the South Melbourne era, consistent with North. (If you’re that interested in the flailing fortunes of the University VFL club, they’re just UNI.)

Finally, the two Coast teams: why do the West Coast Eagles get WCE while the Gold Coast Suns get GCO? A couple of reasons: West Coast Eagles is the actual name of the club, whereas the Suns are the Gold Coast Football Club. With the other new team around, GCS would’ve been a little too close to GWS as well. You see GCO, you know what I’m referring to.
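Put together, the AFL scheme boils down to a simple lookup table – handy for the kind of CSV mapping mentioned earlier. Here’s a sketch in Python; note that the six clubs not spelled out above (ADE, FRE, GEE, HAW, MEL, RIC) are my assumed straightforward truncations, not codes confirmed in the text.

```python
# TLA -> club lookup. Entries marked "assumed" are plain truncations
# filled in by analogy with CAR/COL/ESS; the rest are as discussed.
AFL_TLA = {
    "ADE": "Adelaide",                 # assumed truncation
    "BRL": "Brisbane Lions",
    "CAR": "Carlton",
    "COL": "Collingwood",
    "ESS": "Essendon",
    "FRE": "Fremantle",                # assumed truncation
    "GCO": "Gold Coast Suns",
    "GEE": "Geelong",                  # assumed truncation
    "GWS": "Greater Western Sydney",
    "HAW": "Hawthorn",                 # assumed truncation
    "MEL": "Melbourne",                # assumed truncation
    "NME": "North Melbourne",
    "PAD": "Port Adelaide",
    "RIC": "Richmond",                 # assumed truncation
    "STK": "St Kilda",
    "SYD": "Sydney Swans",
    "WBD": "Western Bulldogs",
    "WCE": "West Coast Eagles",
}
```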


While I’m still developing a rating system for the Leaguies, I had to start with the abbreviations for those too. (Don’t get me started on the jersey banners.)

So, firstly, that Canberra–Canterbury ambiguity. It’s… a pain in the neck. Can’t use truncation, can’t even take the Bankstown part into consideration because they might both end up as CBA. Eventually I went with CBR for Canberra (even though those letters appear in that order in CanterBuRy too), as the city has done a fair amount of branding using its IATA airport code. Canterbury get CBY – at least there’s no Y in Canberra.

Anyway, that’s the hard part over.

Brisbane get the IATA code BNE. NQL for the North Queensland Cowboys and GCT for the Gold Coast Titans are also clear.

The Senior Men’s XIII of the Cronulla Sutherland District Rugby League Football Club get CRO, which brings to mind cronuts or Crom, the grim-humoured god of Conan’s Hyboria. MANly-Warringah, PARramatta and PENriff get similar treatment as MAN, PAR and PEN. I could possibly do PNR as that scans well too.

St George-Illawarra get SGI; unlike the other double-barrelled monikers, St George and Illawarra are a merger (or joint venture, as some pretend to call it), so a nod to both is warranted rather than erasing the luckless Steelers with STG.

WTI is probably the best way to render down Wests Tigers; again, another joint venture between Balmain and Western Suburbs (originally referred to Ashfield, not Campbelltown).

NWC for Newcastle and NZW for the New Zealand Warriors separates the two clubs nicely.

The Sydney Roosters get SYR rather than SYD because of that weird period in the ’90s where teams were either calling themselves Sydney or even dropping the locality altogether. Not sure what I’d call them under their true name Easts – maybe ESB for Eastern Suburbs, I think.

Calling back to the AFL scheme, I’m going with SSY for Souths, just in case we get the long-lost Crushers back as SQL, as in SQL DROP TEAM Crushers;.


Fortunately the A-League seem to have canonical TLAs for the team names (although I am not sure whether these are officially demarcated or were determined by the broadcaster) so I just went with those.




The State of Things

OK, so we’re three weeks into the season, I’ve put the “new model” into play, and I’m publishing the probabilities. While I think the site looks a little cluttered now, I’m pretty happy with how it’s working out, both on a game-by-game basis and as far as the sim projections are concerned. The margin of error looks good so far, but it’s early days, and I’m expecting the clubs will shuffle around enough to throw that out a bit. (Anyway, stasis is boring.)

If you’re new here and you’re curious about how all this is worked out, read on.

Firstly, the name GRAFT is a sort of a backronym for “G” Rudimentary Arithmetic Form Tracker. That’s all I’m doing – adding, subtracting, with a bit of dividing and multiplying.

For a good while I called it RAFT while it was just about the margins, until I introduced the attack/defense elements so I could also work out expected scores and each team’s attacking tendencies. To reflect the new version, I added the “G” for no reason because I definitely didn’t name it after myself.

The GRAFT algorithm is the engine of the beast. I compare two sides’ ratings, add the “interstate” factor if warranted, and from that I get a margin and a pair of “par” scores, which set out where each team supposedly stands relative to the other.

After each game I compare the actual results with the par margins and scores and adjust accordingly by a factor of 10% (this actually means the two competing teams will move closer or further apart by 20%, so if they played next week at the same venue the expected margin would be 20% closer to what the actual scoreline was). I’ve messed around with that factor somewhat but 10% seems to get the best results while being flexible enough to account for teams improving or deteriorating over time.
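A minimal sketch of that adjustment step in Python – the ratings and the flat 10% factor here are illustrative, and the real system also works with par scores, attack/defence splits and the interstate factor:

```python
# Sketch of the weekly GRAFT nudge: each team moves 10% of the error,
# so the pair moves 20% closer to (or further from) the actual result.
ADJUST = 0.10

def par_margin(home_rating, away_rating, interstate_bonus=0.0):
    """Expected (home) margin from the two ratings."""
    return (home_rating + interstate_bonus) - away_rating

def update(home_rating, away_rating, actual_margin, interstate_bonus=0.0):
    """Nudge both ratings towards the actual result."""
    error = actual_margin - par_margin(home_rating, away_rating, interstate_bonus)
    return home_rating + ADJUST * error, away_rating - ADJUST * error
```

For example, a 120-rated home side beating a 100-rated visitor by 40 (par 20) moves to 122 against 98; replaying the fixture gives a par of 24, which is 20% of the way from the old expectation towards the actual scoreline.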

I feel it is an absurdly reductive system in some respects, but it is mine! For what I put into it, its capability of self-adjustment is quite effective. Then again, any power rating system worth its salt is based on self-adjustment to at least some degree.

You should take a grain of that salt with GRAFT, though, because it does not take into account player changes, particularly whether the coach decides to rest half the first-choice side before the finals. The lines are based on three things only: the two clubs, and where they’re playing*. You would still have to do some homework based on the ins-and-outs and other ineffables. Of course there are plenty of other great ratings systems out there so you can sort of take your pick.

* – I’ve modelled difference in resting days as another possible factor, but it doesn’t seem to matter much. The dead rubber thing is something I haven’t tested, but since it would affect only a small number of games towards the end of the season, it hasn’t really been a priority of mine – I mean, how can you tell the difference between a team that’s deliberately tanking, a team that doesn’t care about winning anymore, and a team that’s trying its best for their fans but is really actually complete garbage?

We Float

For the 2018 season, I’ve made one major change to the GRAFT system: expressing it on a 1:1 scale with actual scoreboard points. In previous years it was 10 times the scoreboard.

I made this decision for a couple of reasons:

Firstly, I felt scaling the ratings in this way would be more intuitive.

Secondly, for all of RAFT/GRAFT’s life, the ratings had been expressed as integers – practical in most instances, but then I’d get to cases where there was a 5-point difference in the ratings: would I round up or down? By moving to floating-point numbers, while the decimals look a little more daunting in the tables, I can deal with edge cases with fewer of them “too close to call”.

That came in quite handy last weekend, where the working out showed the Swans 0.1 points better than the Giants – with the old integer system I probably would’ve called this too close to call; now, while the line is still pretty much a draw, I could at least pick an expected winner, although not with a great deal of confidence. (The fact that Sydney won by 16 points doesn’t really vindicate that at all, because it’s supposed to be close, dammit!)

Despite the change of scale, the GRAFT system is still pretty much the same; it works out expected scores and then nudges them accordingly based on the actual result.

As a rule of thumb, consider teams at around 120 GRAFT points (or +30 over the league mean) as premiership quality, while 100 points (or mean +10) probably gets you into the finals. Teams that go on a tear like Essendon 2000 or Geelong 2008 have gotten as high as 150, while in GWS’s and Gold Coast’s first few seasons they would’ve been floundering around 40 to 50. Low scores seem to be more common than similarly high scores, but they’re also easier to recover from – it only takes a few good wins or near-misses for a lowly team to get back to the pack.

The Probability Problem

You may be familiar with Elo-based systems – the fun thing about those is that, properly calibrated, they give you the probability of the result straight out of the box. Then you have to figure out the line margin from that.

GRAFT does that arse-backwards. It works out the expected score and margin, from which one has to derive the probabilities.

For the longest time I used a horrible fudge to work out the probability, based on a normal bell curve of fixed shape, which was not constrained at zero (that is, when I was running the Monte Carlo sims, it could throw out negative scores, which in practice I bumped up so at least the scores were positive while I preserved the margin).

So it was really quite an awful hack, statistics-wise. It worked OK in practice in some ways but the sliding normal curve just annoyed me and I was not confident about publicising them on the site.

Besides, the normal distribution doesn’t fit (literally, in the statistical sense). The historical score distribution from AFL/VFL games doesn’t conform to the bell curve; it skews, with a bit of a tail towards the high scores – the median score is slightly below the mean.

I won’t get into too much detail here because the Towards A New Model series of articles covered that, but I went with the Gamma distribution because it was a nice fit with the actual results, both overall and under constraints, and with that I could devise a reasonably sane model to determine probabilities for individual matches, as well as whole seasons using Monte Carlo methods.

It’s worth posting again because I went to a fair amount of trouble working something out:

import scipy.stats

def curve(rating):
    # shape rises with the expected score; location 0, scale fixed at 7.5
    a = rating * .1 + 3
    return scipy.stats.gamma(a, 0, 7.5)

(This is in Python form, “rating” is the expected score according to GRAFT, and the output is a “frozen” distribution model as implemented in scipy.)

Essentially it doesn’t rule out the possibility of a team not expected to do well suddenly cutting loose, but it acknowledges that it is harder than it is for a team on top of the league.
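To illustrate how the frozen curves turn into a match probability, here’s a rough Monte Carlo sketch using numpy’s own Gamma sampler with the same shape and scale (the ratings are made up, and each team’s curve is sampled independently):

```python
import numpy as np

def win_probability(rating_a, rating_b, n=200_000, seed=1):
    """Estimate P(team A outscores team B) by sampling each team's
    Gamma score curve (shape = rating*0.1 + 3, scale = 7.5, matching
    the scipy snippet above)."""
    rng = np.random.default_rng(seed)
    score_a = rng.gamma(rating_a * 0.1 + 3, 7.5, n)
    score_b = rng.gamma(rating_b * 0.1 + 3, 7.5, n)
    return float((score_a > score_b).mean())
```

A 100-rated team against an 80-rated one comes out a clear but not overwhelming favourite, while two equal ratings land at about 50% – the wide overlap of the two curves is exactly the “prove yourself worthy of the rating” conservatism discussed below.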


So now that I am using the Gamma algorithm to calculate means and probabilities, it tends to come across as a bit more conservative than the margin lines that GRAFT comes out with – mostly because, overall, the mean of actual results is more conservative. It’s almost as if, because the GRAFT rating is foremost a form tracker that expresses how each team has performed up to that point, when they step onto the field they have to prove themselves worthy of that rating all over again.

When I was doing the prep work over summer, I found that, averaging across each margin band as determined by GRAFT, the actual margins were about 25% closer to the mean than what GRAFT was expecting. That is, I would check the range of results about 20 points above the global mean, and find the average actual result was more like 15 points – sure, some blowouts for the favourite, and some major upsets against the grain, but overall a little closer to zero than standard GRAFT had expected.
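The check itself is simple to sketch: bucket the games by GRAFT-expected margin and average the actual margins in each bucket. The arrays below are synthetic; the real comparison used the historical results.

```python
import numpy as np

def band_means(expected, actual, width=10):
    """Mean actual margin per band of GRAFT-expected margin."""
    expected = np.asarray(expected, dtype=float)
    actual = np.asarray(actual, dtype=float)
    bands = np.floor(expected / width) * width  # e.g. 22 -> band 20
    return {float(b): float(actual[bands == b].mean())
            for b in np.unique(bands)}
```

If the +20 band averages out at around +15 actual, that’s the ~25% shrink towards the mean described above.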

However, when I decided to tone down GRAFT itself by reducing the weekly adjustment factor, it didn’t actually improve the number of successful tips, and weirdly the actual margin averages were still regressing compared with the expectations.

But decoupling the Gamma line (which took into account that regression) from GRAFT seemed to work much better.

So, the probabilities and lines I am publishing for aggregation are based on the Gamma model; however, the ratings that I use to rank and compare the teams are still based on the usual week-by-week GRAFT model.

Setting a conservative slant on the Gamma model has also come in handy for the season sims because, since it effectively imposes a regression to the mean and brings all the teams closer together (awwww), it actually gives outliers more credence to show up in the mix.

(Regressing ratings to the mean is a pretty common practice in rating systems, in that it is applied in the off-season before the competition begins afresh. Weirdly, though, I don’t actually do that; for each new season, I reset the seed ratings based on the home-and-away performances from the previous season. This is done by weighting the for-and-against totals, halving the scores from matches where the clubs met twice in the season.)
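The exact seeding arithmetic isn’t spelled out here, but the halving of double meetings might look something like this sketch – the game-tuple format and helper name are my assumptions:

```python
from collections import defaultdict

def weighted_for_against(games):
    """games: (team_a, score_a, team_b, score_b) tuples from the
    previous home-and-away season. Pairings that met twice are
    weighted at half, so each pairing contributes once overall."""
    meetings = defaultdict(int)
    for a, _, b, _ in games:
        meetings[frozenset((a, b))] += 1
    points_for = defaultdict(float)
    points_against = defaultdict(float)
    for a, sa, b, sb in games:
        w = 1.0 / meetings[frozenset((a, b))]  # 0.5 if they met twice
        points_for[a] += w * sa; points_against[a] += w * sb
        points_for[b] += w * sb; points_against[b] += w * sa
    return dict(points_for), dict(points_against)
```

The weighted for-and-against totals would then be turned into each club’s seed rating for the new season.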

What’s Next

All of this is still a work in progress. For instance, now that I’ve gotten to this point, I would like to work out and account for covariance – once GRAFT emits the two teams’ expected scores for the Gamma curves, those two curves are independent, which is not quite how footy works; you can’t have both teams scoring at the same time. I think that will be quite tricky to work out, but it’s the next logical step as far as improving the model goes.

In the meantime, the current version is set and being put into practice for match-by-match “predictions” and season projections. I’m really at the next stage of this, which is devising visualisations that I hope will be informative and intuitive, without being too misleading.

Of course, at the moment I have spammed a bunch of numbers under each match info box, but it seems like maybe it’s too much? Anyway. This thing is pretty much a hobby for me (the site is low traffic so my bills are modest) and I get as much fun out of working on the design part of it and coming up with weird charts as I do from the heavy maths.

And just in case you’re wondering, if I’m watching or listening to a game, I don’t even really think about this stuff. I just turn into every other moron yelling baaaaawl (actually I’m probably more “WHAT THE HELL WAS THAT FREE FOR”), have a bit of a laugh if something stupid happens, and then at the end of the game, that’s when I put the numbers in and turn the handle and have a look at what comes out the end.