The Black Orifice

Tabletop gaming resources and events from grumpy old games designers, Ben Redmond and Nigel McClelland

The Guild Ball Rankings and What They Can Tell You

The Guild Ball rankings system I created has been up and running for a while now. With the introduction of a new feature that lets you filter the guild summary table by player win ratio, I thought I would take the opportunity to put together a blog post explaining what this data means.

How to Manipulate the Table

The table on the right hand side of the rankings page shows a summary of all the results recorded in the database to give an impression of how successful the different guilds are under competitive tournament play.

There are a number of features available that allow you to limit the data used to form this table.

  • Firstly, you can change the range of dates that the table uses in its data analysis. This feature allows you to input significant dates that might induce a change in the tournament “metagame”, such as an errata or release of a new captain model.
  • Secondly, you can restrict the data by player win percentage, allowing you to focus on just the top, middle or bottom players. By win percentage, I mean the number of games each player has won out of the total number of games they have played within the range of dates you are searching. This lets you see which guilds are best for different types of players – if you’re hopeless, what’s the best team for you? If you think you’re a top player, which teams are other top players playing, so you know what you might come up against?
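
The two filters described above can be sketched in a few lines of Python. This is only an illustration, not the code behind the rankings page: the record layout (player, guild, date, win flag), the sample names and the function names are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import date

# Hypothetical records: (player, guild, date played, won?). The real database
# layout is not described in the post, so these fields are assumptions.
GAMES = [
    ("Alice", "Union", date(2015, 11, 1), True),
    ("Alice", "Union", date(2015, 11, 1), True),
    ("Alice", "Union", date(2016, 1, 9), False),
    ("Bob", "Butchers", date(2015, 11, 1), False),
    ("Bob", "Butchers", date(2016, 1, 9), True),
    ("Cara", "Masons", date(2015, 9, 5), True),
    ("Cara", "Masons", date(2016, 1, 9), False),
    ("Cara", "Masons", date(2016, 1, 9), False),
]


def player_win_percentages(games, start, end):
    """Win percentage per player, counting only games dated start..end."""
    played = defaultdict(int)
    won = defaultdict(int)
    for player, _guild, played_on, is_win in games:
        if start <= played_on <= end:
            played[player] += 1
            won[player] += is_win  # True counts as 1, False as 0
    return {p: 100.0 * won[p] / played[p] for p in played}


def filter_games(games, start, end, min_pct, max_pct):
    """Keep in-range games whose player sits inside the win percentage band."""
    pct = player_win_percentages(games, start, end)
    return [
        g for g in games
        if start <= g[2] <= end and min_pct <= pct[g[0]] <= max_pct
    ]
```

Note that the win percentage is recalculated for whatever date range is selected, so the same player can land in a different band when the dates change.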

I would recommend, for any other data-nerds out there like myself, that you have a play and see what you can come up with and what changes it makes.

How significant are the differences between the teams?

I have carried out some basic statistical analysis to try to determine whether or not the data we’re looking at can be seen as “significant”. Let me start with a disclaimer: I am not a statistician. I’m not even much of a mathematician (grade C at A Level, 20-odd years ago). I have since taught Maths, under protest, to Year 7 pupils, but I don’t really have a deep understanding of how these tests work. I do have a family member who is a statistician, though, and I have leaned heavily on them to try to work out what this data is telling us and how we can interpret it. Perhaps someone with a more thorough grasp of the mathematics can let me know whether what I’ve done makes any sense.

Under advice, I have used a Kruskal-Wallis test. This is a test designed to show whether the pattern in ranked data (which tournament results are) that can be grouped into 3 or more categories (we have 8 guilds, currently) is significant. By that we mean: what is the chance that this pattern came about by fluke rather than by virtue of the factors affecting these results (such as the strength of the guild, the skill of the players using them, the matchups within the events, etc.)?
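
For anyone who wants to see the mechanics, here is a minimal sketch of a Kruskal-Wallis test in plain Python. It is not how the results in this post were calculated (the post does not say which tool was used), and the sample numbers are made up. It pools all scores, ranks them (ties share the average rank), computes the H statistic with the usual tie correction, and reads the p-value from a chi-square distribution with one fewer degree of freedom than there are groups. That last step is an approximation that is normally considered reasonable when each group has more than a handful of results.

```python
import math


def average_ranks(values):
    """Rank values from 1..N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j hold ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def chi2_sf(x, df):
    """Upper tail probability of a chi-square distribution (whole number df)."""
    half = max(x, 0.0) / 2
    if df % 2 == 0:
        a, q = 1.0, math.exp(-half)
    else:
        a, q = 0.5, math.erfc(math.sqrt(half))
    # Recurrence Q(a + 1, x) = Q(a, x) + x^a e^-x / Gamma(a + 1)
    while a < df / 2:
        q += half ** a * math.exp(-half) / math.gamma(a + 1)
        a += 1
    return q


def kruskal_wallis(groups):
    """Return (H, p) for a list of groups, each a list of scores."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, pos = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[pos:pos + len(g)])
        total += rank_sum ** 2 / len(g)
        pos += len(g)
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    # Tie correction; this divides by zero if every score is identical.
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    h /= 1 - sum(t ** 3 - t for t in counts.values()) / (n ** 3 - n)
    return h, chi2_sf(h, len(groups) - 1)


# Made-up scores for three groups, purely for illustration.
h, p = kruskal_wallis([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(round(h, 2), round(p, 4))  # 7.2 0.0273
```

A small p-value only says that the groups are unlikely to share the same distribution of scores; it does not say which group is ahead or by how much.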

What has been surprising as I have been completing these tests is that every test I have done shows an incredibly strong significance. For every test, there has been a less than 0.01% chance that the results were due to random chance. Even results from the second errata only a month ago show as significant, even when also restricted by player win percentage. This makes me nervous. It makes me think I’ve done something wrong, but for now I’m going to go with it and assume it’s good and telling us that the pattern in the data holds true.

However, it is also important to note that this tells us that the pattern of data is significant, not necessarily that one guild is stronger than another: if the data shows little difference between the guilds, it shows that it is reliable to say that the guilds are very similar in power level.

What Dates Make a Difference

At the time of writing this we’ve only had 4 captains released, and only 2 errata changes. The data doesn’t record which captain each team is using, so it’s difficult to see how much of an effect the captains are having on the results. Errata dates, on the other hand, are clear and easily defined dates, and because we’re looking at tournament data, we know that the errata changes will have been used in the games we’re looking at.
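
Splitting the results at the errata dates can be sketched like this. The two dates are the ones given in this post; everything else (the record layout, the period labels, and the choice to count games played on the errata date itself as using the new errata) is an assumption for the sake of the example.

```python
from datetime import date

# Errata dates as quoted in this post, in chronological order.
ERRATA = [
    ("Errata 2.0", date(2015, 10, 20)),
    ("Errata 2.1", date(2016, 2, 18)),
]


def period_for(played_on):
    """Name the errata period a game date falls in."""
    label = "Pre-errata"
    for name, starts in ERRATA:
        if played_on >= starts:  # assumption: the errata applies from its date
            label = name
    return label


def win_rates_by_period(games):
    """Map period -> guild -> win percentage from (guild, date, won) rows."""
    tallies = {}
    for guild, played_on, is_win in games:
        by_guild = tallies.setdefault(period_for(played_on), {})
        cell = by_guild.setdefault(guild, [0, 0])  # [wins, games]
        cell[0] += is_win
        cell[1] += 1
    return {
        period: {g: 100.0 * w / n for g, (w, n) in guilds.items()}
        for period, guilds in tallies.items()
    }
```

Comparing the per-guild figures for two adjacent periods gives the kind of before and after view used in the sections that follow.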

Errata 2.0: 20th October 2015

fig1: Errata 2.0 comparison

An interesting analysis can be done looking at the data before and after this date to see if this massive errata had a notable impact on the tournament meta. Looking at the data before the errata, we can see how the Union stood out as the strongest team, but also that there are notable gaps between each team in the spread of data, with Butchers (perhaps a simpler team to play) at the top and Morticians and Alchemists (perhaps the hardest teams to get your head around) at the bottom. After the errata, though, the picture shifts quite dramatically. The teams move much closer together. The Union still remain on top as a significant power team, but, excluding the Engineers, the rest of the teams come to within 28 ranking points and within 4 percentage points in wins. The Engineers were the team worst hit by the errata, dropping down to the bottom of the table from 3rd place!

It is also worth noting that this period saw the introduction of new players. It is difficult to tell how much of an impact these players directly had on the results shown here, but (anecdotally speaking) two of the most common Season 2 players I have encountered are Sakana and Mash, and so seeing those players entering the meta may explain the boost for Fishermen and Brewers.

Errata 2.1: 18th February 2016

fig2: errata 2.1 comparison

We can perform a similar analysis on the data for the 2.1 errata, but when we do so, we also need to consider that this period in the meta has also seen the introduction of 4 new captains, so some of the shifting effects shown could be a result of the new captains rather than the errata. However, looking at the data we can again see the data in the middle of the table is very tightly packed. I tested the significance of the results if ordered by win percentage as well as by ranking points and it was significant whichever way you order it, suggesting that the significant factor here is how close these teams are. Probably most significant is the increase for the Engineers, bringing them back into the middle of the pack – Pin Vice presumably the significant factor here. Union remain dominant, but they have moved closer to the main pack.

What is unexpected in this comparison, however, is that the Morticians appear to have, anomalously, slipped down to the bottom of the table, and outside of the middle grouping we’re seeing with everyone else. This leads on to a point I want to discuss as an additional deciding factor – player skill. One thing that might explain the drop in the Morticians during this period is the number of games played by the top 2 ranked players in this time period, Steve Newton and Jonnie Cannon. Both players have been prolific Morticians players and have earned notably more ranking points than the third and fourth placed Morticians players. Could the drop in the Morticians’ form be because these two have only played one event between them during this period?

Do Players make a difference?

A common argument when these rankings are discussed is whether or not the top players make a difference. Are the Union at the top of the table because they’re a stronger guild, or is it because they are most often played by the stronger players? Given the number of games played by players in the top ten (Chris Hay, Jack Newton, Rob Smith, Greg Day and Ben Allen) it’s certainly clear to see that the Union are popular with the top players, but does the strength of the Union carry through to the other players, or are the results being skewed by the number of top players using them? In the past, I’ve suggested that the latter is probably the case without having fully investigated it. Now that I have the tools, it’s a good time to test this hypothesis.

It was to answer questions like these that I added the player win percentage feature. It enables you to look at the data by filtering players by, essentially, how good they are. Or at least how well they have performed in the tournaments they have attended.

The Middle Third

Linking on from the previous section, I wanted to check what the impact is if we take out the top players. There are many more things we can do with this if we want, and I’ll be exploring some of those later. The method I have chosen for looking at this data is to explore players with win percentages between 33% and 67% - the middle third. I like this range because it clusters around the same win percentage and average ranking points values as the whole table. It also removes both the very top and very bottom players so that you can get an idea of whether these players are making a difference, as implied by some of the discussion. One thing to bear in mind, though, is that just by nature of the reduced range of data being used, the differences between the guilds will naturally shrink, so we perhaps need to accept a tighter range than we have done previously when looking at this data.
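
As a rough sketch of the middle third idea, the snippet below keeps only players whose overall win percentage falls between 33% and 67% and then rebuilds a small guild table from their results. The row layout, the sample players and the numbers are invented for illustration, and the real rankings page may well calculate its figures differently.

```python
from collections import defaultdict

# Hypothetical event results: (player, guild, wins, games, ranking points).
RESULTS = [
    ("Alice", "Union", 4, 4, 120.0),
    ("Bob", "Union", 2, 4, 60.0),
    ("Cara", "Fishermen", 2, 3, 70.0),
    ("Dev", "Morticians", 1, 3, 40.0),
    ("Dev", "Morticians", 2, 3, 50.0),
    ("Eve", "Butchers", 0, 3, 10.0),
]


def guild_summary(results, min_pct=33.0, max_pct=67.0):
    """Summarise guilds using only players inside the win percentage band.

    Returns guild -> (win percentage, average ranking points per event).
    """
    wins, games = defaultdict(int), defaultdict(int)
    for player, _guild, won, played, _points in results:
        wins[player] += won
        games[player] += played
    keep = {p for p in games
            if min_pct <= 100.0 * wins[p] / games[p] <= max_pct}

    totals = defaultdict(lambda: [0, 0, 0.0, 0])  # wins, games, points, events
    for player, guild, won, played, points in results:
        if player in keep:
            t = totals[guild]
            t[0] += won
            t[1] += played
            t[2] += points
            t[3] += 1
    return {g: (100.0 * w / n, pts / events)
            for g, (w, n, pts, events) in totals.items()}
```

With the sample rows, Alice (100%) and Eve (0%) fall outside the band, so the Union figures come from Bob alone. That is the effect being tested here: whether a guild still looks strong once its best players are removed.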

fig3: the middle third

Looking at this data, we can see initially that it all seems pretty tightly packed together, especially under win percentage. There’s a bit more of a spread with average ranking points, but there’s still only a 12% difference between Fishermen and Morticians, and only a 6% difference between Brewers and Morticians. Basically, everyone apart from the Union is tightly packed. The difference between ranking points and win percentage, particularly for the Fishermen, might be explained by how tie breaks work under the OP pack - essentially based on how many VPs you score in the games you lose – which arguably favours teams that find it easy to score some VPs early in the game, such as Fishermen, who can often manufacture a goal from any Shark activation. I would interpret this to show that between Fishermen and Morticians, there’s very little difference to talk about.

However, the Union are still notably strong, even with the top players excluded. It appears my initial hypothesis was incorrect, and the Union is still a slightly stronger team than the rest. Whether this will even out with the introduction of the new captains we will have to wait and see. But, for those of you complaining about the Chain Grab and Blind nerfs (not mentioning any Bill Andersons), I’m sorry, but Union were strong and some sort of nerf was indeed required.

fig4: middle third and erratas

We can also look at this middle third to see how the erratas have affected the rankings of the guilds. This creates an interesting, if somewhat muddled, picture at this point. It perhaps offers more questions than answers. Why have the Fishermen jumped up to the top after errata 2.1 – is Corsair that good? Why have the Butchers dropped with each – is it just that people are figuring out how to beat them? Why do the Masons seem to have separated themselves from the pack – are people only just now waking up to how to use Chisel? I think this is certainly something to watch, to see how the data progresses with the introduction of the Hunters and when there’s a full set of Season 2 captains available. The one question I think it does answer, however, is the one that I posed when I started this part of the discussion – is the Morticians dip after errata 2.1 just a “Steve Newton Effect”? It appears the answer is “Yes”. In this middle third group, Morticians are not the bottom team and are in the main cluster from Engineers in fourth down to Butchers in eighth.

Upper and Lower Thirds

fig5: lower and upper thirds

Looking at the top and bottom thirds is also useful, as it can give us some idea of what the best teams are for beginners and what teams you need to be thinking about facing if you’re wanting to challenge for the top spots at events, or which teams might give the best players the strongest options. I’ve also restricted this data to after errata 2.0 to give perhaps a better picture of the current meta.

When looking at the lower third, I think it is better to look at the win percentage than the ranking points score. The latter likely reflects players doing badly at bigger events, and if you’re the sort of player who struggles to get wins, you want to know which teams are more likely to get you the wins than which teams will see you placed higher amongst the players who are getting 1 or 2 wins. The data would therefore suggest that Butchers, Engineers and the Union are perhaps the best choices, with Fishermen requiring a more delicate touch.

For players wanting to aim high, you should certainly expect to face some Union teams on the way. The Union is by far the most popular team amongst the top third players. Butchers and Engineers are probably worth avoiding – which is particularly interesting considering that these were the two teams highlighted as being best for the lower third players. Alchemists are in a funny position given that they are placed sixth in the table on average ranking points, yet joint second highest on win percentage. That leads me to believe that they must have been successful at smaller events, so they are perhaps a good team if you have a smaller local meta, but whether they will do as well if you have the sort of meta that supports regular 32+ player events may be more questionable.

Some Conclusions

I’m at the end of the investigations I have done and feel I need to draw some final conclusions to leave you with, rather than just leave it hanging. Hopefully I have shown, mainly, how well balanced the game is. Not only that, but when we look at the impact of each errata and the different releases, we can see how with each step Steamforged are making the game even more balanced. There are teams that are better for beginners, and teams that reward a high skill level of play, but overall this has to be the most balanced miniatures game I have experienced, and the data backs that up.