How the Vodacom Stats XV was selected

The Vodacom Stats XV has a few tongues wagging this week, so Tank Lanning, who was asked to consult on the algorithm used to select it, talks us through the numbers in his Sport24 column.

Tank Lanning

It was fun to see this week’s Vodacom Stats XV revealed to the public for the first time this year. Fun, because nothing gets the tongues wagging like a “Best Of” type team – just ask the selectors of the so-called Varsity Cup Dream Team who took on the Baby Boks this week – and fun, also, given that I was asked to consult on the make-up of the stats algorithm used to put the team together.

This is the team that the algorithm spat out this week:

15. Israel Folau (Waratahs), 14. Francois Hougaard (Bulls), 13. Malakai Fekitoa (Highlanders), 12. Ma’a Nonu (Hurricanes), 11. James O’Connor (Reds), 10. Beauden Barrett (Hurricanes), 9. T.J. Perenara (Hurricanes), 8. Ben McCalman (Force), 7. Boom Prinsloo (Cheetahs), 6. Deon Stegmann (Bulls), 5. Francois Uys (Cheetahs), 4. Eben Etzebeth (Stormers), 3. Josh Hohneck (Highlanders), 2. Liam Coltman (Highlanders), 1. Kane Hames (Highlanders).

Sure, as with every all-star type team, there are a few selections you might question, but it’s not a team that looks like it was selected by a purple-thong-wearing Martian who has been mainlining chlorine for the past six years.

The three biggies for me are the lack of Tahs players given their excellent win over the log-leading Canes, the two Cheetahs players given their loss to the bottom-feeding Reds, and the selection of the Highlanders front row en masse after they held on to beat fellow colourful bottom feeders, the Blues.

But much like we want from the referees and officials today, this team was picked sans emotion, interpretation or influence via pre-game banter in the media. Not because the selector has ice in his veins, but because it was picked by a machine using only the post-match stats. And as they say in the classics, the stats don’t lie!

What the stats do not do of course, given the complete objectivity, is make a call on whether a missed tackle on your tryline is worse than one missed in the opposition 22. Or take into account your fellow centre having bolted out of the line to leave a yawning gap. And given the plethora of stats, they can be manipulated to help bolster most gut feels.

So I thought I would take the opportunity to give some insight into the algorithm we came up with to select this weekly Vodacom Stats XV.

Using the stats categories on the Vodacom Rugby App, we first divided them into positive and negative reflections per position. This was fairly obvious: clean breaks are positive and missed tackles are negative.
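For the code-minded, that first step could be sketched in Python along the lines below. This is only an illustration: "turnovers_conceded" and the helper name signed_value are my own assumptions rather than the actual lists from the app, and the sketch ignores the fact that the split was done per position.

```python
# Hypothetical split of stat categories. "turnovers_conceded" is an
# assumed example; the real lists come from the Vodacom Rugby App.
POSITIVE = {"clean_breaks", "defenders_beaten", "metres_run", "offloads"}
NEGATIVE = {"missed_tackles", "turnovers_conceded"}


def signed_value(category: str, value: float) -> float:
    """Return a stat as counting for (+) or against (-) a player."""
    if category in POSITIVE:
        return value
    if category in NEGATIVE:
        return -value
    raise ValueError(f"unknown category: {category}")


# Three clean breaks count in a player's favour, two missed tackles against.
print(signed_value("clean_breaks", 3), signed_value("missed_tackles", 2))
```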

The tricky part was then assigning a weighting to each category for each position. So of the fifteen positive stats available, which of them is more important to each position? Which of “Clean breaks”, “Defenders beaten”, “Meters run” or “Offloads” is more important when it comes to judging a fullback? As it turned out, we gave “Clean breaks” a top weighting of 15%, “Defenders beaten” a weighting of 13% and the other two came in at 12%.
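To make the weighting idea concrete, here is a minimal Python sketch using only the four fullback weightings quoted above. The remaining categories are left out because their weightings are not given here, and the function name weighted_score and the sample scores are assumptions for illustration only.

```python
# Fullback weightings quoted in the column. The other positive stats
# are omitted because their weightings are not listed.
FULLBACK_WEIGHTS = {
    "clean_breaks": 0.15,
    "defenders_beaten": 0.13,
    "metres_run": 0.12,
    "offloads": 0.12,
}


def weighted_score(category_scores: dict, weights: dict) -> float:
    """Sum each category score multiplied by its positional weighting.

    Categories missing from category_scores contribute nothing.
    """
    return sum(w * category_scores.get(cat, 0.0) for cat, w in weights.items())


# Made-up per-category scores for a hypothetical fullback.
sample = {"clean_breaks": 10, "defenders_beaten": 8, "metres_run": 6, "offloads": 4}
print(round(weighted_score(sample, FULLBACK_WEIGHTS), 2))
```

With these made-up scores the clean breaks contribute the most (0.15 x 10), which is exactly what a top weighting is meant to achieve.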

We then made the call to base the scores not on the actual stats for each player, but on how they ranked in each category in comparison to every other player in the tournament playing in that position. This guards against freak performances in crazy 77-58 type games that do tend to occur in this wild tournament.
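A rough Python sketch of ranking within a position follows. How a rank is turned into points is not spelled out here, so the scheme below (best of N players gets N points, worst gets 1, ties share the higher mark) is purely an assumption, as are the function name rank_points and the made-up players.

```python
from collections import defaultdict


def rank_points(players: list, category: str) -> dict:
    """Score players by rank within their own position for one category.

    Assumed scheme: in a group of N players the best gets N points and
    the worst gets 1. Tied players share the higher points.
    """
    by_position = defaultdict(list)
    for player in players:
        by_position[player["position"]].append(player)

    points = {}
    for group in by_position.values():
        ordered = sorted((p[category] for p in group), reverse=True)
        for p in group:
            # Index of the first occurrence = players strictly better.
            points[p["name"]] = len(group) - ordered.index(p[category])
    return points


# Made-up data: Player A has a freak game with 9 clean breaks.
players = [
    {"name": "Player A", "position": "fullback", "clean_breaks": 9},
    {"name": "Player B", "position": "fullback", "clean_breaks": 2},
    {"name": "Player C", "position": "fullback", "clean_breaks": 2},
    {"name": "Player D", "position": "wing", "clean_breaks": 1},
]
print(rank_points(players, "clean_breaks"))
```

Note how Player A's nine clean breaks earn only one point more than the next best fullback. That is the guard against freak performances at work, and the lone wing is ranked only against other wings.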

It took hours and hours of deliberation, constantly running each iteration over the 2014 stats to see what an overall 2014 “Dream Team” would look like using our algorithm. Eventually we settled on something we like.

It might not be perfect, and we are happy to see the algorithm evolve, but it is one way of bringing the stats to life. If you are keen to get more insight into the stats used to select each position, I would be happy to provide it in next week’s column. Don’t all shout at once now!