Cheltenham trends: are they really useful?

Horse Racing

Cheltenham - the start at last year's Supreme Novices' Hurdle

The publication of the Weatherbys Cheltenham Festival Betting Guide, alongside stats and trends sites such as Gaultstats, puts the emphasis on finding winners at the Cheltenham Festival using trends analysis rather than getting bogged down in the form book.

I’m something of a traditionalist in that I think the most important pointers to how this generation of Cheltenham hopefuls will run are contained in the previous performances of those horses. That is not to say that Abba didn’t have a point when singing “the history book on the shelf/is always repeating itself”: knowing how previous expectations were met or dashed gives a greater degree of understanding than simply relying on official ratings to produce the winners for us.

Any stat needs to be robust

Let’s look at a few factors, to see how the stats can be helpful, and perhaps where they are just as likely to mislead. There is plenty to be seen looking at just 10 years’ worth of data, but a black-and-white, winners-and-losers approach means a limited sample size.

As such, it often misses important info. It’s also true that non-handicap races are more likely to be won by the best performers (where best implies both raw ability and physical readiness for the task at hand). Those trying to find value therefore need to mine a little deeper, looking for horses which aren’t simply likely to win, as reflected in the betting market, but are going to run up to or exceed expectations.

That means widening the parameters of your study to ensure that any trend which is being bandied about is as robust as its win statistics make it look.

Impact value, A/E and PRB

There are many ways of doing this, and I don’t intend to be fully prescriptive as not everyone has the same resources to hand. But it’s easy enough to spot some stats which are clearly shaky, and others which are perfectly robust, but which may be fully factored already. There are measures like impact value, A/E, and PRB which are very helpful in identifying the real historical value of the figures. If you can utilise them, you ought to.

Impact value tells us how actual wins compare to wins apportioned randomly. A/E (actual versus expected) shows how actual wins compare with wins as predicted by the betting market. PRB simply gives the % of rivals beaten, which is a quick way of viewing how well runners in a particular data set perform against all rivals, rather than just whether or not that data set is producing winners.
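For anyone who keeps their own results, here is a minimal sketch of how those three measures might be calculated. The Run record, its field names and the eight sample runs are all invented for illustration, and the A/E figure assumes that 1 divided by the decimal starting price is a fair estimate of each runner's chance, which ignores the bookmakers' margin.

```python
from dataclasses import dataclass


@dataclass
class Run:
    """One horse in one race (hypothetical record, values invented)."""
    in_subset: bool      # does the runner match the trend being tested?
    position: int        # finishing position, 1 = winner
    field_size: int      # number of runners in the race
    decimal_odds: float  # starting price as decimal odds, e.g. 5.0 for 4/1


def impact_value(runs):
    """The subset's share of all winners divided by its share of all runners."""
    subset = [r for r in runs if r.in_subset]
    total_wins = sum(r.position == 1 for r in runs)
    subset_wins = sum(r.position == 1 for r in subset)
    return (subset_wins / total_wins) / (len(subset) / len(runs))


def actual_over_expected(runs):
    """Subset wins divided by the wins implied by the odds (assumed 1 / odds)."""
    subset = [r for r in runs if r.in_subset]
    expected = sum(1.0 / r.decimal_odds for r in subset)
    return sum(r.position == 1 for r in subset) / expected


def percent_rivals_beaten(runs):
    """Average share of rivals beaten by subset runners, as a percentage."""
    subset = [r for r in runs if r.in_subset]
    shares = [(r.field_size - r.position) / (r.field_size - 1) for r in subset]
    return 100.0 * sum(shares) / len(shares)


# Two invented four-runner races; three of the eight runners match the trend.
RUNS = [
    Run(True, 1, 4, 4.0), Run(False, 2, 4, 2.0),
    Run(True, 3, 4, 8.0), Run(False, 4, 4, 8.0),
    Run(False, 1, 4, 2.0), Run(True, 2, 4, 4.0),
    Run(False, 3, 4, 8.0), Run(False, 4, 4, 8.0),
]

print("Impact value:", round(impact_value(RUNS), 2))
print("A/E:", round(actual_over_expected(RUNS), 2))
print("PRB:", round(percent_rivals_beaten(RUNS), 1))
```

An impact value or A/E above 1 says the subset has won more often than its numbers or its prices implied, and a PRB above 50 says its runners have beaten more rivals than they lost to. On a sample this small none of it would mean anything, which is rather the point of the rest of this piece.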

If you are more restricted in the information at hand, there are two slightly simpler ways of establishing whether a stat is robust. First, are the win stats backed up by similar results for placed horses? Second, is a particular trend, established over 10 years, also reflected in previous results, say over a 20-year period? If not, then the stats may be anomalous, and less likely to be repeated this year.

The Willie v Nicky Conundrum

Some stats are very compelling on the surface, but really tell us nothing we don’t know. For example, of the 30 places (including 1st) available in the past 10 runnings of the Supreme Novices’ Hurdle, 16 of those have been filled by horses trained by either Willie Mullins or Nicky Henderson. If hearing that surprises you, then you may have some catching up to do.

The fact that Willie and Nicky dominate the opening race of the Cheltenham Festival is to some degree a self-fulfilling prophecy, and they also dominate by number of runners. A look at the win figures for the last two decades suggests that you should avoid Nicky Henderson in the contest, as his stats show he has had just one winner from 30 runners in that time. Dig deeper, however, and you’ll find that he beats Mullins in terms of both PRB (67.5 vs 56.0) and combined win & place strike rate (43.3% vs 22.2%). Winning is the name of the game, and that cannot be denied, but the wider figures show that the big stables are more closely matched than the win tally suggests.

Put simply, the more a trend produces a result which surprises us, the more interesting it could be in developing value, but similarly, by dint of being surprising, these are the trends which we should be most sceptical of until digging deeper to establish their true value.

Sires and small samples

Every now and then, there is a theory which gains traction that the progeny of Sire X “can’t get up the Cheltenham hill”. This will be based on 30 or 40 consecutive losers for that stallion’s stock, and such a losing run is deemed significant. It’s not, and when Sire X produces two winners on the same day, the theory will jump to another stallion.
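A rough sketch of why such a losing run proves so little. The 5% win chance per runner is an assumption (roughly what you would get if wins were shared out evenly in 20-runner fields), as are the 20 stallions with 40 runners apiece, and each run is treated as independent.

```python
def prob_no_winner(runners: int, win_prob: float) -> float:
    """Chance that `runners` independent runs all lose at a fixed win probability."""
    return (1.0 - win_prob) ** runners


def prob_any_sire_blank(sires: int, runners: int, win_prob: float) -> float:
    """Chance that at least one of `sires` stallions draws a blank over `runners` runs each."""
    return 1.0 - (1.0 - prob_no_winner(runners, win_prob)) ** sires


# Assumed figures for illustration only.
WIN_PROB = 0.05
for n in (30, 40):
    print(n, "straight losers for one sire:", round(prob_no_winner(n, WIN_PROB), 3))
print("at least one of 20 sires blank over 40 runs:",
      round(prob_any_sire_blank(20, 40, WIN_PROB), 3))
```

On those assumptions, 40 straight losers comes up about 13% of the time for any single stallion purely by chance, so with dozens of stallions represented it would be more surprising if none of them had a run like that.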

If you want to get an edge using sire stats, you’ll find Southwell more to your liking. The huge number of stallions represented at Cheltenham means that individual strike-rates are either low or unreliable due to limited data, leading to cockamamie theories which don’t hold water. Sure, Oscar is odds-on to produce another winner at the Festival, but backers of Paisley Park won’t be relying on his sire’s record when lumping on.

The temptation of retrofitting

Retrofitting trends into profitable systems is the devil’s own work. It works fairly simply in that you analyse a large sample of info and find that backing one subset within the group blind has produced a profit.

There are lots of examples of this depending on how you want to slice the data: last-time-out winners in handicaps, horses rated over 150 in novice hurdles, or runners wearing headgear in certain races. You stumble on a system which produces a respectable profit, but then you try to remove losers from the system by adding in extra filters, so that you have a higher strike-rate and a more profitable system.

Sometimes this is sensible, and filters which remove obvious no-hopers are justifiable, assuming you’d happily remove that 66/1 winner which makes another system look so good. The real problem with this is that once you’ve found the angle, you can see which factors are holding the profit down fairly easily, and it’s tempting to use hindsight to massage the figures while pretending that tweaks to the system are purely logical.
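To see how easily hindsight manufactures a system, here is a small simulation, offered as a sketch under stated assumptions rather than a model of any real race. Every runner is priced at fair odds and carries four yes/no tags which are pure noise (think of them as stand-ins for headgear, won last time out and so on). The odds, the number of tags and the sample sizes are all invented.

```python
import random
from itertools import product


def make_runs(n, rng):
    """Simulate n bets at fair odds; the four tags have no bearing on the result."""
    runs = []
    for _ in range(n):
        odds = rng.choice([3.0, 5.0, 9.0, 17.0])        # decimal odds
        tags = tuple(rng.random() < 0.5 for _ in range(4))
        won = rng.random() < 1.0 / odds                  # fair price, no edge
        runs.append((tags, odds, won))
    return runs


def profit(runs, wanted):
    """Level-stakes profit backing every run whose tags match `wanted` (None = don't care)."""
    total = 0.0
    for tags, odds, won in runs:
        if all(w is None or w == t for w, t in zip(wanted, tags)):
            total += (odds - 1.0) if won else -1.0
    return total


def best_filter(runs):
    """Try all 81 tag combinations and keep whichever made the most money."""
    return max(product([None, True, False], repeat=4),
               key=lambda f: profit(runs, f))


rng = random.Random(2020)      # fixed seed so the run is repeatable
history = make_runs(500, rng)  # the results we retrofit to
future = make_runs(500, rng)   # results the system has never seen
chosen = best_filter(history)
print("filter chosen with hindsight:", chosen)
print("profit on the data it was fitted to:", round(profit(history, chosen), 1))
print("profit on fresh data:", round(profit(future, chosen), 1))
```

Because the prices are fair and the tags are noise, no filter here has a genuine edge. Whatever profit the chosen filter shows on the first sample is the product of searching through 81 options with hindsight, and there is no reason to expect it to carry over to the second.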

Beware the false flag – a classic example

Just because a sequence of winning results shows some commonality, it should not be assumed that there is a continuing trend which is responsible for those results. The best example of a false flag like this that I can remember involves a bloke who peddled a system which promised to find the winner of the Grand National, and had predicted seven of the past 10 winners from a small sample. It transpired that the fundamental plank of this brilliant system (and to be fair, it had produced a handsome profit over a decade) was the common link between such diverse winners as Miinnehoma, Lord Gyllene, Earth Summit, Bobbyjo, Papillon, Bindaree and Monty’s Pass. They all have a double letter in their names.
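For the curious, the "system" takes only a few lines to reproduce. This is a sketch using just the seven winners named above; the helper names are my own, and the check that actually matters, how many of the beaten horses also carried a double letter, is left for anyone with a full list of runners to hand.

```python
def has_double_letter(name: str) -> bool:
    """True if any letter is immediately followed by the same letter."""
    letters = name.lower()
    return any(a == b and a.isalpha() for a, b in zip(letters, letters[1:]))


def share_with_double_letter(names) -> float:
    """Proportion of the given names that contain a double letter."""
    names = list(names)
    return sum(has_double_letter(n) for n in names) / len(names)


# The seven Grand National winners named in the text.
WINNERS = ["Miinnehoma", "Lord Gyllene", "Earth Summit", "Bobbyjo",
           "Papillon", "Bindaree", "Monty's Pass"]

for name in WINNERS:
    print(name, has_double_letter(name))
print("share of winners with a double letter:", share_with_double_letter(WINNERS))
```

Run share_with_double_letter over every runner in those races and you have the base rate. If a large share of the losers had double letters too, the winners' record is exactly what coincidence would produce.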

So, the final word on this subject is to broaden the scope of any analysis to provide a clearer picture, be wary of small sample sizes producing the answers you want to hear, and don’t allow yourself to be fooled by coincidence.
