TrueNicks community user russf posted a comment under the recent blog post "Updated Cross Notations for TrueNicks" that we thought warranted a closer look at the logic flow behind the TrueNicks algorithm.
russf commented:
Congrats on these updated cross notations, this is helpful. But please help me understand how the algorithm decides which stallion to use.
For example, I’m not sure why the winner of this year’s Arlington Million gets a TrueNicks score of A and not C. Debussy’s TrueNicks score of A uses Sharpen Up and his sons and grandsons over Sadler's Wells and his sons and grandsons. Why wouldn’t the model use Diesis instead of Sharpen Up from the top line? That pairing gets a TrueNicks score of C. Apparently, the Debussy algorithm passed by Diesis and went to Sharpen Up. Yet there seems to be a sufficient number of stakes winners pairing Diesis and his sons and grandsons over Sadler's Wells and his sons and grandsons. To test that, I ran a TrueNicks score for the hypothetical pedigree of Sweet Return (GB) over Opera Comique, dam of Debussy. That score was C, using Diesis over Sadler's Wells. I realize that the TrueNicks score for SR/OC couldn’t use Sharpen Up (he’s beyond the grandsire of the sire), but if Diesis is used for Sweet Return over Opera Comique, why couldn’t it be used for Debussy?
Byron Rogers, partner of Pedigree Consultants and co-developer of TrueNicks, replies:
russf, thanks for the comment. You raise an interesting example of the logic flow of TrueNicks.
Debussy is a relatively rare case: an older sire paired with a very young broodmare sire, which forces the calculation to gather a large number of examples on the broodmare sire line to create a rating. In the original logic flow behind TrueNicks, well before it was released to the public, we did in fact have the option of rating "the sire and his sons" with mares by "the grandsire of the broodmare sire, his sons and grandsons," but two factors prompted its elimination from the logic flow that you see in use today.
Firstly, especially in the case of older stallions, we wanted to maintain the primacy of the sire. In the case of an older stallion like Diesis, where there was enough data for the sire himself to be included in the data set, we didn’t think it best to have the pedigree move "forward" of the sire in question unnecessarily. Keeping the data set as relevant to the sire in question as possible was a primary concern of ours. So in the case of Debussy, we viewed "Sharpen Up and his sons", which includes all data created by Diesis himself, as more relevant than "Diesis and his sons". Both sets would include data for Diesis himself, but we found it slightly more relevant for a stallion if the calculation was made on his own sire and that sire's sons rather than on him and his sons. The latter has more potential for variability, especially in the early years, when the first sons of a sire tend to visit a wider variety of broodmare sires than their own sire does at the same time.
Secondly, getting back to my original statement, we did actually have this line ("Diesis and his sons") in the original TrueNicks mathematical logic. There were in fact 28 logic flow rules in the original TrueNicks logic. However, when we conducted the two split tests of 50,000 horses each, only a very small share of the population was calculated on this rule (and on a number of others). It applied only in cases like Debussy's, where older stallions were bred to mares by young broodmare sires. The main example of this situation was actually Cozzene, who for some reason in his later years served a lot of mares by young, unproven broodmare sires. After going back through the results, we pared the number of logic flow rules down from 28 to 16, eliminating many of the rules, like this one, that didn’t have a significant number of calculations made on them. When we re-tested the results on cases like Cozzene's, we found very little difference in the rating being displayed, even though it was now calculated on another rule that fell slightly later in the logic flow. Eliminating these unnecessary logic flows also allowed us to return the result slightly faster to the user.
The update to the cross notation gives you a little more information on the development of the TrueNicks rating, and we hope to bring more changes like this in the weeks to come that will allow you to better understand the creation and validity of the TrueNicks rating. Thanks again for your question, and if you have any follow-up questions please don't hesitate to comment.
–Byron Rogers
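For readers who think in code, the idea of an ordered logic flow with a rule removed can be sketched roughly as follows. This is only an illustration and not the actual TrueNicks implementation: the rule labels, the pedigree keys, the `MIN_SAMPLE` threshold, the placeholder broodmare sire names, and every sample count are invented for the example. The only things taken from the reply above are the general shape (rules tried in order, the first one with enough data wins) and the fact that a "sire and his sons" over "grandsire of the broodmare sire" rule once existed and was later dropped.

```python
from collections import Counter

# Hypothetical rules: (label, sire-side key, broodmare-sire-side key).
# Order matters: the first rule with enough data is the one used.
FULL_RULES = [
    ("sire x bms", "sire", "bms"),
    ("sire x bms sire line", "sire", "bms_sire"),
    ("sire line x bms grandsire line", "sire", "bms_grandsire"),
    ("sire's sire line x bms grandsire line", "sire_sire", "bms_grandsire"),
]

# The same flow with the rarely used third rule removed.
PRUNED_RULES = [r for r in FULL_RULES if r[0] != "sire line x bms grandsire line"]

MIN_SAMPLE = 20  # assumed threshold, not a published TrueNicks figure


def pick_rule(pedigree, sample_sizes, rules, min_sample=MIN_SAMPLE):
    """Return (label, cross) for the first rule whose cross has enough data."""
    for label, sire_key, bms_key in rules:
        cross = (pedigree[sire_key], pedigree[bms_key])
        if sample_sizes.get(cross, 0) >= min_sample:
            return label, cross
    return None, None  # no rule had enough data to rate the horse


def rule_usage(pedigrees, sample_sizes, rules):
    """Count how many horses end up rated on each rule."""
    return Counter(pick_rule(p, sample_sizes, rules)[0] for p in pedigrees)


# A Debussy-like pedigree; the broodmare sire names are placeholders.
debussy_like = {
    "sire": "Diesis",
    "sire_sire": "Sharpen Up",
    "bms": "Young broodmare sire",
    "bms_sire": "Sire of young broodmare sire",
    "bms_grandsire": "Sadler's Wells",
}

# Invented sample counts, for illustration only.
sample_sizes = {
    ("Diesis", "Sadler's Wells"): 25,
    ("Sharpen Up", "Sadler's Wells"): 60,
}

with_old_rule = pick_rule(debussy_like, sample_sizes, FULL_RULES)
with_current_flow = pick_rule(debussy_like, sample_sizes, PRUNED_RULES)
usage = rule_usage([debussy_like], sample_sizes, FULL_RULES)
```

With these made-up numbers, the full rule list stops at Diesis over Sadler's Wells, while the pruned list falls through to Sharpen Up over Sadler's Wells, which is the behaviour russf noticed. Running `rule_usage` over a large test population is one plausible way to spot rules that rate very few horses, in the spirit of the split tests described in the reply.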