Thursday, June 14, 2012

Do 50-1 longshots in the Kentucky Derby ever come in?

(I delayed posting this until after the Belmont Stakes since I wanted to see if there would be a Triple Crown winner.
Alas, I'll Have Another was scratched. From what I've heard it was the right decision.)

Watching the Kentucky Derby, my wife asked: Do 50-1 shots ever win? Since the Derby has been run 138 times, it is likely that a 50-1 shot came in at least once. That is, of course, if the odds were correctly figured. Were they?

Essentially yes. For the ten longest longshots to win the Derby I call a horse UNDERVALUED if they showed (came in 1st or 2nd or 3rd) in at least one of the other legs of the Triple Crown, and FLUKE if not. If there are other reasons to say FLUKE I do so. Sometimes I say DON'T KNOW.

1. Donerail won in 1913 and was 91-1. (The longest long shot to win.) The prob that a 91-1 shot would NEVER come in after 138 races is (1 - 1/91)^138, which is roughly 0.2176. So while this is possible, it's more likely that such a long shot would come in. I could not find information on whether Donerail ran in the Preakness or the Belmont Stakes. Even though I don't know, I'll say FLUKE with odds that long.
2. Mine That Bird won in 2009 and was 50.6-1. The prob that a 50.6-1 shot would NEVER come in after 138 races is (1 - 1/50.6)^138, which is roughly 0.0636. Extremely unlikely, hence not at all surprising that such a horse won at some point, though surprising when it happens. He finished second in the Preakness and third in the Belmont Stakes. UNDERVALUED!
3. Giacomo won in 2005 and was 50.3-1. Two horses that were roughly 50-1 have won the Derby.  I leave it to the reader to calculate the prob that only one 50-1 horse wins. I suspect that it is quite low, so having two of them win is reasonable.  He finished third in the Preakness and seventh in the Belmont Stakes. UNDERVALUED. I recall at the time thinking that betting he would show in the Belmont Stakes would be a sure thing. Alas, there is no such thing as a sure thing.
4. Gallahadion won in 1940 and was 35.2-1. (The website mislabels the picture as being from 2005 and misspells the horse's name by leaving out the a after the ll. What are the odds of that?) He finished third in the Preakness and did not show in the Belmont Stakes. UNDERVALUED.
5. Charismatic won in 1999 and was 31.3-1.  He won the Preakness and came in third in the Belmont Stakes. UNDERVALUED!
6. Proud Clarion won in 1967 and was 30-1. He finished third in the Preakness and fourth in the Belmont Stakes. UNDERVALUED.
7. Exterminator won in 1918 and was 29.6-1. Wikipedia has nothing on the Preakness or Belmont Stakes, so I assume he didn't run in them. DON'T KNOW.
8. Dark Star won in 1953 and was 24.9-1. A better name would have been Dark Horse. He was fifth in the Preakness, possibly because he got injured running it; that injury ended his career. DON'T KNOW.
9. Thunder Gulch won in 1995 and was 24.5-1. He came in third in the Preakness and won the Belmont Stakes. UNDERVALUED!
10. Stone Street won in 1908 and was 23.7-1. Wikipedia does not have anything about his performance in the Preakness or the Belmont Stakes, so I assume he didn't run in them. Wikipedia DOES report that his time, 2:15 1/5, is the slowest winning time ever in the Derby. Based just on this I say FLUKE.
11. Animal Kingdom won in 2011 and was 20-1. He finished second in the Preakness and didn't show in the Belmont Stakes (I could not find how well he did). UNDERVALUED.
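
The probability calculations in items 1 to 3 can be reproduced with a few lines of Python. This is only a rough sketch under the same simplifications the post makes: each Derby is treated as an independent trial with exactly one horse at the given odds, and a horse at k-1 is given win probability 1/k (strictly, k-1 odds correspond to 1/(k+1), which changes the numbers only slightly). The function names are made up for illustration.

```python
from math import comb

RACES = 138  # number of Derbies run as of the post


def prob_never(odds: float, races: int = RACES) -> float:
    """Chance that an `odds`-to-1 shot never wins in `races` independent races.

    Uses the post's approximation p = 1/odds (an assumption, not the exact
    conversion from odds to probability).
    """
    p = 1.0 / odds
    return (1.0 - p) ** races


def prob_exactly(k: int, odds: float, races: int = RACES) -> float:
    """Binomial chance that exactly k such longshots win in `races` races."""
    p = 1.0 / odds
    return comb(races, k) * p**k * (1.0 - p) ** (races - k)


if __name__ == "__main__":
    for odds in (91, 50.6):
        print(f"{odds}-1 never wins in {RACES} races: {prob_never(odds):.4f}")
    # The exercise left to the reader in item 3: how many 50-1 shots win?
    for k in range(4):
        print(f"exactly {k} wins at 50-1: {prob_exactly(k, 50):.4f}")
```

The second function is the binomial formula, so the k = 0 case agrees with prob_never by construction.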

1. To answer my wife's question- in 138 Derbies a 50-1 or better shot came in 3 times, which I think is about right, or at least not surprising. So they DO win.  Sometimes.
2. I was surprised that 5 of the 10 longest longshots were since 1999.  I would think people know more about racing now and so a real long shot is less likely. That may be the wrong way to think about it.
3. Should you bet on longshots? If you always bet x on the longest longshot in the Kentucky Derby, and (I have not checked this) the first, second, and third on the list above were the longest longshots when they ran, you would be ahead by roughly 190x - 135x = 55x dollars (about 91x + 50.6x + 50.3x in winnings against 135 losing bets of x each).
4. How do you tell if a long shot is undervalued (the odds should have been better) or a fluke (the odds were correct but something weird happened)? Here are three ways: (1) See how they did in the other two legs of the Triple Crown, (2) See their runtime in the Derby, or (3) for each race see if there were odd circumstances. For example, sometimes it all depends on whether it rained last night.
5. By my measure there were six undervalued, two flukes, and two don't knows. I don't know if this means my measure is wrong.
6. Actually I'm GLAD that long shots sometimes come in. If not then they are being overvalued.  This is similar to why David Pennock wrote Why we're happy we got one prediction wrong on Super Tuesday. I saw his talk the same week as the Derby, and both jointly inspired this post. What are the odds of that?
7. I asked Dave Pennock about these issues and got two interesting thoughts from him:

1. There is evidence racetrack odds are just about right (efficient) except with a slight "favorite-longshot bias" where favorites win a little too often and longshots not quite enough (people bet a little more than they should on longshots).
2. It is an interesting hypothesis that odds should be less long with more information. I'm not sure that's true. With more information, the entropy of the distribution should go down, which intuitively might lead to longer long shots (and surer sure things)?
8. Dave also told me that there has been A LOT of work on this.  Here are some references:

1. There is a whole edited volume on the subject: Efficiency of Racetrack Betting Markets
2. Horse racing: Testing the efficient markets model by Wayne W. Snyder.  J. of Finance, V. 33, No. 4, 1978, pp. 1109--1118.
3. Probability and utility estimates for racetrack bettors by Mukhtar M. Ali. J. of Political Economy, V. 85, No. 4, 1977, pp. 803--816.
4. Utility analysis and group behavior: An empirical study by Martin Weitzman.  J. of Political Economy, V. 73, No. 1, 1965, pp. 18--26.
5. Anomalies: Parimutuel betting markets: Racetracks and lotteries by Richard H. Thaler and William T. Ziemba.  J. of Economic Perspectives, V. 2, No 2, 1988, pp. 161--174.
6. Gambling and rationality by Richard N. Rosett.  J. of Political Economy, V. 73 No. 6 pp. 595-607, 1965.
7. Informed traders and price variations in the betting market for professional basketball games by John M. Gandar, William H. Dare, Craig R. Brown, and Richard A. Zuber.
8. Information incorporation in online in-game sports betting markets by Sandip Debnath, David M. Pennock, Steve Lawrence, Eric J. Glover, and Lee Giles.  Proceedings of the Fourth Annual ACM Conference on Electronic Commerce, pages 258-259, 2003.
9. How accurate do markets predict the outcome of an event? the Euro 2000 soccer championships experiment by Carsten Schmidt and Axel Werwatz.  Technical Report 09-2002, Max Planck Institute for Research into Economic Systems, 2002.
10. Here are some fun books on the subject: Sharp Sports Betting by Wong and Calculated Bets by Skiena. I reviewed Skiena's book in my SIGACT NEWS book review column here.

1. The runner-up for this year's mathematics of gambling paper goes to "How to gamble if you're in a hurry" by Evangelos Georgiadis, Doron Zeilberger and Shalosh Ekhad. It appears that Dave's latest work/paper got inspired by it. All very interesting stuff.
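
The back-of-envelope betting arithmetic from the "Should you bet on longshots?" remark above can be checked with a short Python sketch. It assumes, as the post does without having verified it, that the three longest-priced winners were each the longest shot in their field, that a flat stake x is bet on every Derby, and that a win at k-1 returns a profit of k times the stake. The function name is hypothetical.

```python
def net_winnings(winning_odds, races=138, stake=1.0):
    """Net result of a flat `stake` on the longest shot in every race.

    `winning_odds` lists the odds (k for k-1) of the races where that bet won.
    Assumes a winning bet pays k * stake in profit and a losing bet costs stake.
    """
    wins = len(winning_odds)
    profit_from_wins = stake * sum(winning_odds)
    lost_stakes = stake * (races - wins)
    return profit_from_wins - lost_stakes


if __name__ == "__main__":
    # Donerail, Mine That Bird, Giacomo (odds taken from the list above)
    result = net_winnings([91, 50.6, 50.3])
    print(f"net result in units of x: {result:.1f}")
```

The rounded figures in the post (190x and 55x) correspond to the unrounded sum 191.9x and the 135 losing bets.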