As baseball starts its second week, let's reflect a bit on
how data analytics has changed the game. Not just the Moneyball phenomenon of
ranking players but also the extensive use of defensive shifts (repositioning
the infielders and outfielders for each batter) and other maneuvers. We're not
quite to the point that technology can replace managers and umpires but give it
another decade or two.
We've seen a huge increase in data analysis in sports. ESPN
ranked teams based on their use of analytics and it correlates well with how
those teams are faring. Eventually everyone will use the same learning
algorithms and games will just be a random coin toss with coins weighted by how
much each team can spend.
Steve Kettmann wrote an NYT op-ed piece, "Don't Let Statistics Ruin Baseball." At first I thought this was just another Luddite who will be
left behind but he makes a salient point. We don’t go to baseball to watch the
stats. We go to see people play. We enjoy the suspense of every pitch, the one-on-one
battle between pitcher and batter and the great defensive moves. Maybe
statistics can tell which players a team should acquire and where the fielders
should stand, but it is still people who play the game.
Kettmann worries about the obsession of baseball writers
with statistics. Those who write based on stats can be replaced by machines.
Baseball is a great game to listen to on the radio, for the best broadcasters don't
talk about the numbers; they talk about the people. Otherwise you might as well
listen to competitive tic-tac-toe.
XKCD got it right on sports and randomness: https://xkcd.com/904/
Nate Silver responds to the op-ed and includes a bonus podcast, "Nate Silver Talks with Steve Kettmann":
http://fivethirtyeight.com/datalab/dont-let-op-eds-ruin-baseball/
Is science more than (grant income) data?
http://www.timeshighereducation.co.uk/news/imperial-college-professor-stefan-grimm-was-given-grant-income-target/2017369.article
Most of the concern in the op-ed seems to be on statistics ruining sportswriting rather than ruining baseball itself. Much of sportswriting is very straightforward - and, independent of statistics, it isn't hard to do as good a job as basic sportswriters - maybe that is why sportswriting is where many reporters begin.
This aspect can be largely automated: see here and here.
The much less emphasized concern in the op-ed is fantasy leagues dominating interest in the real games themselves and here I think that there is a point. I am surprised that you did not mention them and the op-ed piece only mentioned them in passing. Fantasy leagues are major money-makers for the leagues so they aren't going away. They may be somewhat representative in baseball but in the NFL they are so far from what is actually important for the games that they really are a distant sideshow.
Maybe the real sportswriting concern is the amount of media attention devoted to reporting expressly for those following the fantasy leagues. This clearly can be a completely automated sideshow.
While the column makes some fine points, I think some of the appeal of baseball (vs. other sports) is that many of these analytical elements are in the forefront: the game theory of pitching/batting, the physics of a curveball... and the statistics that can drive decisions from player recruitment to managerial decisions. This is why many people love the sport.
Just as in any field, where there is data to be gathered, there are statistics. Many of these features and statistics are meaningless. The art is in interpretation, and here is where sportswriters can add value: not by ignoring the numbers, but by using them to tell a story, just as we do in research.
Just as many CS researchers do data analysis and are unlikely to be replaced by machines, sportswriters can't be replaced by machines. The value they can provide is in meaningful interpretation of the statistics, as opposed to "mere data reporting". The writers shouldn't shy away from statistics; rather, they could figure out how to interpret them and separate signal from noise, just as we do in our research papers.