Tennis has a stats problem — even at the U.S. Open.
The sport often botches even its basic match data, with missing or incorrect numbers for past matches and no archived stats online for individual matches on the women’s tour. But at the U.S. Open and the other three Grand Slam tournaments, the events’ partner, IBM, has been collecting millions of data points on matches, including stats such as winners and unforced errors that aren’t logged at many tour stops. Since last year, U.S. Open fans have been able to track online the number of times Novak Djokovic smashes overhead winners (three times against Mikhail Youzhny on Thursday) or the number of Roger Federer forehand unforced errors (16 against Tommy Robredo on Monday).
This rich data set enables IBM, with its computing power and analytical tools, to mine the numbers for information about players’ styles. Since 2011, IBM has been calculating what it calls the “Keys to the Match”: targets each player must reach to win.
There’s just one problem: Often, IBM’s keys aren’t all that key to the match.