I selected games for the experiment from the 2011 and 2012
Major League Baseball seasons using several criteria. I fit an
existing statistical model [24], which assesses the probability
of a home victory, to games from these seasons. This model estimates
the probability that a home team will win using the relative
strength of each team in three categories: winning percentage,
the earned run average (ERA) of the starting pitcher, and
batting average. The model also includes an adjustment for
home field advantage. This model proved useful for this purpose
because it estimates the approximate difficulty of predicting
a given game using only a small number of statistical
categories. Because users are unlikely, or unable, to consider a
large amount of data without the aid of a sophisticated tool,
the model estimates probabilities in a way similar to how
we might expect users to form predictions.
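
To make the structure of such a model concrete, the following is a minimal sketch, not the model from [24] itself. It assumes a logistic form over the home-minus-away difference in each of the three categories, plus a constant home field term. The coefficient values, the function names, and the mapping from probability to difficulty are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical coefficients for illustration only; the fitted values
# from [24] are not given in the text. ERA has a negative weight
# because a lower ERA indicates a stronger starting pitcher.
COEF = {"win_pct": 4.0, "era": -0.3, "batting_avg": 10.0}
HOME_ADVANTAGE = 0.15  # hypothetical constant favoring the home team


def home_win_probability(home, away):
    """Assumed logistic estimate of the probability of a home victory."""
    score = HOME_ADVANTAGE
    for stat, weight in COEF.items():
        # Relative strength: home value minus away value in each category.
        score += weight * (home[stat] - away[stat])
    return 1.0 / (1.0 + math.exp(-score))


def difficulty(p_home):
    """Assumed difficulty score in [0, 1]; 1 means a coin-flip game."""
    return 1.0 - 2.0 * abs(p_home - 0.5)


home = {"win_pct": 0.550, "era": 3.40, "batting_avg": 0.265}
away = {"win_pct": 0.500, "era": 4.10, "batting_avg": 0.255}
p = home_win_probability(home, away)
print(f"P(home win) = {p:.3f}, difficulty = {difficulty(p):.3f}")
```

Under these assumptions, a game whose estimated probability is close to 0.5 counts as hard to predict, and one close to 0 or 1 counts as easy, which is one way the model's output could be used to grade games by difficulty.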