We trained an L2-regularized Support Vector Machine (SVM)
model on the training set with the LIBLINEAR package [9].
To account for the extreme class imbalance between positive
and negative examples in the training set, we used
LIBLINEAR’s asymmetric cost parameter option, which sets a
separate misclassification cost for each class. We used
5-fold cross-validation to find the optimal
class-specific cost parameters on the training set. The folds
were constructed at the game level, so that possessions from
a single game were never split across folds. We chose the cost
parameters that maximized the average Area Under the
Receiver Operating Characteristic Curve (AUROC) across the
five held-out folds, and used those parameters to train the
final model. The final model was then evaluated on the holdout set.
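
The selection procedure described above can be sketched as follows. This is a minimal illustration in Python, not the code used for the experiments: the function names (`game_level_folds`, `auroc`, `select_costs`), the `fit_and_score` callback, and the candidate cost grid are all hypothetical. The callback stands in for training a LIBLINEAR model with the given per-class costs (for example through its `-c` and `-wi` options) and returning decision values for the held-out possessions.

```python
import random
from itertools import product
from statistics import mean


def game_level_folds(game_ids, n_folds=5, seed=0):
    """Give every possession a fold index; possessions of one game share a fold."""
    unique_games = sorted(set(game_ids))
    random.Random(seed).shuffle(unique_games)
    fold_of_game = {g: i % n_folds for i, g in enumerate(unique_games)}
    return [fold_of_game[g] for g in game_ids]


def auroc(labels, scores):
    """AUROC from the rank-sum (Mann-Whitney U) statistic; tied scores share a rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied block
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative example")
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def select_costs(X, y, game_ids, cost_grid, fit_and_score, n_folds=5):
    """Return the (c_pos, c_neg) pair with the highest mean held-out AUROC.

    fit_and_score(X_train, y_train, X_held, c_pos, c_neg) is a hypothetical
    callback that trains a model with the given class costs and returns one
    score per held-out example.
    """
    folds = game_level_folds(game_ids, n_folds)
    best_pair, best_auc = None, float("-inf")
    for c_pos, c_neg in product(cost_grid, repeat=2):
        fold_aucs = []
        for f in range(n_folds):
            train = [i for i, fold in enumerate(folds) if fold != f]
            held = [i for i, fold in enumerate(folds) if fold == f]
            scores = fit_and_score(
                [X[i] for i in train], [y[i] for i in train],
                [X[i] for i in held], c_pos, c_neg,
            )
            fold_aucs.append(auroc([y[i] for i in held], scores))
        avg = mean(fold_aucs)
        if avg > best_auc:
            best_pair, best_auc = (c_pos, c_neg), avg
    return best_pair, best_auc
```

Assigning folds by game rather than by possession keeps correlated possessions from the same game out of both the fitting and the held-out side of any split. AUROC is a natural selection criterion under heavy class imbalance because it depends only on how positives are ranked relative to negatives, not on a decision threshold. Note that `auroc` raises an error if a held-out fold contains only one class, which a real pipeline would need to handle.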