Presumably, since we apply learning algorithms that rank the testing examples and use lift as the evaluation criterion, we need not be greatly concerned about the imbalanced class distribution in the training set (the first problem mentioned in Section 4), as long as the learning algorithms produce a suitable ranking of the testing examples, even if all of them are predicted as negative. Section 5.3 shows that a particular class distribution of training examples yields the best lift compared to other distributions. As a bonus, it also dramatically reduces the size of the training set, which addresses the third problem mentioned in Section 4.
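To make this point concrete, the following minimal Python sketch (the `lift_at` helper and the synthetic data are illustrative assumptions, not part of the paper) shows how lift is computed from a ranking alone: even when every score falls below a 0.5 threshold, so that all examples would be classified as negative, a ranking that concentrates positives near the top still achieves a high lift.

```python
import numpy as np

def lift_at(scores, labels, fraction=0.05):
    """Lift of the top `fraction` of examples ranked by score.

    Lift = (positive rate among the top-ranked examples)
           / (positive rate in the whole test set).
    A lift of 1.0 means the ranking is no better than random.
    """
    scores = np.asarray(scores)
    labels = np.asarray(labels)
    n_top = max(1, int(round(fraction * len(scores))))
    top = np.argsort(scores)[::-1][:n_top]   # indices of the highest scores
    top_rate = labels[top].mean()            # positive rate in the top slice
    base_rate = labels.mean()                # positive rate overall
    return top_rate / base_rate

# Hypothetical imbalanced test set (~1% positives). Every score is below
# 0.5, so a thresholded classifier would predict all examples as negative,
# yet the ranking places positives first and the lift is far above 1.
rng = np.random.default_rng(0)
labels = (rng.random(10_000) < 0.01).astype(int)
scores = 0.2 * labels + rng.random(10_000) * 0.1   # positives score higher
print(f"lift@5% = {lift_at(scores, labels, 0.05):.2f}")
```

With a 1% base rate, a ranking that puts all positives in the top 5% attains the maximum possible lift of about 20 at that cutoff, which is what the sketch prints; the classifier's hard predictions never enter the calculation.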