Roy and McCallum [8] describe a method that directly maximizes the
expected reduction in error rate, estimating the future error rate
with a loss function. The loss function guides the learner to select
those instances that, once labeled, maximize its confidence on the
remaining unlabeled data. Rather than estimating the expected error
over the full distribution, the algorithm estimates it over a sample
drawn from the pool. The authors base their class probability
estimates and classification on naive Bayes; however, SVMs or other
models with more complex parameter spaces are also recommended.
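To make the selection loop concrete, the following is a minimal sketch of
this expected-error-reduction strategy. It uses scikit-learn's GaussianNB
as a stand-in for the naive Bayes learner; the function name, the candidate
sample size, and the use of a 0/1-loss proxy (one minus the retrained
model's maximum posterior, averaged over the pool) are illustrative
assumptions, not details fixed by [8].

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

def expected_error_reduction_query(X_lab, y_lab, X_pool, rng,
                                   n_candidates=20):
    """Return the pool index whose hypothetical labeling minimizes the
    estimated future error, computed over a sample of the pool."""
    clf = GaussianNB().fit(X_lab, y_lab)
    classes = clf.classes_
    # Estimate expected error over a random sample of the pool rather
    # than over the full distribution.
    cand = rng.choice(len(X_pool),
                      size=min(n_candidates, len(X_pool)),
                      replace=False)
    best_i, best_risk = -1, np.inf
    for i in cand:
        x = X_pool[i:i + 1]
        p_y = clf.predict_proba(x)[0]  # current posterior P(y | x)
        risk = 0.0
        for k, y in enumerate(classes):
            # Retrain as if the oracle had labeled x with class y ...
            clf_y = GaussianNB().fit(np.vstack([X_lab, x]),
                                     np.append(y_lab, y))
            # ... and score a 0/1-loss proxy over the pool, using the
            # retrained model's own posteriors in place of true labels.
            pool_conf = clf_y.predict_proba(X_pool).max(axis=1)
            risk += p_y[k] * np.mean(1.0 - pool_conf)
        if risk < best_risk:
            best_i, best_risk = i, risk
    return best_i

# Toy usage on synthetic data (hypothetical setup).
rng = np.random.default_rng(0)
X_lab = rng.normal(size=(10, 2))
y_lab = (X_lab[:, 0] > 0).astype(int)
X_pool = rng.normal(size=(200, 2))
query_idx = expected_error_reduction_query(X_lab, y_lab, X_pool, rng)
```

Note that the per-query cost is dominated by retraining the model once
per candidate per class, which is why an inexpensive learner such as
naive Bayes is a natural fit for this sketch.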