Roy and McCallum [8] describe a method that directly maximizes the
expected reduction in error rate, estimating the future error through
a loss function. The loss function guides the learner toward the
instances that maximize its confidence on the unlabeled data. Rather
than estimating the expected error over the full distribution, the
algorithm estimates it over a sample drawn from the pool. The authors
base their class probability estimates and classification on naive
Bayes, although SVMs and other models with a more complex parameter
space are also recommended.
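As an illustration only (the notation below is generic and not taken from [8]), the 0/1-loss variant of this expected error reduction criterion can be sketched as
\[
x^{*} = \operatorname*{arg\,min}_{x \in \mathcal{U}} \;
\sum_{y \in \mathcal{Y}} P_{\hat{\theta}}(y \mid x)\,
\frac{1}{|\mathcal{U}|} \sum_{u \in \mathcal{U}}
\Bigl( 1 - \max_{y'} P_{\hat{\theta}^{+(x,y)}}(y' \mid u) \Bigr),
\]
where $\hat{\theta}$ is the current model, $\hat{\theta}^{+(x,y)}$ is the model retrained after adding the candidate $(x, y)$ to the labeled set, and $\mathcal{U}$ is the unlabeled pool over which the expected error is approximated.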
Osugi et al. [23] propose an active learning algorithm that balances
exploration and exploitation when selecting the next instance for the
expert to label. At each round the algorithm chooses randomly between
exploration and exploitation and receives feedback on how effective
the exploration step was, based on the performance of the classifier
trained on the explored instance.
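A minimal sketch of such a query step is shown below. The function names (explore_fn, exploit_fn, feedback_fn) and the probability update rule are illustrative assumptions; [23] specifies its own selection criteria and feedback mechanism.

```python
import random

def explore_exploit_query(p_explore, explore_fn, exploit_fn,
                          learner, pool, labeled, feedback_fn, lam=0.9):
    """Sketch of one exploration/exploitation round (illustrative, not [23]).

    p_explore  -- current probability of choosing the exploration step
    explore_fn -- picks an instance far from the labeled data (exploration)
    exploit_fn -- picks an instance the current learner is least certain about
    feedback_fn-- returns a score in [0, 1] measuring how much the explored
                  instance improved or changed the classifier
    """
    if random.random() < p_explore:
        x = explore_fn(pool, labeled)          # explore: probe new regions
        reward = feedback_fn(learner, x)       # did exploration pay off?
        # increase the exploration probability when it helped, decay it otherwise
        p_explore = max(0.05, min(1.0, lam * p_explore + (1 - lam) * reward))
    else:
        x = exploit_fn(pool, learner)          # exploit: refine the current boundary
    return x, p_explore
```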
Jain and Kapoor [25] study active learning for large multi-class
problems. Most active learning algorithms are inherently designed for
binary classification and do not scale to a large number of classes.
The authors introduce a probabilistic variant of the k-nearest
neighbor classifier that can be used seamlessly for active learning
in multi-class scenarios.
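The sketch below conveys the general idea under stated assumptions: a kernel-weighted k-NN vote yields class probabilities, and the pool instance with the smallest gap between its top two class probabilities is queried. The weighting scheme and the margin-style selection rule here are simplifications, not the exact model of [25].

```python
import numpy as np

def pknn_probs(X_lab, y_lab, x, k=10, gamma=1.0, n_classes=None):
    """Probabilistic k-NN sketch: class probabilities from kernel-weighted
    votes of the k nearest labeled neighbours (illustrative only)."""
    n_classes = n_classes or int(y_lab.max()) + 1
    d = np.linalg.norm(X_lab - x, axis=1)       # distances to all labeled points
    nn = np.argsort(d)[:k]                      # indices of the k nearest neighbours
    w = np.exp(-gamma * d[nn] ** 2)             # closer neighbours vote with more weight
    p = np.zeros(n_classes)
    for idx, weight in zip(nn, w):
        p[int(y_lab[idx])] += weight
    return p / p.sum()

def select_query(X_lab, y_lab, X_pool, k=10):
    """Pick the pool instance whose top two class probabilities are closest,
    i.e. the most ambiguous multi-class example (margin-style criterion)."""
    margins = []
    for x in X_pool:
        p = np.sort(pknn_probs(X_lab, y_lab, x, k=k))[::-1]
        margins.append(p[0] - p[1])
    return int(np.argmin(margins))
```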