Osugi et al. [23] propose an active learning algorithm that
balances the exploration and exploitation while selecting a new
instance for labeling by the expert at each step. The algorithm
randomly chooses between exploration and exploitation at
each round and receives feedback on the effectiveness of the
exploration step, based on the performance of the classifier
trained on the explored instance