In this paper, we propose a new model based on the combination of active learning and semi-supervised self-training in
order to incorporate unlabelled data from the target language into the learning process. Because active learning tries to select
the most informative examples (in most cases, the most uncertain ones), these examples may be outliers, especially in
the field of sentiment classification of users' reviews. To avoid selecting outliers in the active learning step, the proposed
method takes the density of the candidate examples into account, so as to choose those informative examples that have the maximum
average similarity to the unlabelled data (i.e., the most representative ones). The proposed method was then applied to book
review datasets in three different languages. The experimental results showed that our method effectively increased
performance while reducing the human labelling effort for cross-lingual sentiment classification (CLSC), in comparison with
several existing and baseline methods.
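As a rough illustration of the density-weighted selection idea described above, the sketch below combines an uncertainty score with the average similarity of each candidate to the rest of the unlabelled pool, so that uncertain but isolated examples (outliers) are penalised. The entropy-based uncertainty measure, the cosine similarity, and the function name are our own assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def select_informative(probs, features, n_select=5):
    """Pick unlabelled examples that are both uncertain and dense.

    probs    : (n, n_classes) predicted class probabilities
    features : (n, d) feature vectors of the unlabelled pool
    Returns the indices of the n_select highest-scoring examples.
    """
    # Uncertainty: entropy of the predicted class distribution.
    entropy = -np.sum(probs * np.log(probs + 1e-12), axis=1)

    # Density: average cosine similarity of each example to the rest
    # of the pool; outliers get a low average similarity.
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)
    sim = unit @ unit.T
    density = (sim.sum(axis=1) - 1.0) / (len(features) - 1)

    # Combined score: informative AND representative.
    score = entropy * density
    return np.argsort(score)[-n_select:][::-1]
```

With this weighting, an example whose prediction is highly uncertain but which lies far from the bulk of the unlabelled data receives a low combined score and is skipped, matching the outlier-avoidance goal stated above.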