In this paper, we proposed a new model that combines active learning and self-training to reduce the human labelling effort and increase classification performance in cross-lingual sentiment classification (CLSC). In this model, unlabelled data are first translated from the target language into the source language; the translated examples are then added to the initial source-language training data through active learning and self-training.
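A minimal sketch of one iteration of such a combined loop is given below. The function name `one_iteration`, the confidence threshold, and the use of scikit-learn's `LogisticRegression` are illustrative assumptions for the sketch, not details taken from the paper.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def one_iteration(X_train, y_train, X_pool, query_scores, conf=0.9, k=2):
    """One combined step (hypothetical): self-training pseudo-labels the pool
    examples the classifier is confident about, while active learning flags
    the top-k query-scored examples for manual labelling."""
    clf = LogisticRegression().fit(X_train, y_train)
    proba = clf.predict_proba(X_pool)
    confident = np.flatnonzero(proba.max(axis=1) >= conf)     # auto-labelled
    pseudo = clf.classes_[proba[confident].argmax(axis=1)]    # machine labels
    ask = np.argsort(-query_scores(clf, X_pool))[:k]          # sent to a human
    return confident, pseudo, ask
```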
To avoid selecting outliers from the unlabelled data, and thus to increase the representativeness of the examples chosen for manual labelling, we also incorporated a density measure into the query function of the active learning algorithm.
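The query function itself can weight uncertainty by density. The sketch below assumes posterior entropy as the uncertainty measure and the mean cosine similarity to the rest of the unlabelled pool as the density measure; the paper's exact measures may differ. The toy usage at the end reuses `one_iteration` from the sketch above.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def density_weighted_scores(clf, X_pool):
    """Uncertainty (posterior entropy) weighted by density, so that
    uncertain but isolated outliers are not selected for labelling."""
    proba = clf.predict_proba(X_pool)
    entropy = -np.sum(proba * np.log(proba + 1e-12), axis=1)
    sim = cosine_similarity(X_pool)
    # Mean similarity to the *other* pool examples (self-similarity removed).
    density = (sim.sum(axis=1) - 1.0) / max(sim.shape[0] - 1, 1)
    return entropy * density

# Toy usage: a small labelled seed set plus a translated, unlabelled pool.
seed = ["great film", "awful plot", "lovely acting", "boring mess"]
pool = ["fantastic and moving", "dull and awful", "zygomorphic flange"]
vec = TfidfVectorizer().fit(seed + pool)
confident, pseudo, ask = one_iteration(
    vec.transform(seed), [1, 0, 1, 0], vec.transform(pool),
    query_scores=density_weighted_scores)
```

In this toy example the nonsense document scores low on density despite any model uncertainty about it, so the query step prefers the examples that lie in denser regions of the pool.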
We applied this method to CLSC datasets in three different languages and compared the performance of the proposed model with several baseline methods.
The experimental results showed that the proposed model outperformed the baseline methods on almost all datasets.
These results also showed that incorporating unlabelled data from the target language can effectively improve CLSC performance. They further indicated that employing automatic labelling alongside active learning can speed up the learning process and thereby reduce the manual labelling workload. Finally, the experiments demonstrated that accounting for the density of unlabelled data in the query function of active learning is highly effective in selecting the most representative and informative examples for manual labelling.