A wide range of studies have demonstrated that certain textual features carry information that is valuable for sentiment classification. Many types of sentence features have been proposed and tested in the literature: n-grams, part-of-speech features, location-based features, lexicon-based features, syntactic features, and structural or discourse features, to name a few.
However, there is a lack of substantive empirical evidence of their relative effectiveness with different types of texts. Some features have only been tested against product or movie reviews. Other features have only been tried with news datasets.
Furthermore, some experiments were performed in a supervised setting while others were performed in an unsupervised setting. The specific evaluation tasks are also diverse, e.g., subjectivity classification, polarity classification, opinion summarisation, or polarity ranking. This heterogeneous array of experimental results makes it difficult to understand which features are effective, and when and how they are best used. For instance, some linguistic cues might be beneficial in news articles (due to the nature of the language used by journalists) and harmful in product reviews (where customers tend to use a more direct style).