We consider the problem of classifying documents
not by topic, but by overall sentiment,
e.g., determining whether a review
is positive or negative. Using movie reviews
as data, we find that standard machine
learning techniques definitively outperform
human-produced baselines. However,
the three machine learning methods
we employed (Naive Bayes, maximum entropy
classification, and support vector machines)
do not perform as well on sentiment
classification as on traditional topic-based
categorization. We conclude by examining
factors that make the sentiment classification
problem more challenging.