The factors affecting human helpfulness evaluations are not well
understood. A small body of work has addressed the automatic determination of helpfulness, treating it as a classification or regression problem in which Amazon helpfulness votes provide the labeled
data [10, 15, 17]. Some of this research indicates that the helpfulness votes a review receives are not necessarily strongly correlated with
certain measures of review quality; for example, Liu et al. found
that when they provided independent human annotators with Amazon review text and a precise specification of helpfulness in terms
of the thoroughness of the review, the annotators’ evaluations differed significantly from the helpfulness votes observed on Amazon