ABSTRACT
User votes are important signals in community question-answering
(CQA) systems. Many features of typical CQA systems, e.g., the
best answer to a question or the status of a user, depend on ratings
or votes cast by the community. On a popular CQA site, Yahoo!
Answers, users vote for the best answers to their questions
and can also thumb up or down each individual answer. Prior work
has shown that these votes provide useful predictors for content
quality and user expertise, where each vote is usually assumed to
carry the same weight as others. In this paper, we analyze a set of
possible factors that indicate bias in user voting behavior. The
factors encompass different gaming behaviors, as well as other eccentricities,
e.g., votes to show appreciation of answerers. These
observations suggest that votes need to be calibrated before being
used to identify good answers or experts. To address this problem,
we propose a general machine learning framework to calibrate
such votes. Through extensive experiments based on an editorially
judged CQA dataset, we show that our supervised learning method
of content-agnostic vote calibration can significantly improve the
performance of answer ranking and expert ranking.