Efficient Sentiment Correlation for Large-scale Demographics
Analyzing the sentiments of demographic groups is becoming important for the Social Web, where millions of users provide opinions on a wide variety of content. While several approaches exist for mining sentiments from product reviews or micro-blogs, little attention has been devoted to aggregating and comparing extracted sentiments over time for different demographic groups, such as ‘Students in Italy’ or ‘Teenagers in Europe’. This problem demands efficient and scalable methods for sentiment aggregation and correlation that account for the evolution of sentiment values, sentiment bias, and other factors associated with the special characteristics of web data. We propose a scalable approach for sentiment indexing and aggregation that works on multiple time granularities and uses incrementally updatable data structures for online operation. Furthermore, we describe efficient methods for computing meaningful sentiment correlations, which exploit pruning based on demographics and use top-k correlation compression techniques. We present an extensive experimental evaluation with both synthetic and real datasets, demonstrating the effectiveness of our pruning techniques and the efficiency of our solution.