• We formalize a novel large-scale sentiment analytics problem, focusing on the efficient aggregation of sentiment and computation of high and significant sentiment correlations between maximal demographic groups within dynamically determined time intervals.
• Specifically for demographic sentiments, we describe efficient correlation pruning methods based on the demographics lattice. Furthermore, we introduce two novel methods for correlation compression, which allow for the efficient implementation of our algorithms.
• We conduct an extensive set of experiments to validate our problem formulation and to evaluate the performance of our solution. We use synthetic datasets, which contain large-scale artificial correlations with added noise, and the real MovieLens dataset, which comes with rich user demographics. The experiments demonstrate that correlated demographic groups can be identified very efficiently with the help of our specialized indexing storage and effective pruning. Finally, our evaluation provides interesting insights into correlations among real demographic groups in MovieLens.
This paper is organized as follows. Section 2 defines our framework and the problem we tackle, while Section 3 develops the properties of correlation with respect to our problem. Section 4 describes our algorithms for correlated groups and correlation compression. Our user study and performance experiments are reported in Section 5. Section 6 provides a summary of the related work. We conclude in Section 7.