Note that, as explained above, an increase in sales rank means lower sales, so a negative coefficient on a variable implies that an increase in that variable increases sales. The control variables in our model include the Amazon retail price, the difference between the date of data collection and the release date of the product (Elapsed Date), the average numeric rating of the product (Rating), and the log of the number of reviews posted for that product (Number of Reviews). This choice of controls is consistent with prior work such as Chevalier and Mayzlin [3] and Forman et al. [5]. To account for potential non-linearities and to smooth large values, we take the log of the dependent variable and of some of the control variables, such as Amazon Price, volume of reviews, and days elapsed, consistent with the literature [5], [34]. For these regressions in which we examine
the relationship between review sentiments and product sales, we aggregate data to the weekly level. By aggregating data in this way, we smooth potential day-to-day volatility in sales rank. (As a robustness check, we also ran regressions at the daily and fortnightly levels and found that the qualitative nature of most of our results remains the same.)
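As an illustration only, the weekly aggregation and log transformation described above might be sketched as follows. The column names (`product_id`, `date`, `sales_rank`, `price`, `elapsed_days`, `rating`, `num_reviews`) and the helper `aggregate_weekly` are hypothetical, and the aggregation rules (weekly mean for most variables, weekly maximum for the cumulative review count) are assumptions, since the text does not specify them. The toy data are synthetic and serve only to show the mechanics.

```python
import numpy as np
import pandas as pd


def aggregate_weekly(daily: pd.DataFrame) -> pd.DataFrame:
    """Collapse daily observations to one row per product and week, then add logs.

    Assumption: weekly values are simple means, except the review count,
    which is treated as cumulative and summarized by its weekly maximum.
    """
    weekly = (
        daily
        # Weeks end on Sunday under pandas' default "W" frequency.
        .groupby(["product_id", pd.Grouper(key="date", freq="W")])
        .agg(
            sales_rank=("sales_rank", "mean"),
            price=("price", "mean"),
            elapsed_days=("elapsed_days", "mean"),
            rating=("rating", "mean"),
            num_reviews=("num_reviews", "max"),
        )
        .reset_index()
    )
    # Log-transform the dependent variable and selected controls.
    # Assumes all of these values are strictly positive.
    for col in ["sales_rank", "price", "elapsed_days", "num_reviews"]:
        weekly["log_" + col] = np.log(weekly[col])
    return weekly


# Synthetic toy data: one product observed daily for two full weeks.
daily = pd.DataFrame({
    "product_id": "A",
    "date": pd.date_range("2020-01-06", periods=14, freq="D"),
    "sales_rank": np.arange(100, 114),
    "price": 20.0,
    "elapsed_days": np.arange(30, 44),
    "rating": 4.0,
    "num_reviews": np.arange(5, 19),
})

weekly = aggregate_weekly(daily)
print(weekly[["product_id", "date", "log_sales_rank", "log_num_reviews"]])
```

Changing `freq="W"` to `freq="D"` or `freq="2W"` in this sketch would give daily and fortnightly panels of the kind used in the robustness check.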