To test hypotheses 1a to 1c, we adopt a model similar to that used in [3] and [5], while incorporating measures for the quality and the content of the reviews. Chevalier and Mayzlin [3] and Forman, Ghose and Wiesenfeld [5] model a book’s sales rank as a function of a book fixed effect and other factors that may affect the sales of the book. The dependent variable is ln(SalesRank)_kt, the log of the sales rank of product k at time t, which is a linear transformation of the log of product demand, as discussed earlier. The unit of observation in our analysis is a product-date: since we know only the date on which a review is posted (and not its time) and we observe changes in sales rank on a daily basis, we need to “collapse” multiple reviews posted on the same date into a single observation. Since we have a linear model, we use an additive approach to combine reviews published for the same product on the same date. To study the impact of reviews and the quality of reviews on sales, we estimate the following model: