We next adapted the code of Sorokina et al. [20] to identify those pairs of reviews of different products that have highly similar text. To do so, we needed to decide on a similarity threshold that determines whether we deem a review pair to be “plagiarized”. A reasonable option would have been to consider only reviews with identical text, which would ensure that the reviews in each pair had exactly the same text quality. However, since the reviews in the analyzed pairs are posted for different products, it is natural to expect that some authors modified or added to the text of the original review to make the “plagiarized” copy better fit its new context. For this reason, we employed a threshold of 70% or more nearly-duplicate sentences, where near-duplication was measured via the code of Sorokina et al. [20].¹⁰ This yielded 8,313 “plagiarized” pairs; an example is shown in Figure 4. Manual inspection of a sample revealed that the review pairs captured by our threshold indeed seem to consist of close copies