We then performed a book-filtering step to deal with “crossposting” of reviews across versions. When Amazon carries different versions of the same item (for example, different editions of the same book, including hardcover and softcover editions and audiobooks), the reviews written for all versions are merged and displayed together on each version’s product page, and are likewise returned by the API in response to queries for any individual version. As a result, multiple copies of the same review exist for “mechanical”, as opposed to user-driven, reasons. To avoid including mechanically duplicated reviews, we retained only one version from each book’s set of alternate versions: the one with the most complete metadata.
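
The filtering step can be illustrated with a minimal sketch. It is not the code used in the study, and it rests on several assumptions that the text does not specify: that alternate versions of a book have already been grouped under a shared identifier (here the hypothetical key `work_id`), that each version carries its own identifier (hypothetical key `version_id`), and that "most complete metadata" is measured by counting non-empty values among a chosen list of metadata fields. Ties are broken in favour of the first version encountered, which is also an assumption.

```python
from collections import defaultdict


def metadata_completeness(record, fields):
    """Count how many of the given metadata fields are present and non-empty.

    Assumption: completeness is a simple count of filled-in fields.
    """
    empty_values = (None, "", [], {})
    return sum(1 for f in fields if record.get(f) not in empty_values)


def select_canonical_versions(versions, fields):
    """Keep one version per book: the one with the most complete metadata.

    `versions` is a list of dicts with the hypothetical keys "work_id"
    (shared by all alternate versions of a book) and "version_id".
    Returns a dict mapping each work_id to its retained version record.
    """
    groups = defaultdict(list)
    for version in versions:
        groups[version["work_id"]].append(version)

    # max() returns the first maximal element, so ties go to the
    # version that appears earliest in the input.
    return {
        work_id: max(group, key=lambda v: metadata_completeness(v, fields))
        for work_id, group in groups.items()
    }


def filter_reviews(reviews, canonical):
    """Drop reviews attached to any version that was not retained."""
    kept_ids = {v["version_id"] for v in canonical.values()}
    return [r for r in reviews if r["version_id"] in kept_ids]
```

Under these assumptions, a review returned once for each of a book's hardcover, softcover, and audiobook versions survives only in the copy attached to the retained version, so each mechanically duplicated review is counted once.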