These summary files, which are updated in real time, contain one
compressed text file for each hour from December 9, 2007 to the
present. Each file lists the number of requests for every article in
every language; articles with no requests are omitted. (This request
count differs from the true number of human views due to
automated requests, proxies, pre-fetching, people not reading the
article they loaded, and other factors. However, this commonly
used proxy for human views is the best available.) We analyzed
data from March 7, 2010 through February 1, 2014 inclusive, a
total of 1,428 days. This dataset contains roughly 34,000 data files
totaling 2.7 TB. Of the hours in this period, 266 (0.8%) are missing,
with the largest gap being 85 hours. These missing data were treated as
zero; because they were few, this has minimal effect on our
analyses.
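
The figures above can be checked with a short calculation. The following is a minimal sketch, not the code used in the analysis: it recomputes the day count, the expected number of hourly files, and the missing fraction from the dates and counts stated in the text, and illustrates zero-filling on a toy series. The function name `fill_missing_with_zero` and the toy data are hypothetical.

```python
from datetime import date, datetime, timedelta

# Study period as stated in the text (both endpoints inclusive).
start = date(2010, 3, 7)
end = date(2014, 2, 1)

days = (end - start).days + 1      # +1 because the range is inclusive
expected_hours = days * 24         # one file per hour if nothing were missing
missing_hours = 266                # figure reported in the text
missing_pct = 100 * missing_hours / expected_hours

print(days, expected_hours, round(missing_pct, 1))


def fill_missing_with_zero(counts, first_hour, n_hours):
    """Return a list of n_hours counts starting at first_hour.

    `counts` maps an hour (datetime) to a request count; hours with no
    entry are treated as zero, mirroring the handling described above.
    """
    hours = (first_hour + timedelta(hours=i) for i in range(n_hours))
    return [counts.get(h, 0) for h in hours]


# Toy example (hypothetical data): the hour 02:00 has no file.
t0 = datetime(2010, 3, 7, 0)
toy = {t0: 12, t0 + timedelta(hours=1): 7, t0 + timedelta(hours=3): 9}
series = fill_missing_with_zero(toy, t0, 4)
```

Zero-filling keeps the time series evenly spaced, at the cost of understating traffic during gaps; with under 1% of hours affected, that bias is small.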