To estimate the impact of long-run exposure to pollution, the
location-level panel data are collapsed to a 125-observation,
location-level, cross-sectional dataset, because the Huai River regression
discontinuity design is fundamentally a cross-sectional
design. This data file is obtained by averaging the annual location-specific
measures of mortality rates, life expectancies, pollution
concentrations, weather variables, and other covariates across the
available years. Additionally, we used a geographic information
system to identify how many degrees of latitude each city centroid lies
north of the Huai River line and merged this information into the
final dataset. SI Appendix provides more details on the procedure
used to collapse the data file and the data sources.
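
The collapse described above can be illustrated with a minimal sketch. It is not the procedure documented in SI Appendix: the field names (`location_id`, `year`, `pm10`, `life_expectancy`, `degrees_north`), the toy values, and the convention that negative `degrees_north` denotes a location south of the line are all assumptions made for illustration. Each variable is averaged over the years in which it is observed, and the latitude measure is then merged in by location.

```python
from collections import defaultdict
from statistics import fmean


def collapse_panel(rows, id_key="location_id", year_key="year"):
    """Average each variable over its observed years, per location."""
    values = defaultdict(lambda: defaultdict(list))
    for row in rows:
        for name, value in row.items():
            if name in (id_key, year_key) or value is None:
                continue  # skip identifiers and missing observations
            values[row[id_key]][name].append(value)
    return {
        loc: {name: fmean(obs) for name, obs in variables.items()}
        for loc, variables in values.items()
    }


def merge_distance(cross_section, degrees_north):
    """Attach each location's degrees of latitude north of the river line."""
    return {
        loc: {**means, "degrees_north": degrees_north[loc]}
        for loc, means in cross_section.items()
    }


# Hypothetical location-year panel (values are made up).
panel = [
    {"location_id": 1, "year": 2004, "pm10": 100.0, "life_expectancy": 74.0},
    {"location_id": 1, "year": 2005, "pm10": 120.0, "life_expectancy": 75.0},
    {"location_id": 2, "year": 2004, "pm10": 60.0, "life_expectancy": 77.0},
    {"location_id": 2, "year": 2005, "pm10": None, "life_expectancy": 78.0},
    {"location_id": 2, "year": 2006, "pm10": 80.0, "life_expectancy": 79.0},
]

# Hypothetical GIS output; negative is assumed to mean south of the line.
degrees_north = {1: 2.5, 2: -1.0}

# One record per location: the cross-sectional dataset.
cross_section = merge_distance(collapse_panel(panel), degrees_north)
```

Averaging over available years means that locations with gaps contribute means based on fewer observations, so an unbalanced panel still yields exactly one record per location.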