5. Sample selection
We use multiple public data sources to construct our sample. We collect state-sponsored defined benefit pension plan
data from the PENDAT Survey of State and Local Government Employee Retirement Systems, conducted by the Public
Pension Coordinating Council, and from the Boston College Center for Retirement Research. The PENDAT pension plan
data cover the 1990 through 2000 fiscal years, and the Boston College data cover the 2001 through 2009 fiscal years.
We cross-check these data against information from the National Association of State Retirement Administrators. Where
our data sources disagree or information is missing,37 we collect the necessary
information directly from the Comprehensive Annual Financial Reports (CAFRs) and pension plan valuation reports. We
also collect each plan's valuation reports and CAFRs to obtain the early retirement provisions and the
demographics of both inactive and active participants, which we need to implement the duration estimation procedure
outlined in Appendix A. We have 106 plans with the information necessary to calculate the funding gap understatement. After
aggregating these plans to the state level, we have 984 state-year observations.38