The data set is a de-identified version of the data set used to publish HarvardX and MITx: The First Year of Open Online Courses, a report revealing findings about student demographics, course-taking patterns, certification rates, and other measures of student behavior.