Better Options for Science and Privacy with Respect to MOOC Data.
As illustrated in the previous section, the differences between the identified data set and the original data range from small changes in the proportion of various demographic categories to large decreases in activity variables and certification rates.
It is quite possible that analyses not yet thought of would yield even more dramatic differences between the two data sets.
Even if a identification method is found that maintains many of the observed research results from the original data set,
there can be no guarantee that other analyses will not have been corrupted by identification.
At this point it may be possible to take for granted that any standard for identification will increase overtime.
Information is becoming more accessible, and researchers are increasingly sophisticated and creative about possible re-identification strategies.