Performance Level Misclassification: Consistency and Accuracy
An extensive body of research on measurement error in criterion-referenced
testing goes back nearly 40 years (see, for example, Hambleton and Novick (1973),
Livingston and Lewis (1995), and the references contained therein).
Gradually, two somewhat overlapping research approaches toward measurement
error and classification emerged: consistency and accuracy. Classification
consistency quantifies the extent to which two observed categorizations, based
upon two independent examinations, coincide, whereas classification accuracy
describes the extent to which an examinee’s observed categorization matches his
or her true status. In a review of state assessment documentation and related research articles,
we find classification consistency to be the predominant method used to
quantify the impact of measurement error on performance level categorization. This is
likely due to its affinity with classical test reliability. Deviating from this norm,
in what follows we establish the utility of concepts associated with classification
accuracy