Why Use IRT?
IRT has gained popularity because of its advantages over the simpler measurement framework of Classical Test Theory (CTT). A primary advantage of IRT is that it offers a rigorous yet flexible framework for placing assessments composed of different items on a common scale. This is a substantial benefit when the scores from multiple forms of an assessment must be linked onto a single reporting scale so that the scores have the same meaning across forms (e.g., to ensure comparability of scores on different forms of a reading assessment administered in successive years). A related application of this advantage is computer adaptive testing, in which each examinee is administered a set of items tailored to that examinee’s level of ability, so that different examinees receive different sets of items while the final test scores remain comparable. Another advantage of IRT is its capability to specify reliability specific to each examinee. Whereas reliability in CTT is summarized by a single index that is applied equally to all examinees regardless of ability level, IRT has the flexibility to estimate reliability (measurement precision) uniquely for each examinee. This information can be very useful when different individuals are administered different items (as in computer adaptive testing), or when building test forms with cut-scores or proficiency standards, because the forms can be built to maximize precision (minimize error) around those points on the scale. Despite its benefits, the appropriate application of IRT requires stronger (harder to meet) statistical assumptions than CTT and typically requires larger sample sizes than those needed for CTT.