Summary scoring. Judgments about the test taker’s status with
respect to a claim are rarely made on the basis of a single task,
however. Therefore, the last of the four processes accumulates scores
across the presented tasks. Summary scoring requires a quantitative
method for combining the task-level scores. Scoring systems vary widely
in complexity. For example, a test that targets a single ability, with a
single score, and an evidence model in which each task connects
directly to this single ability could appropriately be scored by simply
counting the number of right answers and placing the number on a
meaningful scale. On the other hand, a diagnostic assessment that
targets multiple abilities, with many subscores, and an evidence
model in which there are multiple connections among tasks and
different KSAs would more appropriately be scored with a more
sophisticated model, such as a cognitive diagnostic model or a Bayes
net (see, e.g., de la Torre & Minchen, this issue; Mislevy, Almond,
Yan, & Steinberg, 1999).
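
The contrast between the two kinds of scoring systems can be made concrete with a minimal sketch. It is illustrative only and is not drawn from any operational scoring system. The function names, the 200-800 reporting scale, the slip and guess values, and the example Q-matrix are all hypothetical. The second function uses a DINA-style rule as one simple instance of a cognitive diagnostic model, and it assumes a uniform prior over skill profiles and conditionally independent task responses.

```python
from itertools import product


def number_right_scaled(responses, scale_min=200.0, scale_max=800.0):
    """Single-ability case: count right answers, then place the count
    on a reporting scale. The 200-800 scale is an assumed example."""
    raw = sum(responses)
    return scale_min + (scale_max - scale_min) * raw / len(responses)


def dina_skill_posteriors(responses, q_matrix, slip=0.1, guess=0.2):
    """Multiple-ability case: posterior probability of mastering each skill.

    q_matrix[i][k] == 1 means task i requires skill k. Under the DINA-style
    rule assumed here, a test taker who has every required skill answers
    correctly with probability 1 - slip; anyone else succeeds with
    probability guess. Both parameter values are hypothetical.
    """
    n_skills = len(q_matrix[0])
    profiles = list(product([0, 1], repeat=n_skills))
    weights = []
    for profile in profiles:
        likelihood = 1.0
        for x, required in zip(responses, q_matrix):
            has_all = all(a >= q for a, q in zip(profile, required))
            p_correct = (1.0 - slip) if has_all else guess
            likelihood *= p_correct if x == 1 else (1.0 - p_correct)
        # Uniform prior over profiles, so the likelihood is
        # proportional to the posterior weight of this profile.
        weights.append(likelihood)
    total = sum(weights)
    # Marginalize over profiles to get one posterior per skill.
    return [
        sum(w for w, p in zip(weights, profiles) if p[k] == 1) / total
        for k in range(n_skills)
    ]


# Four tasks scored right (1) or wrong (0).
responses = [1, 1, 0, 1]
# Tasks 1 and 2 tap skill A, task 3 taps skill B, task 4 taps both.
q_matrix = [[1, 0], [1, 0], [0, 1], [1, 1]]

print(number_right_scaled(responses))            # one overall score
print(dina_skill_posteriors(responses, q_matrix))  # one subscore per skill
```

The first function reduces the response pattern to a single number, which suits an evidence model in which every task connects to the same ability. The second keeps track of which tasks bear on which skills, so the same four responses yield a separate mastery estimate for each skill.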