Often, however, it is possible to evaluate the product of behavior
rather than the behavior itself, which is much more efficient. For
example, it is not necessary to watch a test taker paint a picture to
evaluate the finished painting. Therefore, evaluation of behavior is
generally limited to situations in which important information
would be lost by focusing on the product alone.

Much of what is important to test, however, is not directly
observable at all. For example, whether a test taker understands
a reading passage is rarely discernible from watching the person
read. There are usually no outward manifestations that a student in
a statistics class understands the difference between the variance
and the standard deviation. The job of the test developer is to decide
what observable data would allow inferences about the unobservable
KSAs. (In a later layer, the job of the test developer is to devise tasks
that will elicit the required observable behaviors.)

Some test developers have found it useful to imagine the ideal
setting in which to gather data to support the claims to be made.
Once the ideal observation is established, the test developers
determine which parts of that observation are impossible within the
real-world constraints of the testing program. What has to be given
up? What substitutes can be made? How closely can the ideal
observation be approximated in the test?