raters may differ in how stringently they score; raters may
tend to use some score categories more often than others; or raters’ scoring behavior may drift over
time due to fatigue or other factors (Fitzpatrick, Ercikan, & Yen, 1998; Hoskens & Wilson, 2001).
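
As a rough illustration of these three effects, the following sketch computes simple descriptive diagnostics from a set of ratings. It is not the method of the cited authors. The function name `rater_diagnostics`, the `(rater, time_index, score)` data layout, and the example scores are all hypothetical, and the raw-mean comparisons assume that every rater scores responses of comparable quality, which a formal rater model would not need to assume.

```python
from collections import Counter, defaultdict
from statistics import mean


def rater_diagnostics(ratings):
    """Summarize each rater's stringency, category use, and drift.

    `ratings` is a list of (rater, time_index, score) tuples
    (a hypothetical layout chosen for this sketch).
    """
    by_rater = defaultdict(list)
    for rater, time_index, score in ratings:
        by_rater[rater].append((time_index, score))

    overall_mean = mean(score for _, _, score in ratings)

    diagnostics = {}
    for rater, rows in by_rater.items():
        rows.sort()  # order each rater's scores by time
        scores = [score for _, score in rows]
        half = len(scores) // 2
        diagnostics[rater] = {
            # Negative values: this rater scores lower than the pooled average.
            "severity": mean(scores) - overall_mean,
            # How often each score category was used.
            "category_use": dict(Counter(scores)),
            # Later-half mean minus earlier-half mean (0.0 if only one score).
            "drift": mean(scores[half:]) - mean(scores[:half]) if half else 0.0,
        }
    return diagnostics


# Made-up example data: rater A becomes harsher over time, rater B
# uses only one score category.
ratings = [
    ("A", 1, 2), ("A", 2, 2), ("A", 3, 1), ("A", 4, 1),
    ("B", 1, 3), ("B", 2, 3), ("B", 3, 3), ("B", 4, 3),
]
diagnostics = rater_diagnostics(ratings)
```

In this made-up example, rater A's mean score falls below the pooled mean and declines between the first and second halves of the scoring session, while rater B assigns a single category throughout. Descriptive summaries of this kind can flag patterns worth investigating, but they cannot by themselves separate rater effects from differences in the responses each rater happened to score.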