Qualls-Payne (1992) used a set of criteria to evaluate six methods for computing conditional
standard errors of measurement. Three were based on the compound binomial model, two on classical test
theory (including Thorndike’s method, as stated in the TVAP manual), and one on the three-parameter item response
theory model. Feldt’s (1984) method, one of the compound binomial methods, was judged the most
recommended. Thorndike’s method was generally found to be the least preferred; its one redeeming feature
may be that it is easier to calculate than the other five approaches (my suggestion, not Qualls-Payne’s).