Purifying the Matching Variable and Sample Size
Keep in mind that you should “purify the matching criterion” when conducting
the DIF analysis. That is, items identified as displaying DIF are omitted, and the
scale or total score is recalculated. This recalculated total score is then used as the
matching criterion for a second logistic regression DIF analysis, in which all items
are again assessed. This matching purification strategy has been shown to work empirically.
Holland and Thayer (1988) note that when the purification process is used, the
item under examination should be included in the matching criterion even if it was
identified as displaying DIF on initial screening and excluded from the criterion for all
other items. That is, the item under study should always be included in its own matching
criterion score. According to Holland and Thayer, this reduces the number of Type I
errors.
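The construction of the purified matching score can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the data layout (one list of 0/1 item scores per test taker), and the example data are assumptions made here. It covers only the matching scores for the second pass; the logistic regression itself would be run separately for each item using its own score.

```python
def purified_matching_scores(responses, flagged):
    """Return, for each item, the matching score used when that item is studied.

    responses: list of per-person lists of 0/1 item scores (hypothetical layout).
    flagged:   set of item indices flagged as DIF in the initial screening.
    """
    n_items = len(responses[0])
    scores = {}
    for studied in range(n_items):
        # Drop flagged items, but always keep the item under study
        # in its own matching criterion (Holland and Thayer, 1988).
        keep = [k for k in range(n_items) if k not in flagged or k == studied]
        scores[studied] = [sum(person[k] for k in keep) for person in responses]
    return scores


# Hypothetical data: 3 test takers, 4 binary items; item 1 was flagged.
responses = [
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
]
flagged = {1}
scores = purified_matching_scores(responses, flagged)
```

Here the matching score for item 1 still includes item 1 itself, while the scores for items 0, 2, and 3 exclude it.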
With regard to sample size, it has been shown in the literature that
for binary items at least 200 people per group is adequate. However, the more people per
group the better. Finally, it will aid the interpretation of the results if the sample does not
have missing data. That is, one should conduct the analysis only on test takers for
whom complete data are available on the scale at hand (i.e., no missing data on the items
comprising the scale or on the grouping variable for your analysis).
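The two data checks above (complete cases only, and at least 200 people per group for binary items) could be carried out along the following lines. This is a sketch under assumed conventions: records are dictionaries, missing values are coded as None, and the field names are hypothetical.

```python
from collections import Counter

MIN_PER_GROUP = 200  # guideline stated in the text for binary items


def complete_cases(records, item_keys, group_key):
    """Keep only records with no missing (None) item or group values."""
    needed = list(item_keys) + [group_key]
    return [r for r in records if all(r.get(k) is not None for k in needed)]


def group_sizes(records, group_key):
    """Count the records in each group."""
    return Counter(r[group_key] for r in records)


# Hypothetical records; None marks a missing value.
records = [
    {"item1": 1, "item2": 0, "group": "A"},
    {"item1": None, "item2": 1, "group": "A"},  # missing item response
    {"item1": 0, "item2": 1, "group": None},    # missing grouping variable
    {"item1": 1, "item2": 1, "group": "B"},
]
kept = complete_cases(records, ["item1", "item2"], "group")
sizes = group_sizes(kept, "group")
# Groups that fall short of the guideline after listwise deletion
too_small = [g for g, n in sizes.items() if n < MIN_PER_GROUP]
```

Counting group sizes after dropping incomplete cases matters because the 200-per-group guideline applies to the sample actually analyzed.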
Various R-squared Measures for DIF
Table 3 lists the various R-squared measures available to measure the magnitude
of DIF. In short, the R-squared measures are used in the hierarchical sequential modeling
process of adding more terms to the regression equation. The order of entering variables
is determined by the definition of DIF.
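As a hedged illustration of this sequential process, the sketch below takes the R-squared value obtained at each step and reports the increment contributed by each added term. It assumes a three-step ordering (matching score, then the grouping variable, then the group-by-score interaction); the function name and the numbers are hypothetical and are not results from this document.

```python
def r2_increments(r2_by_step):
    """Return the change in R-squared between each pair of successive steps."""
    return [later - earlier for earlier, later in zip(r2_by_step, r2_by_step[1:])]


# Hypothetical R-squared values after steps 1, 2, and 3 (assumed ordering:
# matching score; + grouping variable; + group-by-score interaction).
r2_by_step = [0.30, 0.32, 0.33]
increments = r2_increments(r2_by_step)

# Overall change from the matching-score-only model to the full model
total_dif_effect = r2_by_step[-1] - r2_by_step[0]
```

Whichever R-squared measure is chosen, the same measure should be used at every step so that the differences are comparable.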
Table 3.
R-squared Measures for DIF