The data we analyzed came from two types of studies: (i) randomized
trials, where each student was randomly placed in a treatment; and (ii)
quasirandom designs, in which students self-sorted into classes, blind to the
treatment at the time of registering for the class. It is important to note that
in the quasirandom experiments, students were assigned to treatment as
a group, meaning that they are not statistically independent samples. This
leads to a statistical problem: the number of independent data points in each
treatment is not equal to the number of students (40). Nonindependence in
quasirandom designs can cause variance calculations to
underestimate the actual variance, leading to overestimates of significance
levels and of the weight each study is assigned (41). To correct for this
nonindependence in quasirandom studies, we used a cluster
adjustment calculator in Microsoft Excel based on methods developed by
Hedges (40) and implemented in several recent meta-analyses (42, 43).
Adjusting for clustering in our data required an estimate of the intraclass
correlation coefficient (ICC). None of our studies reported ICCs, however,
and to our knowledge, no studies have reported an ICC in college-level STEM
courses. Thus, to obtain an estimate for the ICC, we turned to the K–12
literature. A recent paper reviewed ICCs for academic achievement in
mathematics and reading for a national sample of K–12 students (44). We
used the mean ICC reported for mathematics (0.22) as a conservative estimate
of the ICC in college-level STEM classrooms. Note that although the
cluster correction has a large influence on the variance for each study, it
does not substantially influence the effect size point estimate.
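The logic of the cluster adjustment can be sketched as follows. This is a minimal illustration using the simplified equal-cluster-size formulas from Hedges' framework, not the Excel calculator used in the analysis, whose implementation may differ in detail; the effect size, variance, and cluster-size values below are hypothetical.

```python
import math

def hedges_cluster_adjust(d, var_d, n_total, cluster_size, icc=0.22):
    """Approximate cluster adjustment for a standardized mean difference.

    A sketch assuming equal cluster sizes: the point estimate receives a
    small downward correction, while the variance is inflated by the
    design effect 1 + (m - 1) * ICC, where m is the cluster size.
    """
    # Small correction to the effect size point estimate
    d_adj = d * math.sqrt(1 - 2 * (cluster_size - 1) * icc / (n_total - 2))
    # Variance inflation by the design effect
    deff = 1 + (cluster_size - 1) * icc
    var_adj = var_d * deff
    return d_adj, var_adj

# Hypothetical study: d = 0.47, var = 0.01, 100 students in classes of 25
d_adj, var_adj = hedges_cluster_adjust(0.47, 0.01, 100, 25)
```

With ICC = 0.22 and classes of 25 students, the design effect is 1 + 24 × 0.22 = 6.28, so the variance grows more than sixfold while the effect size shrinks only slightly (here from 0.47 to about 0.44), consistent with the note above that the correction strongly affects each study's variance but barely moves its point estimate.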