Several outcomes are relevant to whether and how students used feedback after testing. First, regardless of the outcome of the practice test, participants sought some kind of feedback (either studying or judging) on the majority of trials. Second, when a test trial resulted in an incorrect response, students typically chose to study the definition. This choice is consistent with theory of self-paced study, which indicates that students restudy content that they cannot recall (e.g., Metcalfe & Kornell, 2005). However, because few test responses were incorrect (approximately 12%), this pattern requires replication. Third, when participants recalled part or all of a definition correctly, they most frequently chose to judge their answers. Thus, students do seek feedback after testing, and their choice between the two forms of feedback (studying or judging) appears to be guided by the quality of their test responses. This use of test outcomes to guide learning can be readily explained by theory of self-paced study, which we consider in the General Discussion. Nevertheless, we did not have a priori predictions about these outcomes, so Experiments 2 and 3 will provide critical replications.
3.2.3. Do students continue to test themselves until correctly recalling each definition?
Based on experimenter scoring of test responses during practice, we computed the number of times that participants actually recalled each definition correctly. Mean values are presented in Table 3. First, participants who regulated their own learning did not continue to practice until they had correctly recalled definitions multiple times, or even once. Second, although the Criterion 3 group achieved a higher criterion than the Criterion 1 group, t(58) = 3.65, d = .94, the Criterion 3 group did not meet its set criterion. Failure to meet the criterion was not entirely unexpected; prior research has shown that participants tend to be somewhat overconfident when making the idea-unit judgments that were used to track performance during practice, which would reduce the functional criterion achieved (for further discussion and evidence, see Dunlosky & Rawson, 2012). 2
Table 3.
Criterion level: mean number of times a definition was correctly recalled during practice.
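| Experiment | Group | Mean | SEM |
|---|---|---|---|
| Experiment 1 | SRL | .78 | .16 |
| Experiment 1 | Criterion 1 | .95 | .13 |
| Experiment 1 | Criterion 3 | 1.90 | .24 |
| Experiment 2 | SRL | .69 | .14 |
| Experiment 2 | SRL criterion | .50 | .11 |
| Experiment 2 | Criterion 3 | 1.82 | .26 |
| Experiment 3 | SRL | 1.22 | .21 |
| Experiment 3 | SRL incentive | .78 | .17 |
| Experiment 3 | Criterion 3 | 2.86 | .20 |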
Note. SRL = self-regulated learning. SEM = standard error of the mean.
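As an illustrative check on the effect size reported for the Experiment 1 comparison of the Criterion 3 and Criterion 1 groups, t(58) = 3.65, d = .94, the sketch below converts an independent-samples t statistic to Cohen's d. The group sizes are an assumption: 58 degrees of freedom imply 60 participants across the two groups, and an equal split of 30 per group is supposed here because the section does not report it. The function names are hypothetical and this is not a description of the authors' analysis.

```python
import math


def degrees_of_freedom(n1: int, n2: int) -> int:
    """Degrees of freedom for an independent-samples t test."""
    return n1 + n2 - 2


def cohens_d_from_t(t: float, n1: int, n2: int) -> float:
    """Convert an independent-samples t statistic to Cohen's d.

    Uses the standard identity d = t * sqrt(1/n1 + 1/n2).
    """
    return t * math.sqrt(1.0 / n1 + 1.0 / n2)


# ASSUMPTION: 30 participants per group (not stated in this section);
# chosen only because it is the equal split consistent with df = 58.
n_criterion_1 = 30
n_criterion_3 = 30

df = degrees_of_freedom(n_criterion_1, n_criterion_3)
d = cohens_d_from_t(3.65, n_criterion_1, n_criterion_3)

print(df)           # 58
print(round(d, 2))  # 0.94
```

Under the equal-split assumption, the converted value matches the reported d = .94; a different split of the 60 participants would yield a slightly larger d for the same t.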