For each test item and for the test as a whole, we calculated and analysed confidence variables that were not included in earlier studies featuring three-tier tests. These include students’ confidence levels when their responses are wrong and when they are correct, as well as confidence calibration and confidence discrimination.
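As an illustration only, the sketch below shows one way such confidence variables might be computed for a single item or for pooled responses. The definitions are assumptions, not taken from the text above: confidence discrimination is taken as mean confidence when correct minus mean confidence when wrong, and confidence calibration is expressed as a bias score (mean confidence rescaled to the 0 to 1 range, minus the proportion of correct responses). The function name `confidence_summary` and the 1 to 6 rating scale are hypothetical.

```python
from statistics import mean


def confidence_summary(correct, confidence, scale_min=1, scale_max=6):
    """Summarise confidence ratings against response correctness.

    correct    -- list of booleans, one per response
    confidence -- list of ratings on an assumed scale_min..scale_max scale
    All definitions here are illustrative assumptions.
    """
    if len(correct) != len(confidence):
        raise ValueError("correct and confidence must have the same length")
    if not correct:
        raise ValueError("at least one response is required")

    # Split ratings by whether the response was correct or wrong.
    conf_correct = [c for ok, c in zip(correct, confidence) if ok]
    conf_wrong = [c for ok, c in zip(correct, confidence) if not ok]

    # Mean confidence when correct / when wrong (None if no such responses).
    cfc = mean(conf_correct) if conf_correct else None
    cfw = mean(conf_wrong) if conf_wrong else None

    # Assumed discrimination measure: difference of the two means.
    discrimination = None
    if cfc is not None and cfw is not None:
        discrimination = cfc - cfw

    # Assumed calibration measure: rescaled mean confidence minus accuracy.
    # Positive values suggest overconfidence, negative underconfidence.
    span = scale_max - scale_min
    rescaled = [(c - scale_min) / span for c in confidence]
    accuracy = mean(1.0 if ok else 0.0 for ok in correct)
    bias = mean(rescaled) - accuracy

    return {
        "mean_conf_correct": cfc,
        "mean_conf_wrong": cfw,
        "discrimination": discrimination,
        "calibration_bias": bias,
    }


if __name__ == "__main__":
    # Made-up ratings from four hypothetical students on one item.
    summary = confidence_summary([True, True, False, False], [6, 4, 2, 4])
    for name, value in summary.items():
        print(name, value)
```

The same function can be applied per item (responses of all students to one item) or to the whole test (all responses pooled), which mirrors the two levels of analysis described above.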