For each group of questions, a benchmark sample illustrating each score was prepared. Each student’s work was checked independently by two markers. A high degree of consistency was evident across markers in all three countries. Disagreements between markers were usually resolved by the markers themselves; typically, one had missed an important clue. Very rarely, such disagreements were referred to a supervising researcher. Two student responses showing very clear relational thinking are given for each group of items in Figure 2.