IT IS an idea any overworked teacher would welcome – computers that automatically mark piles of exams and homework. Tens of thousands of students around the US are already being evaluated by such systems. But can we trust the artificial intelligence that powers them to make appropriate judgements? Two new real-world tests suggest that it can work surprisingly well.
In one experiment, conducted at the University of Central Florida in Orlando between January and May this year, Pam Thomas tracked the impact of an automated grading system on the performance of the 1000 or so students in her first-year biology class.
The students answered questions using SAGrader, an online service developed by Idea Works of Columbia, Missouri. SAGrader parsed their answers, which could be several paragraphs long, using artificial intelligence techniques designed to extract meaning from text. For example, given the phrase “the heart pumps blood”, SAGrader would identify two entities in the text – “heart” and “blood” – and the relationship between them. These details were then checked against a marking rubric compiled by Thomas.
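To give a flavour of the approach, the sketch below shows one way the entity-and-relationship idea could be implemented: it uses the open-source spaCy library to pull (subject, verb, object) triples from a student's answer and compares them with a simple rubric of required facts. This is purely illustrative – the rubric format, scoring rule and choice of library are assumptions, not SAGrader's actual method.

```python
# Minimal sketch of rubric-based grading via relationship extraction.
# Assumes spaCy and its small English model are installed
# (pip install spacy && python -m spacy download en_core_web_sm).
import spacy

nlp = spacy.load("en_core_web_sm")

# Hypothetical rubric: each required fact is a (subject, verb, object) lemma triple.
RUBRIC = {
    ("heart", "pump", "blood"),
    ("artery", "carry", "blood"),
}


def extract_triples(text):
    """Return the set of (subject, verb, object) lemma triples found in the text."""
    doc = nlp(text)
    triples = set()
    for token in doc:
        if token.pos_ == "VERB":
            subjects = [c for c in token.children if c.dep_ == "nsubj"]
            objects = [c for c in token.children if c.dep_ in ("dobj", "obj")]
            for subj in subjects:
                for obj in objects:
                    triples.add((subj.lemma_.lower(),
                                 token.lemma_.lower(),
                                 obj.lemma_.lower()))
    return triples


def grade(answer):
    """Score an answer against the rubric and report which facts are missing."""
    found = extract_triples(answer)
    missing = RUBRIC - found
    score = 100 * (len(RUBRIC) - len(missing)) / len(RUBRIC)
    return score, missing


if __name__ == "__main__":
    score, missing = grade("The heart pumps blood around the body.")
    print(f"Score: {score:.0f}%")
    for subj, verb, obj in missing:
        print(f"Feedback: your answer does not mention that the {subj} {verb}s {obj}.")
```

In this toy version, the immediate feedback Thomas describes would simply be the list of rubric facts the parser failed to find, which the student could address before resubmitting.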
She says her students like SAGrader because it provides feedback on their work in less than a minute. They can then resubmit their work, having taken the feedback into account, and may gain a higher score.
The software also had a fringe benefit. In end-of-term multiple-choice tests, the average mark scored by Thomas’s students was 12.5 per cent higher than in previous years. “We were amazed,” she says. She attributes the improvement to SAGrader encouraging students to work through problems multiple times. “It taught them how to think through the test questions.”