Problem Statement: There have been many attempts to research the effective
assessment of writing ability, and many proposals for how it might be done.
Rater reliability plays a crucial role in making high-stakes decisions about
test takers at various turning points in both educational and professional
life. The intra-rater and inter-rater reliability of essay assessments made
with different assessment tools should therefore be examined alongside the
assessment process itself.
Purpose of Study: The purpose of the study is to reveal possible variation or
consistency in the grading of EFL writers' essay writing ability by the same
and by different raters using general impression marking (GIM), an essay
criteria checklist (ECC), and an essay assessment scale (ESAS), and to discuss
rater reliability.
Methods: Quantitative and qualitative data were used to frame the discussion
and implications for the reliability of the ratings and the consistency of the
measurement results. The assessment tools were applied to essays written by 44
EFL university students, and 10 raters assessed the students' essay writing
ability using GIM, ECC, and ESAS on different occasions.
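As a rough illustration of the kind of inter-rater analysis such a design
invites (the study's actual procedure is not specified in this abstract), the
sketch below computes pairwise Pearson correlations between raters' score
vectors for one tool; the array shape, score range, and variable names are
illustrative assumptions only.

# Illustrative sketch, not the study's code: pairwise inter-rater
# correlations for a single assessment tool (e.g., ESAS).
import numpy as np
from itertools import combinations
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
# Hypothetical data: 10 raters x 44 students (shapes match the abstract).
scores = rng.integers(50, 100, size=(10, 44)).astype(float)

# Inter-rater reliability proxy: correlate every pair of raters' scores.
pairwise_r = [pearsonr(scores[i], scores[j])[0]
              for i, j in combinations(range(scores.shape[0]), 2)]
print(f"mean pairwise inter-rater r = {np.mean(pairwise_r):.2f}")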
Findings and Results: The analyses indicated that general impression marking
is not a reliable way to assess essays. The correlation coefficients,
estimated variance components, and generalizability coefficients obtained
from the checklist and scale assessments provide valuable information and
clearly show that some variation among the results always remains.
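For reference, in a simple one-facet person-by-rater (p x r) design, the
generalizability coefficient mentioned above is conventionally estimated from
the variance components as (a standard textbook form; the study's exact
G-study design is not stated in this abstract):

E\rho^2 = \frac{\hat{\sigma}^2_p}{\hat{\sigma}^2_p + \hat{\sigma}^2_{pr,e} / n_r}

where \hat{\sigma}^2_p is the person (true-score) variance,
\hat{\sigma}^2_{pr,e} is the person-by-rater interaction confounded with
error, and n_r is the number of raters, so reliability rises as rater-linked
variance shrinks or more raters are used.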