The two coders jointly coded all issues of Volume 11 (1990) to investigate the level of interrater reliability in the coding process. The coders reported 95 percent agreement in their first trial. The key area of disagreement centered on how to measure sample size and data source when multiple samples were used in the same paper. The coders agreed to record the size of the largest sample in a given paper and to create a category for mixed primary and secondary data. Papers were classified as ‘nonempirical’ if they contained no empirical data. The remaining papers were classified as empirical. A paper was classified as a case study if the sample size was one or if the author explicitly referred to the paper as a case study. Once the coders agreed on a final coding scheme, the level of interrater reliability rose to 100 percent in a second trial.
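The coding rules and the agreement figure described above can be illustrated with a minimal Python sketch. It rests on several assumptions that the text does not confirm: that interrater reliability was computed as simple percent agreement (matching codes divided by total codes), that a paper counts as empirical when it reports at least one sample, and that the case-study rule is applied to the recorded (largest) sample size. The `Paper` record, its field names, and the example codes are hypothetical and serve only as illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Paper:
    # Hypothetical record; the field names are illustrative, not from the study.
    sample_sizes: Sequence[int] = ()   # one entry per sample; empty if no empirical data
    called_case_study: bool = False    # the author explicitly calls the paper a case study


def recorded_sample_size(paper: Paper) -> Optional[int]:
    """Return the size of the largest sample, or None if the paper has no sample."""
    return max(paper.sample_sizes) if paper.sample_sizes else None


def is_empirical(paper: Paper) -> bool:
    """Assumption: a paper is empirical if it reports at least one sample."""
    return len(paper.sample_sizes) > 0


def is_case_study(paper: Paper) -> bool:
    """Case study if the recorded sample size is one or the author says so."""
    return recorded_sample_size(paper) == 1 or paper.called_case_study


def percent_agreement(codes_a: Sequence[str], codes_b: Sequence[str]) -> float:
    """Simple percent agreement between two coders over the same items."""
    if len(codes_a) != len(codes_b):
        raise ValueError("both coders must code the same number of items")
    if not codes_a:
        raise ValueError("at least one coded item is required")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100.0 * matches / len(codes_a)


# Made-up codes for four papers; the coders disagree on the last one.
coder_a = ["empirical", "nonempirical", "empirical", "empirical"]
coder_b = ["empirical", "nonempirical", "empirical", "nonempirical"]
print(percent_agreement(coder_a, coder_b))  # 75.0
```

Simple percent agreement does not correct for agreement expected by chance; a chance-corrected statistic such as Cohen's kappa would need the full table of codes from both coders.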