All of the collected data were lemmatized; that is, for each selected item, all inflected word forms sharing the same stem were listed under a single base form, and the base forms were alphabetized together with their frequencies of occurrence. Proper nouns and numerals were manually excluded from each set of materials, for “they are of high frequency in particular texts but not in others,…and they could not be sensibly pre-taught because their use in the text reveals their meaning” (Nation, 2001: 19-20).
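
The procedure described above can be illustrated with a minimal sketch. It is not the tool used in this study: the lemma table, the exclusion lists, and the function name `lemma_frequencies` are hypothetical and stand in for the full lemma list and the manual exclusion step.

```python
import re
from collections import Counter

# Hypothetical lookup from inflected form to base form (assumption for
# illustration); a real analysis would rely on a complete lemma list.
LEMMA_MAP = {
    "studies": "study",
    "studied": "study",
    "studying": "study",
    "words": "word",
}

# Hand-compiled exclusion lists, standing in for the manual exclusion step.
# Proper nouns are matched case-sensitively; numeral words in lower case.
PROPER_NOUNS = {"London", "Nation"}
NUMERAL_WORDS = {"one", "two", "three"}


def lemma_frequencies(text, lemma_map=LEMMA_MAP,
                      proper_nouns=PROPER_NOUNS,
                      numeral_words=NUMERAL_WORDS):
    """Return (base form, frequency) pairs in alphabetical order."""
    # Tokens are alphabetic words (with an optional apostrophe part) or digit runs.
    tokens = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?|\d+", text)
    counts = Counter()
    for token in tokens:
        if token in proper_nouns:
            continue  # proper noun: excluded
        lowered = token.lower()
        if lowered.isdigit() or lowered in numeral_words:
            continue  # numeral written as digits or as a word: excluded
        # Group the inflected form under its base form; unknown forms stay as-is.
        counts[lemma_map.get(lowered, lowered)] += 1
    return sorted(counts.items())
```

For the sample sentence "London studies two words. She studied 3 words while studying.", the function returns `[('she', 1), ('study', 3), ('while', 1), ('word', 2)]`: the three inflected forms of "study" are counted under one base form, and "London", "two", and "3" are left out.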