The traditional Lancefield classification system, which is based on serotyping, has been replaced by emm typing, a sequence-based method used to characterize and measure the genetic diversity among isolates of S pyogenes. This method targets a sequence at the 5' end of the emm locus, which is present in all isolates. The targeted region of emm displays the highest level of sequence polymorphism known for an S pyogenes gene; more than 150 emm types have been described to date.[4] The emm gene encodes the M protein.
There are 4 major subfamilies of emm genes, defined by sequence differences within the 3' end, which encodes the peptidoglycan-spanning domain. The chromosomal arrangement of emm subfamily genes reveals 5 major emm patterns, designated emm patterns A through E. An example of the usefulness of emm typing is described by McGregor et al.[