The MGE contigs we collected probably
represent a mixture of DNA from phages,
plasmids, and transposable elements, since
all of these mobile elements have been
found to be targeted by CRISPR
(Horvath and Barrangou 2010). To
specifically study phage genomes, we first
selected only MGE contigs sized 10 kb or
more (Reyes et al. 2010) since these are expected to represent a significant portion of the phage genome
and are also less likely to appear as disjoint fragments of the same
phage. We then classified these large contigs as phage-originated if
one or more predicted ORFs showed similarity to phage-only genes
(Supplemental Data Set S4; Methods). We also included large MGE
contigs that were significantly covered by the VLP-derived sequence
data sets (Reyes et al. 2010; Minot et al. 2011), because
these were directly extracted from virus-like particles (Methods).
This classification resulted in 991 contigs, totaling 22.3 Mb, which
most probably represent genomes of gut-residing phages. The
analyses henceforth refer to these 991 phage contigs.
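
The selection criteria described above can be summarized as a short Python sketch. It is an illustration only, not the pipeline used in this study. The `Contig` fields, the function names, and in particular the `MIN_VLP_COVERAGE` value of 0.5 are hypothetical: the text states only that contigs were "significantly covered" by VLP-derived data, and the actual criteria are given in Methods.

```python
from dataclasses import dataclass

# From the text: only MGE contigs of 10 kb or more are considered.
MIN_CONTIG_LENGTH_BP = 10_000

# ASSUMPTION: placeholder cutoff for "significantly covered" by VLP data.
# The real threshold is defined in Methods and is not reproduced here.
MIN_VLP_COVERAGE = 0.5


@dataclass
class Contig:
    """Hypothetical per-contig summary used only for this illustration."""
    name: str
    length_bp: int
    phage_only_orf_hits: int      # predicted ORFs similar to phage-only genes
    vlp_covered_fraction: float   # fraction of the contig covered by VLP reads


def is_phage_contig(
    contig: Contig,
    min_length_bp: int = MIN_CONTIG_LENGTH_BP,
    min_vlp_coverage: float = MIN_VLP_COVERAGE,
) -> bool:
    """Return True if the contig passes the size filter and has phage evidence."""
    # Step 1: size filter (10 kb or more).
    if contig.length_bp < min_length_bp:
        return False
    # Step 2a: one or more ORFs with similarity to phage-only genes.
    has_phage_gene = contig.phage_only_orf_hits >= 1
    # Step 2b: substantial coverage by VLP-derived sequence data.
    vlp_supported = contig.vlp_covered_fraction >= min_vlp_coverage
    # Either line of evidence is sufficient.
    return has_phage_gene or vlp_supported


def select_phage_contigs(contigs: list[Contig]) -> list[Contig]:
    """Keep only the contigs classified as phage-originated."""
    return [c for c in contigs if is_phage_contig(c)]
```

In this sketch the two lines of evidence are combined with a logical OR after the size filter, mirroring the order in which the criteria are described: a large contig is retained if it carries a phage-only gene hit or if it is well covered by VLP-derived reads.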