broad use. In our study, we could detect around 68% of the experimentally defined TFBSs in conserved segments (at a 65% relative matrix score threshold; see Figure 2). This differs slightly from the outcome of a study of conservation
properties proximal to TFBSs [29], which indicated that only around 50% of sites are situated in conserved regions. There are several key factors that may account for this difference. The procedures for defining the collections were different. For
by a stringent similarity threshold (> 80% identity over 40 bp). The previous study did not indicate any exclusion of pseudogenes or paralogous genes; including them would decrease sensitivity, because phylogenetic footprinting would be erroneously applied to genes evolving under distinct evolutionary pressures.
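To make a criterion of the form "> 80% identity over 40 bp" concrete, the following is a minimal sketch of a sliding-window identity check on a pairwise alignment. It is an illustration only, not the procedure of either study: the function name `conserved_windows`, its parameters, and the treatment of gap columns as mismatches are assumptions introduced here.

```python
def conserved_windows(seq_a, seq_b, window=40, min_identity=0.80):
    """Return start columns of alignment windows exceeding min_identity.

    seq_a and seq_b are the two rows of a pairwise alignment: equal
    length, with '-' marking gaps. Gap columns count as mismatches
    (an assumption made for this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")

    # One boolean per alignment column: True where both rows carry the same base.
    matches = [
        a == b and a != "-"
        for a, b in zip(seq_a.upper(), seq_b.upper())
    ]

    hits = []
    if len(matches) < window:
        return hits

    # Running count of matches in the current window, updated in O(1) per step.
    count = sum(matches[:window])
    for start in range(len(matches) - window + 1):
        if start > 0:
            count += matches[start + window - 1] - matches[start - 1]
        # Strict inequality mirrors the "> 80%" wording of the threshold.
        if count / window > min_identity:
            hits.append(start)
    return hits
```

Because the comparison is strict, a 40 bp window with exactly 32 matching columns (80%) would not qualify under this sketch, whereas one with 33 matching columns would.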
While the work presented here focuses on mammalian sequence comparisons, nothing in the ConSite system precludes studies of other organisms (the ConSite website includes samples with insect and nematode sequences). In the future it will be important to develop methods capable of analyzing multiple genomic sequences in parallel, but this is a