On the other hand, recent progress in genome
science has provided a wealth of genome
sequences, and several sensitive homology search
programs have been developed to detect homologs
with low sequence similarity. Together, these
advances have made it possible to predict tertiary
structures for almost half of the proteins encoded
in a genome.16 This situation allows us to compare not
only the primary structures of the amino acid
sequences but also the three-dimensional structures
on a genome-wide scale. A topical example
of this newly developed computational approach
is its application to characterizing the
proteins of thermophilic bacteria.
Proteins from thermophiles are known to
have a biased amino acid composition, with an
abundance of charged residues and few polar
residues. In previous work,17 we performed a
systematic comparison between proteins from
thermophilic and mesophilic bacteria in terms of
the amino acid composition of the protein surface
and the interior. We found that the difference in
amino acid composition between thermophiles
and mesophiles is more pronounced on the
protein surface than in the interior. This
characteristic of the amino acid composition
observed in the thermophilic proteins suggested