Protein sequence databases
Protein sequence databases are
categorised as primary, composite or
secondary. Primary databases contain
over 300,000 protein sequences and
function as a repository for the raw
data. Some of the more common repositories,
such as SWISS-PROT [3] and PIR-
International [23], annotate the
sequences and also describe the
proteins’ functions, domain structures
and post-translational modifications.
Composite databases such as OWL
[24] and the NRDB [25] compile and
filter sequence data from different
primary databases to produce
combined non-redundant sets that are more
complete than the individual databases