The original goal of sequencing was to determine
the precise order of nucleotides in a gene, but the aim
soon became the sequencing of an entire small genome. A
genome is the complete content of genetic information
in an organism, i.e. all the genes and other sequences
it contains. The first target was the genome of a
small virus called φX174, followed by larger plasmid and
viral genomes, then chromosomes and microbial
genomes, until ultimately the complete genomes of
higher eukaryotes were sequenced (Table 1.1). In
the mid-1980s, scientists began to discuss seriously
how the entire human genome might be sequenced.
To put these discussions in context, the largest
stretch of DNA that can be sequenced in a single pass
(even today) is 600–800 nucleotides and the largest
genome that had been sequenced in 1985 was that
of the 172-kb Epstein–Barr virus (Baer et al. 1984).
By comparison, the human genome is 3000 Mb in
size, over 17,000 times bigger! One school of thought
was that a completely new sequencing methodology
would be required, and a number of different
technologies were explored, but with little success. Early
on, however, it was realized that existing sequencing
technology could be used if a large genome could
be broken down into more manageable pieces for
sequencing in a highly parallel fashion, and then the
pieces could be joined together again. A strategy was
agreed upon in which a map of the human genome
would be used as a scaffold to assemble the sequence.
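
The scale of the problem can be made concrete with a short calculation. The sketch below is illustrative only: it uses the figures quoted above (a 3000-Mb human genome, the 172-kb Epstein–Barr virus genome, and single-pass reads of 600–800 nucleotides) and assumes an idealized case in which reads do not overlap and cover the genome exactly once. The function names and the toy sequence are hypothetical, and the map-based reassembly shown here is a deliberately simplified stand-in for the strategy described in the text, not the procedure actually used.

```python
import math
import random

# Figures quoted in the text.
HUMAN_GENOME_BP = 3_000 * 10**6   # 3000 Mb
EBV_GENOME_BP = 172 * 10**3       # 172-kb Epstein-Barr virus genome
READ_LENGTHS_BP = (600, 800)      # longest single-pass read, in nucleotides


def size_ratio(large_bp, small_bp):
    """How many times larger one genome is than another."""
    return large_bp / small_bp


def min_reads(genome_bp, read_bp):
    """Fewest reads needed to cover a genome once.

    Assumption: reads are non-overlapping and perfectly placed,
    which never holds in practice.
    """
    return math.ceil(genome_bp / read_bp)


def fragment_with_map(sequence, piece_len):
    """Cut a sequence into pieces, recording each piece's start position.

    The recorded positions play the role of the 'map' (toy model).
    """
    return [(start, sequence[start:start + piece_len])
            for start in range(0, len(sequence), piece_len)]


def assemble_from_map(pieces):
    """Rejoin pieces by ordering them according to their map positions."""
    ordered = sorted(pieces, key=lambda piece: piece[0])
    return "".join(seq for _, seq in ordered)


if __name__ == "__main__":
    ratio = size_ratio(HUMAN_GENOME_BP, EBV_GENOME_BP)
    print(f"Human genome is about {ratio:,.0f} times larger than EBV")
    for read_len in READ_LENGTHS_BP:
        reads = min_reads(HUMAN_GENOME_BP, read_len)
        print(f"{read_len}-nt reads: at least {reads:,} reads")

    toy_genome = "ACGTTGCAAGGCTTAACCGT"    # hypothetical toy sequence
    pieces = fragment_with_map(toy_genome, 4)
    random.Random(0).shuffle(pieces)        # pieces are sequenced in no particular order
    print(assemble_from_map(pieces) == toy_genome)
```

Even in this idealized case several million separate reads would be needed, which is why breaking the genome into manageable pieces and sequencing them in parallel was so important. In practice each region must be read several times over, so the real number of reads is considerably higher.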