Finally, the necessity of crawling 100 million pages a day means that building a crawler is an exercise in distributed computing: it requires many machines that coordinate their work and schedule their requests so that every page is reached without overwhelming any one site with too many requests at once.
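
As a rough illustration of that scheduling problem, the sketch below shows a per-host "politeness" scheduler of the kind each crawler machine might run: URLs are queued by host, and a host is only eligible for another fetch after a minimum delay has passed. The class name, the two-second delay, and the example URLs are illustrative assumptions, not details from any particular crawler; a real system would also partition hosts across machines, honor robots.txt, and handle retries.

```python
import time
import heapq
from collections import deque
from urllib.parse import urlparse

POLITENESS_DELAY = 2.0  # assumed minimum seconds between requests to one host


class PoliteScheduler:
    """Queues URLs per host; releases a URL only when its host's delay has elapsed."""

    def __init__(self, delay=POLITENESS_DELAY):
        self.delay = delay
        self.queues = {}      # host -> deque of pending URLs
        self.ready_heap = []  # (next_allowed_time, host), ordered by earliest time

    def add(self, url):
        host = urlparse(url).netloc
        if host not in self.queues:
            self.queues[host] = deque()
            heapq.heappush(self.ready_heap, (time.monotonic(), host))
        self.queues[host].append(url)

    def pending(self):
        return bool(self.ready_heap)

    def next_url(self):
        """Return the next URL whose host may be contacted now, or None."""
        while self.ready_heap:
            next_time, host = self.ready_heap[0]
            if not self.queues[host]:
                heapq.heappop(self.ready_heap)  # host drained; drop its entry
                del self.queues[host]
                continue
            if time.monotonic() < next_time:
                return None  # every host with pending URLs is still cooling down
            heapq.heappop(self.ready_heap)
            url = self.queues[host].popleft()
            # Re-schedule this host only after the politeness delay.
            heapq.heappush(self.ready_heap, (time.monotonic() + self.delay, host))
            return url
        return None


if __name__ == "__main__":
    sched = PoliteScheduler(delay=1.0)
    for u in ["http://example.com/a", "http://example.com/b", "http://example.org/x"]:
        sched.add(u)
    while sched.pending():
        url = sched.next_url()
        if url:
            print("fetch", url)  # a real crawler would download and parse the page here
        else:
            time.sleep(0.1)      # wait for some host to come off its cooldown
```

Run as written, the scheduler fetches one page from each host immediately, then waits out the delay before returning to example.com, which is exactly the coordination a fleet of crawler machines must agree on at much larger scale.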