Web search differs in several major ways from an application that provides
search over, for example, a collection of news stories. The primary differences are
the size of the collection (billions of documents), the connections between documents
(i.e., links), the range of document types, the volume of queries (tens of
millions per day), and the types of queries. We have discussed some of these issues
in previous chapters, and others, such as the impact of spam, will be discussed
later. In this section, we focus on the features of the queries and documents
that are most important for the ranking algorithm.