The goal of information retrieval is to find all documents in a collection that are relevant to a user query.
Decades of research in information retrieval were successful in developing and refining techniques that are solely
word-based (see, e.g., [2]). With the advent of the web, new sources of information became available, among them
the hyperlinks between documents and records of user behavior. To be precise, hypertexts (i.e., collections
of documents connected by hyperlinks) have existed and have been studied for a long time. What was new was the
large number of hyperlinks created by independent individuals. Hyperlinks provide a valuable source of information
for web information retrieval, as we will show in this article. This area of information retrieval is commonly
called link analysis.
Why would one expect hyperlinks to be useful? A hyperlink is a reference, contained in a web page A, to
a web page B. When the hyperlink is clicked in a web browser, the browser displays page B. This functionality
alone is not helpful for web information retrieval. However, the way hyperlinks are typically used by authors of
web pages can give them valuable information content. Typically, authors create links because they think they
will be useful for the readers of the pages. Thus, links are usually either navigational aids that, for example, bring
the reader back to the homepage of the site, or links that point to pages whose content augments the content of
the current page. The second kind of link tends to point to high-quality pages that might be on the same topic as
the page containing the link.
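
As a rough illustration of the two kinds of links described above, the following Python sketch represents a hyperlink as a pair (page containing the link, page referenced) and separates links by whether both pages are on the same host. The same-host test is only an assumed heuristic for spotting navigational links, not a method given in this article, and the names Hyperlink, is_same_host, and split_links are hypothetical.

```python
from typing import List, NamedTuple, Tuple
from urllib.parse import urlsplit


class Hyperlink(NamedTuple):
    source: str  # URL of the page that contains the link
    target: str  # URL of the page that the link references


def is_same_host(link: Hyperlink) -> bool:
    """Return True if both ends of the link are on the same host."""
    # urlsplit(...).hostname is lowercased, so the comparison
    # does not depend on the case used in the URL.
    return urlsplit(link.source).hostname == urlsplit(link.target).hostname


def split_links(links: List[Hyperlink]) -> Tuple[List[Hyperlink], List[Hyperlink]]:
    """Split links into (likely navigational, likely content-augmenting).

    Assumption: a link that stays on the same host is treated as a
    navigational aid, and any other link as a pointer to further content.
    """
    navigational = [link for link in links if is_same_host(link)]
    content = [link for link in links if not is_same_host(link)]
    return navigational, content
```

Under this heuristic, a link from http://example.com/a/page.html to http://example.com/index.html would count as navigational, while a link from the same page to a page on another host would count as content-augmenting. Real pages break the rule in both directions, so it is at best a first approximation.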