Search engines are critically important in helping users find relevant information on the
World Wide Web. In order to best serve the needs of users, a search engine must find
and filter the most relevant information matching a user’s query, and then present that
information in a manner that makes it most readily palatable to the user.
Moreover, the task of information retrieval and presentation must be done in a
scalable fashion to serve the hundreds of millions of user queries that are issued every
day to a popular web search engine such as Google.
In addressing the problem of information retrieval on the web, there are a number
of challenges in which Artificial Intelligence (AI) techniques can be successfully
brought to bear. We outline some of these challenges in this paper and identify
additional problems that may motivate future work in the AI research community.
We also describe some work in these areas that has been conducted at Google.
We begin by briefly outlining some of the issues that arise in web information
retrieval that distinguish it from research traditionally done in Information
Retrieval (IR), and then focus on more specific problems. Section 2 describes the
unique properties of information retrieval on the web. Section 3 presents a statistical
method for determining similarity in text motivated by both AI and IR methodologies