1 Introduction
The Internet and the Web have grown by leaps and bounds over the past
few years, aggravating the problem of information explosion, a phenomenon
familiar to all of us. According to Nature [1], the publicly indexable Web
contained an estimated 800 million pages as of February 1999. The growing
number of search engines (SEs), now more than 2400, enables us to access
cyberspace, but these engines also flood us with vast amounts of irrelevant
information. At the same time, search engine coverage has recently decreased
substantially: no engine indexes more than about 16% of the estimated size
of the publicly indexable Web [1].
The article is structured as follows. This section presents the resource
repository hierarchy, defines the notion of a library, and traces the
development from paper to digital libraries. The following section classifies
digital libraries, compares the different types, and introduces the logical
harvesting model. We conclude with a discussion.