Before the information is divided into smaller units, there is a need
to determine the size, or granularity, of each meaningful unit. The finer the
granularity of each unit, the more tedious and time-consuming
the cataloging effort will be. Let us take, for example, the case of cataloging a
book describing how to build a particular machine. There are several questions
that we need to ask. Should we consider the entire book as one unit and catalog
it as such? Or should we consider one chapter, one section, or one paragraph
of the book as one unit? The larger the unit, the more difficult it
is to find the exact information one is looking for. In some cases, the manner
of dividing into units is obvious. One example is an anthology
of short essays by different authors, which divides naturally into
units of one essay each. Some
products implicitly assume a level of granularity. For instance, Index Server is
based on individual words. On the other hand, databases usually work with
fields and records as units for searching and retrieval.
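
To make the trade-off concrete, here is a minimal Python sketch that splits the same text into units at four levels of granularity. The sample book, the "# " and "## " markers for chapter and section headings, and the function names parse_paragraphs and split_into_units are all assumptions made for this illustration; they are not taken from any product mentioned above.

```python
# Hypothetical sample text. The "# " (chapter) and "## " (section) heading
# markers are an assumed convention chosen only for this illustration.
SAMPLE_BOOK = """# Chapter 1
## Section 1.1
Gather the parts.

Check each part against the list.
## Section 1.2
Lay out the frame.
# Chapter 2
## Section 2.1
Bolt the frame together.

Attach the motor.
"""


def parse_paragraphs(book):
    """Return (chapter, section, paragraph) triples in document order."""
    chapter = section = ""
    buffer = []
    triples = []

    def flush():
        # Close the paragraph collected so far, if there is one.
        if buffer:
            triples.append((chapter, section, " ".join(buffer)))
            buffer.clear()

    for line in book.splitlines():
        if line.startswith("## "):
            flush()
            section = line[3:].strip()
        elif line.startswith("# "):
            flush()
            chapter = line[2:].strip()
            section = ""
        elif line.strip() == "":
            flush()
        else:
            buffer.append(line.strip())
    flush()
    return triples


def split_into_units(book, granularity):
    """Group paragraphs into units at the requested granularity."""
    depth = {"book": 0, "chapter": 1, "section": 2, "paragraph": 3}[granularity]
    units = {}
    for index, (chapter, section, text) in enumerate(parse_paragraphs(book)):
        # A shorter key merges more paragraphs into one, coarser unit.
        key = ("book", chapter, section, index)[: depth + 1]
        units.setdefault(key, []).append(text)
    return {key: " ".join(parts) for key, parts in units.items()}


if __name__ == "__main__":
    for level in ("book", "chapter", "section", "paragraph"):
        units = split_into_units(SAMPLE_BOOK, level)
        hits = [text for text in units.values() if "motor" in text]
        print(level, "units:", len(units), "| length of matching unit:", len(hits[0]))
```

With this sample the number of units to catalog grows from 1 (book) to 2 (chapters), 3 (sections) and 5 (paragraphs), while the unit returned for a search on "motor" shrinks from the whole text to a single sentence. This is the same trade-off described above: finer units cost more cataloging effort but lead the reader closer to the exact information.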