Naming Objects
If a relation is fragmented and replicated, we must be able to uniquely identify each replica of each fragment. Generating such unique names requires some care. If we use a global name-server to assign globally unique names, local autonomy is compromised; we want (users at) each site to be able to assign names to local objects without reference to names systemwide.
The usual solution to the naming problem is to use names consisting of several fields. For example, we could have:
A local name field, which is the name assigned locally at the site where the relation is created. Two objects at different sites could possibly have the same local name, but two objects at a given site cannot have the same local name.
A birth site field, which identifies the site where the relation was created, and where information is maintained about all fragments and replicas of the relation.
These two fields identify a relation uniquely; we call the combination a global relation name. To identify a replica (of a relation or a relation fragment), we take the global relation name and add a replica-id field; we call the combination a global replica name.
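The field structure described above can be sketched as a pair of immutable records. This is only an illustration: the class names, field types, and the sample site names are assumptions made for the example, not part of any particular system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GlobalRelationName:
    local_name: str  # unique only within the birth site
    birth_site: str  # site where the relation was created


@dataclass(frozen=True)
class GlobalReplicaName:
    relation: GlobalRelationName
    replica_id: int  # distinguishes replicas of the relation or fragment


# Two sites pick the same local name without consulting each other
# (hypothetical site names).
at_london = GlobalRelationName("Sailors", "London")
at_paris = GlobalRelationName("Sailors", "Paris")

# Two replicas of the London relation share its global relation name.
replica_1 = GlobalReplicaName(at_london, 1)
replica_2 = GlobalReplicaName(at_london, 2)
```

Because the birth site is part of the name, the two Sailors relations remain distinct even though neither site consulted the other when choosing a local name.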
Catalog Structure
A centralized system catalog can be used, but it is vulnerable to failure of the site containing the catalog. An alternative is to maintain a copy of a global system catalog, which describes all the data, at every site. Although this approach is not vulnerable to a single-site failure, it compromises site autonomy, just like the first solution, because every change to a local catalog must now be broadcast to all sites.
A better approach, which preserves local autonomy and is not vulnerable to a single-site failure, was developed in the R* distributed database project, which was a successor to the System R project at IBM. Each site maintains a local catalog that describes all copies of data stored at that site. In addition, the catalog at the birth site for a relation is responsible for keeping track of where replicas of the relation (in general, of fragments of the relation) are stored. In particular, a precise description of each replica's contents (a list of columns for a vertical fragment or a selection condition for a horizontal fragment) is stored in the birth site catalog. Whenever a new replica is created, or a replica is moved across sites, the information in the birth site catalog for the relation must be updated.
In order to locate a relation, the catalog at its birth site must be looked up. This catalog information can be cached at other sites for quicker access, but the cached information may become out of date if, for example, a fragment is moved.
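A minimal sketch of this lookup, assuming a dictionary-based catalog, may help make the staleness problem concrete. The Site class, the locate and move_replica helpers, and the site names are all hypothetical; a relation is written here as a (local name, birth site) pair.

```python
class Site:
    def __init__(self, name):
        self.name = name
        # Entries for relations born at this site:
        # global relation name -> set of sites holding a replica.
        self.birth_catalog = {}
        # Copies of entries fetched from other birth sites; may be stale.
        self.cache = {}


def locate(sites, asking_site, relation, use_cache=True):
    """Return the sites believed to hold replicas of `relation`."""
    asker = sites[asking_site]
    if use_cache and relation in asker.cache:
        return asker.cache[relation]
    birth_site = sites[relation[1]]
    locations = frozenset(birth_site.birth_catalog[relation])
    asker.cache[relation] = locations  # remember for later lookups
    return locations


def move_replica(sites, relation, source, target):
    """Record a replica move; only the birth site catalog is updated."""
    entry = sites[relation[1]].birth_catalog[relation]
    entry.discard(source)
    entry.add(target)


sites = {name: Site(name) for name in ("London", "Paris", "Tokyo")}
sailors = ("Sailors", "London")
sites["London"].birth_catalog[sailors] = {"London", "Paris"}

first = locate(sites, "Tokyo", sailors)   # fetched from the birth site
move_replica(sites, sailors, "Paris", "Tokyo")
stale = locate(sites, "Tokyo", sailors)   # served from Tokyo's cache
fresh = locate(sites, "Tokyo", sailors, use_cache=False)
```

In this sketch the cached answer still names Paris after the replica has moved, and only a lookup that goes back to the birth site sees the change. How a real system detects and repairs such stale entries is not covered here.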
Distributed Data Independence
Distributed data independence means that users should be able to write queries without regard to how a relation is fragmented or replicated; it is the responsibility of the DBMS to compute the relation as needed, by locating suitable copies of fragments, joining the vertical fragments, and taking the union of horizontal fragments.
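The reconstruction step can be illustrated with a small sketch in which rows are dictionaries. The function names and sample data are assumptions made for the example, and the join assumes a lossless decomposition in which every vertical fragment carries the key of every tuple.

```python
def join_vertical(fragments, key):
    """Join vertical fragments (lists of dict rows) on a shared key column.

    Assumes every fragment contains a row for every key value.
    """
    merged = {}
    for fragment in fragments:
        for row in fragment:
            merged.setdefault(row[key], {}).update(row)
    return list(merged.values())


def union_horizontal(fragments):
    """Union horizontal fragments, dropping duplicate rows."""
    seen, result = set(), []
    for fragment in fragments:
        for row in fragment:
            marker = tuple(sorted(row.items()))
            if marker not in seen:
                seen.add(marker)
                result.append(row)
    return result


# Hypothetical layout: one horizontal fragment is stored whole, the other
# is further split into two vertical fragments that share the key "sid".
low = [{"sid": 1, "sname": "Ann", "rating": 3}]
high_names = [{"sid": 2, "sname": "Bob"}, {"sid": 3, "sname": "Cy"}]
high_ratings = [{"sid": 2, "rating": 8}, {"sid": 3, "rating": 9}]

high = join_vertical([high_names, high_ratings], "sid")
sailors = union_horizontal([low, high])
```

A query written against the whole relation would see only the final list; the fragment layout stays hidden behind these two operations.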
In particular, this property implies that users should not have to specify the full name for the data objects accessed while evaluating a query. Let us see how users can be enabled to access relations without considering how the relations are distributed. The local name of a relation in the system catalog is really a combination of a user name and a user-defined relation name. Users can give whatever names they wish to their relations, without regard to the relations created by other users. When a user writes a program or SQL statement that refers to a relation, he or she simply uses the relation name. The DBMS adds the user name to the relation name to get a local name, then adds the user's site-id as the (default) birth site to obtain a global relation name. By looking up the global relation name (in the local catalog if it is cached there, or in the catalog at the birth site), the DBMS can locate replicas of the relation.
A user may want to create objects at several sites or to refer to relations created by other users. To do this, a user can create a synonym for a global relation name, using an SQL-style command, and can subsequently refer to the relation using the synonym. For each user known at a site, the DBMS maintains a table of synonyms as part of the system catalog at that site, and uses this table to find the global relation name. A user's program will run unchanged even if replicas of the relation are moved, because the global relation name is never changed until the relation itself is destroyed.
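The name resolution just described can be sketched as follows. The resolve function, the layout of the synonym table, and the user and site names are hypothetical; a global relation name is written here as a pair of (local name, birth site), where the local name is itself a (user name, relation name) pair.

```python
def resolve(name, user, site, synonyms):
    """Map a relation name used in a query to a global relation name."""
    # Synonym table kept per user known at a site (assumed layout).
    user_synonyms = synonyms.get((site, user), {})
    if name in user_synonyms:
        return user_synonyms[name]
    # Default: qualify with the user name, then use the user's own site
    # as the birth site.
    local_name = (user, name)
    return (local_name, site)


# Joe, working at Paris, has a synonym for a relation created by Ann
# at London.
synonyms = {
    ("Paris", "joe"): {"Boats": (("ann", "Boats"), "London")},
}

own = resolve("Sailors", "joe", "Paris", synonyms)
shared = resolve("Boats", "joe", "Paris", synonyms)
```

Neither result mentions where replicas are currently stored, which is why a program that uses these names keeps working after replicas move.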