There are also important storage distinctions between reference types and nonreference types, which might affect performance:
Storage overhead: Storing copies of a large value in multiple structured type objects may use much more space than storing the value once and referring to it elsewhere through reference type objects. This additional storage requirement can affect both disk usage and buffer management.
Clustering: The subparts of a structured object are typically stored together on disk. Objects with references may point to other objects that are far away on the disk, and the disk arm may require significant movement to assemble the object and its references together. Structured objects can thus be more efficient than reference types if they are typically accessed in their entirety.
Many of these issues also arise in traditional programming languages such as C or Pascal, which distinguish between the notions of referring to objects by value and by reference. In database design, the choice between using a structured type or a reference type will typically include consideration of the storage costs, clustering issues, and the effect of updates.
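The same trade-off can be sketched in a general-purpose language. The following Python fragment is only an illustration, not DBMS code: the 1 MB `large_value` and the dictionary "objects" are hypothetical stand-ins for a large attribute value and the objects that contain or refer to it.

```python
# Hypothetical "large value": a 1 MB blob standing in for a big attribute.
large_value = bytearray(1_000_000)

# Structured-type style: each object embeds its own copy of the value.
embedded = [{"id": i, "payload": bytearray(large_value)} for i in range(3)]

# Reference-type style: each object refers to the single shared value.
referenced = [{"id": i, "payload": large_value} for i in range(3)]

# Storage overhead: count how many distinct payload objects exist.
distinct_embedded = len({id(o["payload"]) for o in embedded})
distinct_referenced = len({id(o["payload"]) for o in referenced})

# Effect of updates: a change made through one reference is seen by every
# referrer, whereas each embedded copy has to be updated separately.
referenced[0]["payload"][0] = 7
embedded[0]["payload"][0] = 7
shared_update_visible = referenced[1]["payload"][0] == 7
copy_update_visible = embedded[1]["payload"][0] == 7
```

Here the embedded design holds three separate copies of the value and the referenced design holds one, and only the referenced design propagates the update to the other objects.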
Object Identity versus Foreign Keys
Using an oid to refer to an object is similar to using a foreign key to refer to a tuple in another relation, but not quite the same: An oid can point to an object of theater_t that is stored anywhere in the database, even in a field, whereas a foreign key reference is constrained to point to an object in a particular referenced relation. This restriction makes it possible for the DBMS to provide much greater support for referential integrity than for arbitrary oid pointers. In general, if an object is deleted while there are still oid pointers to it, the best the DBMS can do is to recognize the situation by maintaining a reference count. (Even this limited support becomes impossible if oids can be copied freely.) Thus, the responsibility for avoiding dangling references rests largely with the user if oids are used to refer to objects. This burdensome responsibility suggests that we should use oids with great caution and use foreign keys instead whenever possible.
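The contrast can be made concrete with a small experiment. The sketch below uses Python's standard `sqlite3` module purely as a convenient relational engine; the table names `theaters` and `nowshowing`, the sample rows, and the dictionary that simulates an oid-addressed object store are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE theaters (tno INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE nowshowing ("
    " film TEXT,"
    " tno INTEGER NOT NULL REFERENCES theaters(tno))"
)
conn.execute("INSERT INTO theaters VALUES (1, 'Majestic')")
conn.execute("INSERT INTO nowshowing VALUES ('Some Film', 1)")

# Foreign key: the engine rejects a delete that would leave a dangling reference.
try:
    conn.execute("DELETE FROM theaters WHERE tno = 1")
    fk_delete_blocked = False
except sqlite3.IntegrityError:
    fk_delete_blocked = True

# oid-style pointer, simulated with a dictionary keyed by a made-up oid.
# Nothing stops the delete, so the pointer is silently left dangling.
objects = {101: {"name": "Majestic"}}
showing = {"film": "Some Film", "theater_oid": 101}
del objects[101]
oid_is_dangling = showing["theater_oid"] not in objects
```

With the foreign key, the attempted delete fails and the reference stays valid; with the simulated oid, detecting the dangling pointer is left to the application.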
Extending the ER Model
The definition of Probes in Figure 25.8 has two new aspects. First, it has a structured-type attribute listof(row(time, lat, long)); each value assigned to this attribute in a Probes entity is a list of tuples with three fields. Second, Probes has an attribute called videos that is an abstract data type object, which is indicated by a dark oval for this attribute with a dark line connecting it to Probes. Further, this attribute has an 'attribute' of its own, which is a method of the ADT.
Alternatively, we could model each video as an entity by using an entity set called Videos. The association between Probes entities and Videos entities could then be captured by defining a relationship set that links them. Since each video is collected by precisely one probe, and every video is collected by some probe, this relationship can be maintained by simply storing a reference to a probe object with each Videos entity; this technique is essentially the second translation approach from ER diagrams.
If we also make Videos a weak entity set in this alternative design, we can add a referential integrity constraint that causes a Videos entity to be deleted when the corresponding Probes entity is deleted. More generally, this alternative design illustrates a strong similarity between storing references to objects and foreign keys; the foreign key mechanism achieves the same effect as storing oids, but in a controlled manner. If oids are used, the user must ensure that there are no dangling references when an object is deleted, with very little support from the DBMS. A significant extension to the ER model is required to support the design of nested collections.
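As a minimal sketch of the weak entity design, the following Python fragment again uses the standard `sqlite3` module. The column names (`pid`, `vid`, `camera`) and the sample rows are assumptions; the point is only that a foreign key declared with ON DELETE CASCADE lets the DBMS remove dependent Videos rows itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to enforce the constraint
conn.execute("CREATE TABLE probes (pid INTEGER PRIMARY KEY, camera TEXT)")
# Videos as a weak entity set: its key includes the owning probe's key, and
# deleting the probe cascades to its videos.
conn.execute(
    "CREATE TABLE videos ("
    " vid INTEGER,"
    " pid INTEGER NOT NULL REFERENCES probes(pid) ON DELETE CASCADE,"
    " PRIMARY KEY (pid, vid))"
)
conn.execute("INSERT INTO probes VALUES (1, 'wide-angle')")
conn.executemany("INSERT INTO videos (vid, pid) VALUES (?, ?)", [(1, 1), (2, 1)])

videos_before = conn.execute("SELECT COUNT(*) FROM videos").fetchone()[0]
conn.execute("DELETE FROM probes WHERE pid = 1")
videos_after = conn.execute("SELECT COUNT(*) FROM videos").fetchone()[0]
```

Deleting the probe removes both of its videos without any application code, which is the controlled behavior that plain oid references do not provide.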
Using Nested Collections
Nested collections offer great modeling power, but also raise difficult design decisions. Consider the following way to model location sequences:
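A nested design of this kind might be pictured as follows. This Python sketch is not the schema notation used in the text; it assumes each probe carries a `pid`, a `camera` type, and a `locseq` list of (time, lat, long) rows, mirroring the structured Probes attribute described earlier, and all sample values are invented.

```python
from datetime import datetime

# Hypothetical nested layout: each probe stores its own location sequence.
probes = [
    {"pid": 1, "camera": "wide-angle", "locseq": [
        (datetime(2024, 1, 1, 10, 0), 5.0, 90.0),
        (datetime(2024, 1, 1, 9, 0), 6.0, 91.0),
    ]},
    {"pid": 2, "camera": "infrared", "locseq": [
        (datetime(2024, 1, 1, 8, 0), 5.0, 90.0),
    ]},
]

def earliest_per_probe(probes):
    """Per-probe query: only that probe's own nested list is examined."""
    return [
        (p["pid"], min(t for t, _, _ in p["locseq"]), p["camera"])
        for p in probes
    ]

def earliest_at(probes, lat, long):
    """Location query: every probe's nested list has to be walked."""
    times = [
        t
        for p in probes
        for t, la, lo in p["locseq"]
        if (la, lo) == (lat, long)
    ]
    return min(times) if times else None
```

In this layout a query about one probe touches a single nested list, while a query keyed on location has to visit the list of every probe.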
This is a good choice if the important queries in the workload require us to look at the location sequence for a particular probe, as in the query "For each probe, print the earliest time at which it recorded, and the camera type." On the other hand, consider a query that requires us to look at all location sequences: "Find the earliest time at which a recording exists for lat=5, long=90." This query can be answered more efficiently if the following schema is used: