5. Storage Management
Storage management in C-Store is the problem of allocating
segments to nodes in a grid system; C-Store will perform
this allocation automatically using a storage allocator. It
seems clear that all columns in a single segment of a
projection should be co-located. As noted above, join
indexes should be co-located with their “sender”
segments. Also, each WS segment will be co-located with
the RS segments that contain the same key range.
We are building an allocator that honors these
constraints. It will perform the initial allocation, as
well as reallocation when load becomes unbalanced. The
details of this software are beyond the scope of this paper.
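Although the allocator itself is not described here, the co-location constraints above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the StoredObject record, the convention that a WS segment carries the same segment number as the RS segment covering the same key range, the convention that a join index is tagged with its sender segment, and the greedy least-loaded placement policy. It is not the C-Store allocator.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredObject:
    """Hypothetical descriptor for one object the allocator must place."""
    kind: str        # "rs_column", "ws_column", or "join_index"
    projection: str  # owning projection (for a join index: the sender's projection)
    segment: int     # segment number; assumed shared by RS and WS for one key range
    name: str        # column name or join-index name
    size_mb: int


def colocation_groups(objects):
    """Group objects that the constraints require to share a node.

    Keying on (projection, segment) keeps together all columns of a segment,
    the join indexes tagged with that sender segment, and the WS segment
    that covers the same key range as the RS segment.
    """
    groups = defaultdict(list)
    for obj in objects:
        groups[(obj.projection, obj.segment)].append(obj)
    return dict(groups)


def allocate(objects, nodes):
    """Assumed policy: place the largest group first on the least-loaded node."""
    load = {node: 0 for node in nodes}
    placement = {}
    groups = colocation_groups(objects)

    def group_size(members):
        return sum(o.size_mb for o in members)

    # Sort by descending size; ties are broken by the group key for determinism.
    ordered = sorted(groups.items(), key=lambda kv: (-group_size(kv[1]), kv[0]))
    for key, members in ordered:
        target = min(nodes, key=lambda n: (load[n], n))
        placement[key] = target
        load[target] += group_size(members)
    return placement, load
```

Re-running allocate with current sizes is one simple way to model reallocation when load becomes unbalanced; a real allocator would also have to weigh the cost of moving data between nodes.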
Since everything is a column, storage is simply the
persistence of a collection of columns. Our analysis
shows that a raw device offers little benefit relative to
today’s file systems. Hence, large columns (megabytes in
size) are stored as individual files in the file system of
the underlying operating system.
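As a rough illustration of the one-file-per-column idea, the sketch below writes each column of a segment to its own file and reads it back. The directory layout, the ".col" suffix, and the fixed-width little-endian 64-bit integer encoding are assumptions chosen to keep the example short; they are not the on-disk format used by C-Store.

```python
import struct
from pathlib import Path


def column_path(root, projection, segment, column):
    """Hypothetical layout: <root>/<projection>/seg<N>/<column>.col"""
    return Path(root) / projection / f"seg{segment}" / f"{column}.col"


def write_column(root, projection, segment, column, values):
    """Persist one column of integers as a single file."""
    path = column_path(root, projection, segment, column)
    path.parent.mkdir(parents=True, exist_ok=True)
    # Assumed encoding: each value is a little-endian signed 64-bit integer.
    path.write_bytes(struct.pack(f"<{len(values)}q", *values))
    return path


def read_column(root, projection, segment, column):
    """Load one column back into a list of integers."""
    data = column_path(root, projection, segment, column).read_bytes()
    return list(struct.unpack(f"<{len(data) // 8}q", data))
```

Because each column lives in its own file, a query that touches only some columns of a projection opens only those files.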
6. Updates and Transactions
An insert is represented as a collection of new objects
in WS, one per column per projection, plus the sort key
data structure. All inserts corresponding to a single
logical record have the same storage key. The storage
key is allocated at the site where the update is received.
To prevent C-Store nodes from needing to synchronize