in which a bulk load of new data is performed
periodically, followed by a relatively long period of ad-hoc
queries. Other read-mostly applications include customer
relationship management (CRM) systems, electronic
library card catalogs, and other ad-hoc inquiry systems. In
such environments, a column store architecture, in which
the values for each single column (or attribute) are stored
contiguously, should be more efficient than a row store. This efficiency
has been demonstrated in the warehouse marketplace by
products like Sybase IQ [FREN95, SYBA04], Addamark
[ADDA04], and KDB [KDB04]. In this paper, we discuss
the design of a column store called C-Store that includes a
number of novel features relative to existing systems.

With a column store architecture, a DBMS need only
read the values of columns required for processing a given
query, and can avoid bringing into memory irrelevant
attributes. In warehouse environments where typical
queries involve aggregates performed over large numbers
of data items, a column store has a sizeable performance
advantage. However, there are several other major
distinctions that can be drawn between an architecture that
is read-optimized and one that is write-optimized.
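
The read-side argument above can be illustrated with a minimal Python sketch. It is not C-Store code: the toy table, the column names, and the helper functions are all hypothetical, and counting "values read" is only a rough stand-in for the I/O a real DBMS would perform. The sketch sums a single attribute over a row-oriented layout and over a column-oriented layout, and compares how many values each layout touches.

```python
# Hypothetical toy table; names and values are illustrative, not from C-Store.
ROWS = [
    (1, "east", 120.0, "widget"),
    (2, "west", 75.5, "gadget"),
    (3, "east", 30.0, "widget"),
]
COLUMN_NAMES = ("order_id", "region", "amount", "product")


def to_column_store(rows, names):
    """Store each attribute's values contiguously, one list per column."""
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}


def sum_row_store(rows, names, column):
    """Sum one column; each full record is read to reach a single field."""
    idx = names.index(column)
    values_read = 0
    total = 0.0
    for row in rows:
        values_read += len(row)  # the whole record is brought into memory
        total += row[idx]
    return total, values_read


def sum_column_store(columns, column):
    """Sum one column; only that column's values are read."""
    values = columns[column]
    return sum(values), len(values)


columns = to_column_store(ROWS, COLUMN_NAMES)
row_total, row_reads = sum_row_store(ROWS, COLUMN_NAMES, "amount")
col_total, col_reads = sum_column_store(columns, "amount")
```

Both layouts produce the same aggregate, but in this toy model the row layout touches every attribute of every record, while the column layout touches only the values of the one column the query needs. The gap grows with the number of attributes that the query does not reference.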