In this paper, we studied the use of compression to improve
database performance. We observed that compressing
string attributes is important for query performance. Because
string attributes are heterogeneous, any single compression
method is inferior to our Hierarchical Dictionary
Encoding, a comprehensive strategy that chooses the most
effective encoding level for each string attribute.
In addition, we observed that the placement of string
decompression in a query plan is crucial for query performance.
A traditional optimizer enhanced with a cost model
that accounts for both the I/O benefits of compression and the CPU
overhead of decompression does not necessarily
achieve good plans. (The Two-Step algorithm is an instantiation
of this approach.) We proposed two new query