Over the past few decades, improvements in CPU speed have
outpaced improvements in main memory and disk access
rates by orders of magnitude, enabling the use of data compression
techniques to improve the performance of database
systems. Previous work describes the benefits of compression
for numerical attributes, where data is stored in compressed
format on disk. Despite the abundance of string-valued
attributes in relational schemas, there is little work
on compression for string attributes in a database context.
Moreover, none of the previous work suitably addresses the
role of the query optimizer: During query execution, data is
either eagerly decompressed when it is read into main memory,
or data lazily stays compressed in main memory and is
decompressed on demand only.
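
To make the two strategies concrete, the following is a minimal sketch, not the method of any particular system. It assumes each string value is compressed individually with zlib as a stand-in codec; the names EagerColumn, LazyColumn, and compress_column are hypothetical and chosen for illustration only.

```python
import zlib


def compress_column(strings):
    """Compress each string value individually (zlib is only a stand-in codec)."""
    return [zlib.compress(s.encode("utf-8")) for s in strings]


class EagerColumn:
    """Eager strategy: decompress every value once, when the column is loaded."""

    def __init__(self, compressed_values):
        # All decompression cost is paid up front; memory holds plain strings.
        self.values = [zlib.decompress(v).decode("utf-8") for v in compressed_values]

    def get(self, i):
        return self.values[i]


class LazyColumn:
    """Lazy strategy: keep values compressed in memory, decompress on access."""

    def __init__(self, compressed_values):
        self.compressed = list(compressed_values)
        self.decompress_calls = 0  # counts how often decompression is paid

    def get(self, i):
        # Decompression cost is paid per access, and paid again on repeated access.
        self.decompress_calls += 1
        return zlib.decompress(self.compressed[i]).decode("utf-8")


stored = compress_column(["Ithaca", "Florham Park", "Ithaca"])
eager = EagerColumn(stored)
lazy = LazyColumn(stored)
print(eager.get(1), lazy.get(1), lazy.decompress_calls)
```

In this sketch the eager column pays for all three values at load time, while the lazy column pays only for the values a query actually touches but pays again on every repeated access. Weighing that trade-off per query is where a cost-based optimizer could come in.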