Typically, data is stored using some logical clustering on one or
more columns. For example, entries in a website’s traffic log data
might be grouped by users’ physical locations, because logs are first
stored in data centers that have the best geographical proximity to
users. Within each data center, logs are append-only and are stored
in roughly chronological order. As a less obvious case, a news site’s
logs might contain news_id and timestamp columns that are
strongly correlated. For analytical queries, it is typical to apply
filter predicates or aggregations over such columns. For example,
a daily warehouse report might describe how different visitor segments
interact with the website; this type of query naturally applies
a predicate on timestamps and performs aggregations that are
grouped by geographical location. This pattern is even more common
in interactive data analysis, where drill-down operations are
performed frequently.
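
As a rough illustration of the query shape described above, the following sketch filters a small in-memory log on a timestamp range and aggregates hit counts per location. The row layout, the region names, and the function name `daily_report` are all hypothetical and chosen only for this example; a real warehouse would express the same thing as a filter plus a group-by over stored tables.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical log rows: (timestamp, region, visitor_segment).
# Rows are in chronological order, mimicking an append-only log.
LOGS = [
    (datetime(2024, 1, 1, 9, 0), "us-east", "new"),
    (datetime(2024, 1, 1, 17, 30), "eu-west", "returning"),
    (datetime(2024, 1, 1, 23, 59), "us-east", "returning"),
    (datetime(2024, 1, 2, 0, 5), "us-east", "new"),
    (datetime(2024, 1, 2, 8, 0), "eu-west", "new"),
]


def daily_report(logs, day_start, day_end):
    """Count hits per region for rows with day_start <= timestamp < day_end."""
    counts = defaultdict(int)
    for timestamp, region, _segment in logs:
        # Filter predicate on the timestamp column.
        if day_start <= timestamp < day_end:
            # Aggregation grouped by geographical location.
            counts[region] += 1
    return dict(counts)


report = daily_report(LOGS, datetime(2024, 1, 1), datetime(2024, 1, 2))
print(report)
```

Because the rows are already in chronological order, every row that satisfies the timestamp predicate sits in one contiguous run, which is the kind of clustering the paragraph refers to.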