Interactive Analysis of Web-Scale Datasets
Dremel is a scalable, interactive ad hoc query system for
analysis of read-only nested data. By combining multilevel
execution trees and columnar data layout, it is capable of
running aggregation queries over trillion-row tables in
seconds. The system scales to thousands of CPUs and petabytes
of data, and has thousands of users at Google. In this
paper, we describe the architecture and implementation
of Dremel, and explain how it complements MapReduce-based
computing. We present a novel columnar storage
representation for nested records and discuss experiments
on few-thousand node instances of the system.
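
To make the combination of a columnar layout and a multilevel execution tree concrete, the following is a minimal sketch in Python, not Dremel's actual implementation. The names `Shard`, `leaf_partial_sum`, and `tree_sum`, the `fanout` parameter, and the sample data are all hypothetical and chosen for illustration. Each shard stores one list per column, so a leaf reads only the column the query needs, and partial results are merged level by level as intermediate servers might do.

```python
from typing import Dict, List

# Hypothetical columnar shard: one list per column instead of one record per row.
Shard = Dict[str, List[int]]


def leaf_partial_sum(shard: Shard, column: str) -> int:
    """Leaf step: scan only the requested column of one shard."""
    return sum(shard[column])


def tree_sum(shards: List[Shard], column: str, fanout: int = 2) -> int:
    """Merge leaf partials level by level, up to `fanout` children per node."""
    if fanout < 2:
        raise ValueError("fanout must be at least 2")
    partials = [leaf_partial_sum(s, column) for s in shards]
    while len(partials) > 1:
        # One level of the tree: each node sums a group of child results.
        partials = [
            sum(partials[i:i + fanout])
            for i in range(0, len(partials), fanout)
        ]
    return partials[0] if partials else 0


# Illustrative data only; not taken from the paper.
shards = [
    {"clicks": [1, 2, 3], "bytes": [10, 20, 30]},
    {"clicks": [4, 5], "bytes": [40, 50]},
    {"clicks": [6], "bytes": [60]},
]
total_clicks = tree_sum(shards, "clicks")
```

Here the `bytes` column is never touched when summing `clicks`, which is the property a columnar layout is meant to provide, and the number of merge levels grows only logarithmically with the number of shards.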