HadoopDB is therefore a hybrid of the parallel DBMS and
Hadoop approaches to data analysis, achieving the performance
and efficiency of parallel databases, yet still yielding the scalability,
fault tolerance, and flexibility of MapReduce-based systems.
The ability of HadoopDB to directly incorporate Hadoop and
open source DBMS software (without code modification) makes
HadoopDB particularly flexible and extensible for performing data
analysis at the large scales expected of future workloads.