The most striking feature of the results for the load times for the
535MB/node data set shown in Figure 1 is the difference in performance of DBMS-X compared to Hadoop and Vertica. Although the initial LOAD command in the first phase was issued on each node in parallel, the data was actually loaded on each node sequentially. Thus,
as the total amount of data increased, the load times increased proportionately. This also explains why, for the 1TB/cluster
data set, the load times for DBMS-X do not decrease as less data
is stored per node. However, the compression and housekeeping on
DBMS-X can be done in parallel across nodes, and thus the execution time of the second phase of the loading process is cut in half
when twice as many nodes are used to store the 1TB of data.
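
The scaling argument above can be illustrated with a toy two-phase cost model. This is only a sketch under stated assumptions, not a description of how DBMS-X is implemented: the function name, the throughput values, and the node counts are hypothetical and are not measurements from the benchmark. Phase 1 is modeled as fully serialized across nodes, so its time depends on the total data volume; phase 2 is modeled as fully parallel, so its time depends only on the data held by each node.

```python
def two_phase_load_time(total_mb, nodes, load_mb_per_s, reorg_mb_per_s):
    """Toy model of a two-phase load (all rates are hypothetical).

    Phase 1 (LOAD): nodes are loaded one after another, so the elapsed
    time is proportional to the total amount of data in the cluster.
    Phase 2 (compression and housekeeping): nodes work in parallel, so
    the elapsed time is proportional to the data stored on one node.
    """
    per_node_mb = total_mb / nodes
    phase1_s = total_mb / load_mb_per_s       # serialized across nodes
    phase2_s = per_node_mb / reorg_mb_per_s   # parallel across nodes
    return phase1_s, phase2_s


# Assumed throughputs, chosen only for illustration.
LOAD_RATE = 20.0   # MB/s, hypothetical
REORG_RATE = 5.0   # MB/s, hypothetical

# Fixed total (1TB/cluster): phase 1 stays flat, phase 2 halves.
for n in (50, 100):
    p1, p2 = two_phase_load_time(1_000_000.0, n, LOAD_RATE, REORG_RATE)
    print(f"1TB total, {n} nodes: phase1={p1:.0f}s phase2={p2:.0f}s")

# Fixed per-node size (535MB/node): phase 1 grows with the node count.
for n in (10, 50):
    p1, p2 = two_phase_load_time(535.0 * n, n, LOAD_RATE, REORG_RATE)
    print(f"535MB/node, {n} nodes: phase1={p1:.0f}s phase2={p2:.0f}s")
```

In this model, doubling the node count for a fixed 1TB total leaves the first-phase time unchanged and halves the second-phase time, which matches the qualitative behavior described above; the absolute numbers it prints carry no meaning.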