The results of the aggregation task experiment in Figures 7 and 8 show once again that the two DBMSs outperform Hadoop. The DBMSs execute these queries by having each
node scan its local table, extract the sourceIP and adRevenue fields,
and perform a local group by. These local groups are then merged at the query coordinator, which outputs results to the user. The results
in Figure 7 illustrate that the two DBMSs perform about the same
for a large number of groups, as their runtime is dominated by the
cost to transmit the large number of local groups and merge them
at the coordinator. For the experiments using fewer nodes, Vertica
performs somewhat better because it can directly access the sourceIP
and adRevenue columns and therefore has to read less data, but it
becomes slightly slower as more nodes are used.
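
The two-phase execution described above (a local group by on each node, followed by a merge at the query coordinator) can be illustrated with a minimal Python sketch. This is not the code of either DBMS; the function names `local_group_by` and `merge_at_coordinator`, the in-memory row format, and the sample data are all assumptions made for illustration, and the aggregate is assumed to be a sum of adRevenue per sourceIP.

```python
from collections import defaultdict


def local_group_by(rows):
    """Per-node phase (hypothetical helper): scan the local table,
    extract sourceIP and adRevenue, and sum revenue per sourceIP."""
    partial = defaultdict(float)
    for row in rows:
        partial[row["sourceIP"]] += row["adRevenue"]
    return dict(partial)


def merge_at_coordinator(partials):
    """Coordinator phase (hypothetical helper): merge the local groups
    received from every node into the final result."""
    totals = defaultdict(float)
    for partial in partials:
        for source_ip, revenue in partial.items():
            totals[source_ip] += revenue
    return dict(totals)


# Made-up sample data: one list of rows per node.
node_tables = [
    [
        {"sourceIP": "10.0.0.1", "adRevenue": 1.5},
        {"sourceIP": "10.0.0.2", "adRevenue": 2.0},
        {"sourceIP": "10.0.0.1", "adRevenue": 0.5},
    ],
    [
        {"sourceIP": "10.0.0.2", "adRevenue": 1.0},
        {"sourceIP": "10.0.0.3", "adRevenue": 4.0},
    ],
]

partials = [local_group_by(table) for table in node_tables]

# Number of local groups shipped to the coordinator; with many distinct
# groups, this transfer and the merge dominate the runtime.
groups_sent = sum(len(partial) for partial in partials)

result = merge_at_coordinator(partials)
```

In this sketch the coordinator receives one entry per distinct sourceIP per node, so the amount of data transmitted and merged grows with both the number of groups and the number of nodes, consistent with the behavior reported above for a large number of groups.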