8. CONCLUSION AND FUTURE WORK
Big Data analytics usually require a large amount of fast random-access memory and
computation. When a working set exceeds the collective DRAM capacity of a cluster and
spills to disk storage, the overall performance of the cluster falls sharply. A
natural solution is building a large enough cluster with enough collective DRAM to
accommodate the working set. However, such a cluster often becomes prohibitively
expensive, in terms of both capital and operational cost. It also becomes difficult to run
software that makes efficient use of the total computational capability of a large cluster. Flash
storage is an attractive alternative to DRAM in this regard, due to its fast random
access performance, low power consumption, and low cost per GB. However, flash
storage packaged as off-the-shelf SSDs suffers performance penalties because it must
remain backward compatible with older magnetic disk storage devices.