Couchbase and Cassandra both show scaling at about half of linear
capacity throughout. Cassandra actually scales slightly better for
the balanced workload’’. According to this study ‘‘for heavy-read
workloads (95% read, 5% write), both Couchbase and MongoDB
scale close to linearly’’ [34].
The Column Family databases Cassandra and HBase show excellent
write performance, but their read performance is poor, since these
two products were optimized for writing, which results in a lot of
concurrent I/O when reading. Since Cassandra uses huge amounts of
memory, it has to perform a lot of disk I/O in read-heavy workloads,
which sharply reduces performance. Both Cassandra
and HBase perform better when executing updates.
MongoDB has structures similar to those of an RDBMS and shows great
flexibility in data modeling, especially for medium and small-sized
businesses [35]. MongoDB is the best fit for read-intensive applications.
7. Observation
The downsides of NoSQL
• There is no universal query language like SQL.
• Each NoSQL product does things differently.
• SQL is very powerful and expressive.
• Relational databases are very mature, with 40+ years of
development (since 1970), while NoSQL databases are only 6+ years old.
• Relational databases are part of a vast ‘‘Ecosystem’’ with many
applications and tools available.
NoSQL communities have picked up the trend of abandoning
relational properties in favor of high scalability by supporting only
Key-Value type accesses in their data stores. However, abandoning
SQL and its features has nothing to do with scalability. Another
significant development in advanced database technology is
‘‘Polyglot Persistence’’, i.e. using multiple data storage technologies
chosen according to how the data are used by individual applications.
The simple logic behind ‘‘Polyglot Persistence’’ is this: why store binary
images in a relational database when better storage systems exist
for them?
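The idea of ‘‘Polyglot Persistence’’ can be illustrated with a minimal Python sketch. It is not taken from any system discussed in this paper: the class names InMemoryBlobStore and ImageCatalog are hypothetical, the blob store is a plain dictionary standing in for a dedicated binary storage system, and SQLite stands in for the relational database. Queryable metadata goes to the relational store, while the image bytes go to the blob store.

```python
import hashlib
import sqlite3


class InMemoryBlobStore:
    """Hypothetical stand-in for a dedicated binary (object/file) store."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # Content-addressed key: the SHA-256 digest of the payload.
        key = hashlib.sha256(data).hexdigest()
        self._blobs[key] = data
        return key

    def get(self, key: str) -> bytes:
        return self._blobs[key]


class ImageCatalog:
    """Keeps metadata in a relational table and image bytes in the blob store."""

    def __init__(self, blob_store):
        self.blobs = blob_store
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE images (name TEXT PRIMARY KEY, owner TEXT, blob_key TEXT)"
        )

    def save(self, name, owner, data):
        # The binary payload goes to the blob store; only its key is stored in SQL.
        key = self.blobs.put(data)
        self.db.execute("INSERT INTO images VALUES (?, ?, ?)", (name, owner, key))
        self.db.commit()

    def load(self, name):
        row = self.db.execute(
            "SELECT blob_key FROM images WHERE name = ?", (name,)
        ).fetchone()
        return None if row is None else self.blobs.get(row[0])

    def names_by_owner(self, owner):
        # Relational queries stay cheap because rows carry no image bytes.
        rows = self.db.execute(
            "SELECT name FROM images WHERE owner = ? ORDER BY name", (owner,)
        )
        return [r[0] for r in rows]
```

In this sketch each store handles only the access pattern it suits: SQL answers queries over names and owners, and the blob store returns the bytes by key.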
The following are some of the issues to consider when selecting a
particular NoSQL database:
• Fit workload requirements to the best suited cloud database
system, considering the trade-off between read-optimized and
write-optimized systems.
• Latency versus Durability is another important axis. If developers
know that they can afford to lose a small fraction of writes, such as
web poll votes, they can acknowledge writes as successful without
waiting for them to be synced to disk. An application requiring
a large number of small writes may use ‘‘Redis’’.
• Auto-completion and caching: Redis, Memcached.
• Data mining and trending: MongoDB, Hadoop and Bigtable.
• Content-based web portals: MongoDB, Cassandra and sharded
ACID databases.
• Financial portals: ACID databases.
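The Latency versus Durability axis in the list above can be sketched as follows. This is an illustrative Python example under stated assumptions, not the mechanism of any particular product: AppendLog and record_votes are hypothetical names, and a local append-only file stands in for the database log. With sync_every_write set to False, a write is acknowledged as soon as it is buffered, so a crash could lose recent records; with True, every write waits for os.fsync before it is acknowledged.

```python
import os
import tempfile


class AppendLog:
    """Hypothetical append-only log with a configurable durability policy."""

    def __init__(self, path, sync_every_write):
        self._file = open(path, "ab")
        self.sync_every_write = sync_every_write
        self.fsync_calls = 0

    def write(self, record: bytes) -> bool:
        self._file.write(record + b"\n")
        if self.sync_every_write:
            # Durable mode: wait for the disk before acknowledging.
            self._sync()
        # Fast mode acknowledges here while the record is still only buffered.
        return True

    def _sync(self):
        self._file.flush()
        os.fsync(self._file.fileno())
        self.fsync_calls += 1

    def close(self):
        # One final sync so that a clean shutdown loses nothing.
        self._sync()
        self._file.close()


def record_votes(sync_every_write, votes):
    """Write all votes to a fresh log; return (fsync count, lines on disk)."""
    path = os.path.join(tempfile.mkdtemp(), "votes.log")
    log = AppendLog(path, sync_every_write)
    for vote in votes:
        log.write(vote)
    log.close()
    with open(path, "rb") as f:
        lines = f.read().splitlines()
    return log.fsync_calls, lines
```

For three votes, the fast mode issues a single sync at close, while the durable mode issues one per write plus the final one; the per-write sync is where the extra latency comes from.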
8. Conclusion
This paper presents an analytical study of the BASE properties of NoSQL
databases, with a focus on large-scale systems such as Dynamo,
Google File System (GFS), Bigtable and Hadoop. Different
techniques that are used to achieve Consistency and Availability
were analyzed. Based on these observations, some recommendations are
made for selecting a NoSQL database for particular purposes.