BASE does not guarantee full consistency at any given instant; it only
guarantees that consistency is reached eventually over the flow of the
operations.
(1) Dynamo operates with high availability at the cost of
weaker consistency. It is designed to be "Eventually Consistent",
where consistency is maintained through "Object Versioning". As
the system aims to provide an "always writable" store, conflicts
are resolved at read time using the versioning and
application-specific resolution protocols. Systems that must be
certain an answer is correct make the data unavailable
until that certainty is reached; Dynamo instead
relies on self-repair mechanisms in the key-value store to implement
eventual consistency in practice [18].
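As an illustration, the following Python sketch shows read-time reconciliation with object versioning, using toy vector clocks; the names (dominates, resolve_on_read, merge_carts) are illustrative stand-ins, not Dynamo's actual interfaces.

    def dominates(a, b):
        # True if vector clock a is causally at or after b.
        return all(a.get(k, 0) >= v for k, v in b.items())

    def resolve_on_read(versions, merge):
        # versions: list of (value, vector_clock) pairs returned by replicas.
        # merge: application-specific resolver for truly concurrent versions.
        survivors = [
            (val, clock) for val, clock in versions
            if not any(other is not clock and dominates(other, clock)
                       for _, other in versions)
        ]
        if len(survivors) == 1:       # a single causal winner remains
            return survivors[0]
        return merge(survivors)       # concurrent versions: ask the application

    # Two concurrent shopping-cart versions, merged by union at read time.
    v1 = ({"apple"}, {"A": 2, "B": 1})
    v2 = ({"pear"},  {"A": 1, "B": 2})

    def merge_carts(versions):
        items = set().union(*(val for val, _ in versions))
        clock = {}
        for _, c in versions:
            for k, n in c.items():
                clock[k] = max(clock.get(k, 0), n)
        return items, clock

    print(resolve_on_read([v1, v2], merge_carts))
    # ({'apple', 'pear'}, {'A': 2, 'B': 2})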
(2) GFS: Like Dynamo, GFS has an eventual consistency model.
It replays the operation log to recover files and uses chunk version
numbers to detect stale replicas. Chunkservers use checksums
to detect any additional data corruption caused by server failures.
In case of uncertainty, the data are made unavailable. Any successful
atomic_record_append operation guarantees consistency.
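The following sketch illustrates how chunk version numbers can expose stale replicas; the data structures are illustrative, not GFS's actual metadata layout.

    master_version = {"chunk-42": 7}          # master's authoritative versions

    replica_versions = {                      # versions reported by chunkservers
        "cs1": {"chunk-42": 7},
        "cs2": {"chunk-42": 6},               # missed the last mutation: stale
    }

    def stale_replicas(chunk):
        latest = master_version[chunk]
        return [cs for cs, seen in replica_versions.items()
                if seen.get(chunk, 0) < latest]

    print(stale_replicas("chunk-42"))         # ['cs2']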
(3) Bigtable has its own features to support consistency. Each
read or write is serializable, making it easy to maintain consistency
under concurrent updates to the same row, and each row is copy-on-write
to maintain row-level consistency.
Different cells in a table can contain multiple versions of the
same data, indexed by timestamp. SSTables contain the relatively
old updates, while an in-memory buffer called the memtable contains the recent
updates. Updates are recorded in a commit log so that the
memtable can be recovered after a failure. No synchronization of accesses is required when
reading from SSTables because SSTables are immutable.
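The read path implied above can be sketched as a timestamp-ordered merge of the mutable memtable with the immutable SSTables; the structures below are simplified stand-ins for Bigtable's actual formats.

    memtable = {("row1", "col:a"): [(105, "new")]}     # recent updates, mutable
    sstables = [                                       # older updates, immutable
        {("row1", "col:a"): [(100, "old")]},
    ]

    def read_cell(row, col, n_versions=1):
        versions = list(memtable.get((row, col), []))
        for sst in sstables:            # immutable: no locking needed to read
            versions.extend(sst.get((row, col), []))
        versions.sort(key=lambda tv: tv[0], reverse=True)  # newest first
        return versions[:n_versions]

    print(read_cell("row1", "col:a"))   # [(105, 'new')]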
(4) Cassandra uses a weak consistency model to maintain the
different replicas of an object. When a write request arrives at
a node, the system routes the request to the replicas and waits for
a quorum of replicas to acknowledge the completion of the write.
Read requests are handled according to the consistency guarantees required
by the client: the system can either route the request to the closest
replica or send the request to all replicas and wait until a quorum
of replicas respond.
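The quorum rule can be sketched as follows, assuming simple acknowledgement counting over N replicas; the Replica class is a toy stand-in for a Cassandra node.

    class Replica:
        def __init__(self, up=True):
            self.up, self.data = up, {}
        def apply(self, key, value):
            if self.up:
                self.data[key] = value
            return self.up              # ack only if the replica is reachable

    def quorum(n):
        return n // 2 + 1               # strict majority of the replicas

    def write(replicas, key, value):
        acks = sum(1 for r in replicas if r.apply(key, value))
        return acks >= quorum(len(replicas))   # wait for a quorum of acks

    replicas = [Replica(), Replica(), Replica(up=False)]   # one replica down
    print(write(replicas, "k", "v"))    # True: 2 of 3 acks is a quorum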
Cassandra can be set up with TTLs to make it completely soft state, but
this is an unusual mode of usage: the store essentially becomes a transient cache.
Soft state also applies to the gossip protocol within Cassandra.
New nodes can determine the state of the cluster from the gossip
messages they receive, and this cluster state must be constantly
refreshed to detect unresponsive nodes.
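The following sketch shows gossip-driven liveness tracking with a plain timeout; Cassandra itself uses a more sophisticated accrual failure detector, so this is only a simplified model of the idea.

    import time

    TIMEOUT = 10.0                      # seconds without gossip before suspecting
    last_heard = {}                     # peer -> time of last gossip message

    def on_gossip(peer):
        last_heard[peer] = time.time()  # refresh the soft cluster state

    def unresponsive(now=None):
        now = time.time() if now is None else now
        return [p for p, t in last_heard.items() if now - t > TIMEOUT]

    on_gossip("10.0.0.5")
    print(unresponsive())               # [] while gossip keeps arriving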
(5) In HDFS (the Hadoop DFS), applications create a new file by
writing data to it. Once the file is closed, the bytes cannot
be modified or removed, although new data can be appended by
reopening the file. HDFS implements a single-writer,
multiple-reader model. When a client opens a file for writing, it
is granted a lease for the file, and no other client can write to that file
while the lease is held. Contacting the NameNode permits the lease
to be extended. Once the lease expires and the file is closed, the
changes become available to the readers.
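The single-writer lease can be sketched as follows; the functions and lease table are illustrative, not HDFS's actual client or NameNode interfaces.

    import time

    LEASE_PERIOD = 60.0                 # seconds before a lease must be renewed
    leases = {}                         # path -> (writer, expiry time)

    def open_for_write(path, client, now=None):
        now = time.time() if now is None else now
        holder = leases.get(path)
        if holder and holder[0] != client and holder[1] > now:
            raise PermissionError("file is leased to another writer")
        leases[path] = (client, now + LEASE_PERIOD)    # grant (or renew) lease

    def renew_lease(path, client, now=None):
        open_for_write(path, client, now)  # periodic contact with the NameNode

    open_for_write("/logs/app.log", "client-1")
    renew_lease("/logs/app.log", "client-1")   # same writer: allowed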
In Hadoop, queues are allocated a fraction of the capacity
of the grid, in the sense that a certain capacity of resources is
at their disposal. All applications submitted to a queue
have access to the capacity allocated to that queue. Administrators
can configure soft limits and optional hard limits on the capacity
allocated to each queue.
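A simplified model of soft and hard queue limits, with capacities expressed as fractions of the grid, is sketched below; this is illustrative, not the actual scheduler configuration format.

    class Queue:
        def __init__(self, soft, hard):
            self.soft, self.hard = soft, hard   # fractions of grid capacity
            self.used = 0.0

        def can_allocate(self, share, grid_idle):
            new = self.used + share
            if new <= self.soft:                # within the guaranteed share
                return True
            return grid_idle and new <= self.hard  # borrow up to the hard cap

    q = Queue(soft=0.3, hard=0.5)
    print(q.can_allocate(0.2, grid_idle=False))  # True: under the soft limit
    print(q.can_allocate(0.4, grid_idle=True))   # True: borrows idle capacity
    print(q.can_allocate(0.6, grid_idle=True))   # False: exceeds the hard limit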
(6) Haystack: Although a needle (a photo) in Haystack is
stored on all physical volumes of a logical volume, all updates go
through the same logical volume and are applied to all the replicas
of the photo. A store machine receives requests to create, modify, and
delete photos, and these operations are handled by the same store
machine.
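This write path can be sketched as follows, assuming a fixed mapping from a logical volume to its physical volumes; the names and layout are illustrative, not Haystack's actual implementation.

    logical_volumes = {                 # logical volume -> its physical replicas
        "lv7": ["store1:/vol7", "store2:/vol7", "store3:/vol7"],
    }
    replicas = {p: {} for p in logical_volumes["lv7"]}

    def write_needle(lv, photo_id, data):
        # Every create/modify goes through the logical volume and is
        # applied to all of its physical replicas.
        for phys in logical_volumes[lv]:
            replicas[phys][photo_id] = data

    write_needle("lv7", "photo-123", b"jpeg bytes")
    print(all("photo-123" in r for r in replicas.values()))   # True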