consistent policy of Bigtable which forces a log page to disk incurs
a fair bit of overhead.
(4) Cassandra uses a mechanism that creates N replicas of the
same object. Each key has a coordinator node, which is in charge of
replicating that key on N-1 other nodes. Cassandra provides several
options for data replication, including:
• Rack-unaware.
• Rack-aware.
• Datacenter-aware.
If replication is rack-unaware, the coordinator simply chooses
N-1 nodes from the ring. Otherwise, the system elects a leader
among the nodes, and the leader tells every joining node which
ranges it holds replicas for.
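The rack-unaware case can be illustrated with a minimal sketch. It
assumes that the N-1 additional nodes are the coordinator's successors
on the ring and that the coordinator is the first node whose token is
greater than or equal to the key's token. The function name, the ring
representation, and the integer tokens are hypothetical and are not
taken from Cassandra's code.

```python
import bisect


def rack_unaware_replicas(key_token, ring, n):
    """Return the n nodes that hold a key, coordinator first.

    ring is a list of (token, node_name) pairs sorted by token
    (a hypothetical representation chosen for this sketch).
    """
    if n > len(ring):
        raise ValueError("replication factor exceeds the number of nodes")
    tokens = [t for t, _ in ring]
    # Coordinator: first node whose token is >= the key's token,
    # wrapping around to the start of the ring if there is none.
    start = bisect.bisect_left(tokens, key_token) % len(ring)
    # Assumption: the other n-1 replicas are the next nodes clockwise.
    return [ring[(start + i) % len(ring)][1] for i in range(n)]


ring = [(10, "A"), (20, "B"), (30, "C"), (40, "D")]
replicas = rack_unaware_replicas(25, ring, 3)
```

With this ring, a key whose token is 25 is coordinated by node C, and
the walk wraps around the ring to pick D and A as the other replicas.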
(5) Hadoop DFS: The NameNode is in charge of ensuring that each
block always has the intended number of replicas. Every time a block
report from a DataNode arrives, the NameNode determines how
many replicas each block has. If a block is under-replicated, it is
inserted into the replication priority queue. If a block is
over-replicated, the NameNode removes one replica. If a replica
resides on one rack and another replica is scheduled to be created on
the same rack, the system finds a different rack for the new replica.
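The replica check described above can be sketched as follows. This is
a simplified illustration, not the NameNode's implementation: the
function name and data layout are hypothetical, the priority is
assumed to be the number of live replicas (fewest first), and the
replica removed from an over-replicated block is chosen arbitrarily.

```python
import heapq


def process_block_report(replica_locations, target, queue):
    """Classify blocks after a block report.

    replica_locations maps a block id to the list of DataNodes that
    hold a replica of it. Under-replicated blocks are pushed onto
    queue (a heap of (live_replicas, block_id) pairs). The return
    value lists (block_id, datanode) pairs whose replica should be
    removed.
    """
    removals = []
    for block_id, nodes in sorted(replica_locations.items()):
        live = len(nodes)
        if live < target:
            # Assumption: fewer live replicas means higher priority.
            heapq.heappush(queue, (live, block_id))
        elif live > target:
            # Remove one replica; the choice of node is arbitrary here.
            removals.append((block_id, nodes[-1]))
    return removals


locations = {
    "b1": ["dn1"],
    "b2": ["dn1", "dn2", "dn3"],
    "b3": ["dn1", "dn2", "dn3", "dn4"],
    "b4": ["dn2", "dn3"],
}
queue = []
removals = process_block_report(locations, target=3, queue=queue)
```

In this example block b1, which has a single replica, is the first to
leave the queue, followed by b4, while b3 loses one replica.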
(6) In Haystack, each photo is replicated across all physical
volumes of a store, which means the same photo is available on
as many servers as the store contains. In addition, the same photo
may be in the Cache, from which it can be retrieved quickly.