HDFS is designed for batch processing rather than interactive use by
users [16,12].
In HDFS applications, files are written once and
read many times [16,18]; this write-once-read-many model simplifies
data coherency and enables high-throughput data access [16].
In
HDFS, file system metadata are stored on a dedicated server,
the NameNode, while application data are stored on other servers called
DataNodes. Besides processing large datasets, HDFS has
several other goals, chief among them the ability to detect and handle failures
at the application layer.
This objective is realized through
a well-organized replication mechanism in which files are
divided into blocks.
Each block is replicated on a number of
DataNodes, placed so that the replicas of a block
do not all reside in the same rack.
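The replication behavior described above is configurable per cluster. As a minimal sketch (the property names below are standard Hadoop configuration keys, but the values are illustrative and not taken from the source), the replication factor and block size can be set in hdfs-site.xml:

```xml
<!-- hdfs-site.xml: illustrative values, not prescribed by the source -->
<configuration>
  <!-- Number of replicas kept for each block -->
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <!-- Block size in bytes (here 128 MB) into which files are divided -->
  <property>
    <name>dfs.blocksize</name>
    <value>134217728</value>
  </property>
</configuration>
```

Rack-aware placement of replicas additionally relies on the cluster administrator supplying a topology mapping (e.g., via the `net.topology.script.file.name` property), which lets the NameNode avoid placing all replicas of a block in a single rack.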