The point of a cluster is
to solve large computing problems on cheap commodity machines, or nodes,
that are built from standard parts (processor, memory, disk), rather than on
a supercomputer with specialized hardware. Although hundreds or thousands
of machines are available in such clusters, individual machines can
fail at any time. One requirement for robust distributed indexing is, therefore,
that we divide the work up into chunks that we can easily assign and,
in case of failure, reassign. A master node directs the process of assigning
and reassigning tasks to individual worker nodes.
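
The assign-and-reassign idea can be illustrated with a minimal, single-process sketch. It is not the implementation of any particular system: the class `Master`, the callback `process`, the worker names, and the convention that a failed worker raises `RuntimeError` are all assumptions made for this example. A real master would detect failures over the network (for instance by timeouts) rather than by catching an exception.

```python
from collections import deque


class Master:
    """Hypothetical master node: tracks chunks and hands them to workers."""

    def __init__(self, chunks, workers):
        self.pending = deque(chunks)   # chunks not yet completed
        self.workers = list(workers)   # workers believed to be alive
        self.done = {}                 # chunk -> worker that completed it

    def run(self, process):
        """Assign chunks round-robin; reassign a chunk if its worker fails."""
        turn = 0
        while self.pending:
            if not self.workers:
                raise RuntimeError("no live workers left")
            chunk = self.pending.popleft()
            worker = self.workers[turn % len(self.workers)]
            try:
                process(worker, chunk)
            except RuntimeError:
                # Assumed failure signal: drop the worker and requeue the chunk
                # so that it is later reassigned to a surviving worker.
                self.workers.remove(worker)
                self.pending.append(chunk)
                continue
            self.done[chunk] = worker
            turn += 1
        return self.done


def flaky_process(worker, chunk):
    """Stand-in for real work; worker "w2" is assumed to fail on every chunk."""
    if worker == "w2":
        raise RuntimeError(f"{worker} failed on {chunk}")


master = Master(chunks=["c1", "c2", "c3", "c4"], workers=["w1", "w2", "w3"])
result = master.run(flaky_process)
```

In this run the chunk first given to `w2` goes back into the queue and is completed by another worker, so every chunk ends up in `result` even though one machine failed. Because the chunks are small and independent, losing a worker costs only the chunk it was processing.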