locality by matching a TaskTracker to Map tasks that process data
local to it. It load-balances by ensuring all available TaskTrackers
are assigned tasks. TaskTrackers regularly update the JobTracker
with their status through heartbeat messages.
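
The locality and load-balancing goals described above can be illustrated with a small sketch. It is a toy model under stated assumptions, not Hadoop's scheduler: the function name `assign_map_tasks`, the `slots_per_tracker` parameter, and the two-pass strategy are all hypothetical, and the heartbeat protocol is omitted, so all TaskTrackers are treated as known up front.

```python
def assign_map_tasks(block_locations, trackers, slots_per_tracker=1):
    """Toy locality-aware assignment (illustrative only, not Hadoop code).

    block_locations: dict mapping a Map task id to the set of tracker
        names that hold that task's input data locally.
    trackers: list of available tracker names.
    Returns (assignment, pending): a dict of tracker -> list of task ids,
    and the list of tasks that did not fit into any free slot.
    """
    assignment = {tracker: [] for tracker in trackers}
    pending = list(block_locations)  # keeps the insertion order of the tasks

    # Pass 1 (locality): give each tracker tasks whose data is local to it.
    for tracker in trackers:
        for task in list(pending):
            if len(assignment[tracker]) >= slots_per_tracker:
                break
            if tracker in block_locations[task]:
                assignment[tracker].append(task)
                pending.remove(task)

    # Pass 2 (load balancing): fill remaining free slots with leftover
    # tasks, even though their data must then be read remotely.
    for tracker in trackers:
        while pending and len(assignment[tracker]) < slots_per_tracker:
            assignment[tracker].append(pending.pop(0))

    return assignment, pending


if __name__ == "__main__":
    locations = {"m0": {"tt1"}, "m1": {"tt1"}, "m2": {"tt2"}}
    print(assign_map_tasks(locations, ["tt1", "tt2", "tt3"]))
```

In this example two tasks have their data on `tt1`, which has only one slot, so the second of them is handed to the otherwise idle `tt3`. The sketch trades locality for utilization in exactly that case.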

The InputFormat library represents the interface between the
storage and processing layers. InputFormat implementations parse
text/binary files (or connect to arbitrary data sources) and transform
the data into key-value pairs that Map tasks can process. Hadoop
provides several InputFormat implementations, including one that
lets all tasks of a job in a given cluster access a single
JDBC-compliant database.
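
To make the role of an InputFormat concrete, the following sketch turns a byte stream into key-value pairs and feeds them to a Map function. It is a simplified illustration rather than Hadoop's Java API: `text_records` and `word_count_map` are hypothetical names, and the choice of (byte offset, line text) as the pair follows the convention of Hadoop's line-oriented text input.

```python
import io


def text_records(stream):
    """Yield (byte_offset, line) pairs from a binary stream.

    Toy stand-in for a line-oriented input format: the key is the byte
    offset at which the line starts, the value is the decoded line
    without its trailing line terminator.
    """
    offset = 0
    for raw in stream:
        yield offset, raw.rstrip(b"\r\n").decode("utf-8")
        offset += len(raw)


def word_count_map(key, value):
    """Example Map function: emit (word, 1) for every word in the line."""
    for word in value.split():
        yield word, 1


if __name__ == "__main__":
    data = io.BytesIO(b"a b\nc\n")
    for key, value in text_records(data):
        print(key, value, list(word_count_map(key, value)))
```

The Map function never sees files or byte streams, only the pairs. Under this assumption, switching the data source (for example, to rows fetched from a database) would mean replacing `text_records` while leaving `word_count_map` unchanged.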