Because the MR model does not have an
inherent ability to join two or more disparate data sets, the MR program that implements the join task must be broken out into three
separate phases. All three phases are implemented together as a
single MR program in Hadoop, but no phase begins executing until the
previous phase is complete.
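The phase-by-phase structure can be illustrated with a minimal in-memory sketch. It is not Hadoop code and not the benchmark's actual implementation. The helper `run_phase`, the tags "L" and "R", and the work assigned to each phase (a reduce-side join, a per-owner sum, and a final maximum) are assumptions chosen only to show how a join is expressed by tagging records and how each phase consumes the completed output of the one before it.

```python
from collections import defaultdict


def run_phase(records, mapper, reducer):
    """Run one map/shuffle/reduce phase to completion and return its output."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)  # shuffle: group values by key
    output = []
    for key in sorted(groups):
        output.extend(reducer(key, groups[key]))
    return output


# Phase 1 (hypothetical): reduce-side join of two tagged data sets.
def join_map(record):
    source, key, payload = record
    yield key, (source, payload)


def join_reduce(key, values):
    lefts = [p for s, p in values if s == "L"]
    rights = [p for s, p in values if s == "R"]
    for left in lefts:
        for right in rights:
            yield (key, left, right)


# Phase 2 (hypothetical): sum the joined amounts per owner.
def sum_map(record):
    _key, owner, amount = record
    yield owner, amount


def sum_reduce(owner, amounts):
    yield (owner, sum(amounts))


# Phase 3 (hypothetical): send every total to one reducer and keep the largest.
def max_map(record):
    yield "all", record


def max_reduce(_key, totals):
    yield max(totals, key=lambda t: t[1])


def run_join_job(left, right):
    """Chain the three phases; each starts only after the previous one returns."""
    tagged = [("L", k, v) for k, v in left] + [("R", k, v) for k, v in right]
    joined = run_phase(tagged, join_map, join_reduce)
    totals = run_phase(joined, sum_map, sum_reduce)
    return run_phase(totals, max_map, max_reduce)
```

Because `run_phase` returns only after its reducers have finished, the sequential calls in `run_join_job` mirror the constraint that a phase cannot start until its predecessor is complete.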