Phase 2 – The next phase computes the total adRevenue and average pageRank based on the sourceIP of records generated in Phase
1. This phase uses a Reduce function in order to gather all of the records for a particular sourceIP on a single node. We use the identity Map function in the Hadoop API to supply records directly to
the split process [1, 8].