RCMP is efficient. It recomputes only the minimum number of tasks necessary for each recomputed job. To this end, RCMP persists across jobs the mapper outputs as well as the reducer outputs that belong to successfully completed intermediate jobs. On failures that cause data loss, RCMP decides which jobs must be recomputed and, based on the persisted data, also determines the minimum number of tasks to recompute for each of those jobs. RCMP’s capability to maximize data reuse is shared by previous work in programming languages [20], [17] and cloud computing (Nectar [14], RDD [27]).
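To illustrate how persisted outputs can bound the amount of recomputation, the following is a minimal sketch and not RCMP's actual implementation. It assumes a simplified model in which each persisted mapper or reducer output resides on exactly one node, and in which the inputs of any mapper that has to be rerun are still available. The names `JobState` and `tasks_to_recompute` are hypothetical and introduced only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobState:
    """Hypothetical record of where one completed job's persisted outputs live."""
    map_outputs: dict     # mapper id -> node holding its persisted output
    reduce_outputs: dict  # reducer id -> node holding its persisted output


def tasks_to_recompute(job, failed_nodes):
    """Return (mappers, reducers) to rerun after losing the nodes in failed_nodes.

    Assumption: a reducer is rerun only if its output was lost, and a mapper
    is rerun only if its persisted output was lost AND some reducer needs it.
    """
    lost_reducers = sorted(
        r for r, node in job.reduce_outputs.items() if node in failed_nodes
    )
    if not lost_reducers:
        # No reducer output was lost, so the job needs no recomputation,
        # even if some persisted mapper outputs are gone.
        return [], []
    lost_mappers = sorted(
        m for m, node in job.map_outputs.items() if node in failed_nodes
    )
    # Surviving mapper outputs are reused; only the lost ones are regenerated.
    return lost_mappers, lost_reducers


job = JobState(
    map_outputs={"m0": "n1", "m1": "n2", "m2": "n3"},
    reduce_outputs={"r0": "n1", "r1": "n2"},
)
print(tasks_to_recompute(job, failed_nodes={"n2"}))  # (['m1'], ['r1'])
```

In this toy model, losing node `n2` triggers one mapper and one reducer instead of all five tasks, while losing `n3` triggers no recomputation at all because every reducer output survives.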