We have introduced an analytical model for estimating
I/O interference in Map-Reduce. The model predicts
the performance scalability of a job, which helps in
making and analyzing scheduling decisions for a workload.
The model shown here is still a work in progress. We plan to
extend it to handle non-local I/O in a multi-node cluster,
including non-local map tasks, shuffling of data
from remote nodes, and replication of output data.
We intend for this model to be used to better understand
workload management decisions and to help optimize
resource usage for mixed workloads, which are common in
multi-tenant cloud systems.