Many real-world systems implement a streaming strategy of partitioning the input data into fixed-size segments that are processed by MapReduce platform
The disadvantage of this approach is that the latency is proportional to the length of the segment plus the overhead required to do the segmentation and initiate the processing jobs