While experimenting with a large Hadoop cluster, we found
that the network driver configuration is critical because
cluster nodes communicate using the TCP/IP protocol. We
found that enabling interrupt coalescing [30] and Jumbo
frames [14] improved application performance. Fewer interrupts allow the CPU to do more work between interrupts,
and Jumbo frames reduce the TCP/IP overheads. Jumbo
frames are especially suited for Hadoop applications because large chunks of data are carried over the network during the shuffle phase, and also when the DataNodes are delivering data to remote TaskTracker nodes.