Keywords
Performance tool; performance optimization; MPI; HPC.
1. INTRODUCTION
A majority of parallel applications executed on High Performance
Computing (HPC) clusters use an MPI¹ library
for communication. Implementations of MPI libraries have
parameters that may be set to optimize the performance of
a given application for a target architecture or for the characteristics
of the application. MPI library developers and
cluster system administrators try to set the default MPI parameters
to provide good performance for most applications.
However, for many applications the default parameters may
lead to performance degradation, particularly for large and
complex applications that often have long run times.² For
these cases, choosing different MPI parameters can produce
significant application performance enhancements. In practice,
however, most users treat MPI as a black box and run their
applications using one of the available MPI libraries configured
using the cluster’s default parameters.
There are several well-established tools that target optimization
of an application’s use of MPI. Often these tools
are effective when used by performance experts and in some
cases enable comprehensive optimization of MPI library performance.
However, most users do not have the deep knowledge
of MPI library component behaviors and hardware architectures
required for effective use of these tools. Moreover, few,
if any, of these tools provide recommendations for optimizing
parameters for a given combination of application, MPI library
implementation, and hardware architecture. For these
reasons most users do not use the existing tools and MPI
performance tuning is usually done only by performance experts.
Therefore, there is a need for an easy-to-use, low-overhead
tool that collects and interprets performance measurements
and recommends performance-effective MPI library
parameters that enable users to tune their MPI usage.
This paper presents such a tool, MPI Advisor: a simple-to-use
tool that requires no instrumentation of the application
and is fully executed from a single command line on
¹ http://www.mpi-forum.org/
² Recent studies done at the Texas Advanced Computing
Center have found that thousands of jobs have resource-use
patterns suggesting that different choices of MPI parameters
could substantially improve their performance [23].