ABSTRACT
A majority of parallel applications executed on HPC clusters
use MPI for communication between processes. Most users
treat MPI as a black box, executing their programs using the
cluster’s default settings. While the default settings perform
adequately for many cases, it is well known that optimizing
the MPI environment can significantly improve application
performance. Although the existing optimization tools are
effective when used by performance experts, they require
deep knowledge of MPI library behavior and the underlying
hardware architecture on which the application will be executed.
Therefore, an easy-to-use tool that provides recommendations
for configuring the MPI environment to optimize
application performance is highly desirable. This paper addresses
this need by presenting an easy-to-use methodology
and tool, named MPI Advisor, that requires just a single execution
of the input application to characterize its predominant
communication behavior and determine the MPI configuration
that may enhance its performance on the target
combination of MPI library and hardware architecture. Currently,
MPI Advisor provides recommendations that address
the four most commonly occurring MPI-related performance
bottlenecks, which are related to the choice of: 1) point-to-point
protocol (eager vs. rendezvous), 2) collective communication
algorithm, 3) MPI tasks-to-cores mapping, and 4)
InfiniBand transport protocol. The performance gains obtained
by implementing the recommended optimizations in
the case studies presented in this paper range from a few percent
to more than 40%. Specifically, using this tool, we were
able to improve the performance of HPCG with MVAPICH2
on four nodes of the Stampede cluster from 6.9 GFLOP/s to
10.1 GFLOP/s. Since the tool provides application-specific
recommendations, it also informs the user about correct usage
of MPI.