V. AUTOMATED BENCHMARK TESTING ENVIRONMENT
To streamline testing and provide a consistent software environment across a variety of computing platforms and architectures, students also developed programs to control and automate the benchmarking process.
Because the resolution of the system timer provided by the processor and operating system software was low, the calculations were repeated in an outer accuracy enhancement loop and the total time was divided by the loop count, which improved the accuracy of the measured execution time for small matrix dimensions. To manage benchmark data collection, a Python control program was written. It first performs an initialization step and determines, based on the clock frequency, how many times to repeat the outer accuracy loop. Using this information, the program modifies the C++ source file with the appropriate accuracy enhancement loop count and matrix size, invokes the compiler, and starts the newly compiled binary in a new process. When the computation completes, the binary writes its timing information to standard output and terminates. The Python program captures the timing information from the binary's standard output, reconfigures the source file with a larger matrix size, and repeats the process. In this way the program automatically runs the matrix multiplication benchmark for matrix sizes from 4 to 64 in powers of 2 (4, 8, 16, 32, and 64), using the GCC compiler optimization levels -O0 and -O3. When the entire test process is complete, the program writes all timing results, along with the corresponding matrix sizes and loop counts, to a comma-delimited file ready for analysis.
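The control flow described above can be illustrated with a minimal Python sketch. It is not the students' actual program. The macro names MATRIX_SIZE and LOOP_COUNT, the estimate of 2n^3 operations per multiplication, the one-second target run time, the file names, and the assumption that the binary prints a single total time in seconds are all hypothetical choices made for illustration.

```python
import csv
import re
import subprocess


def matrix_sizes(start=4, stop=64):
    """Return the powers of two from start to stop inclusive."""
    sizes = []
    n = start
    while n <= stop:
        sizes.append(n)
        n *= 2
    return sizes


def loop_count(n, clock_hz, target_seconds=1.0):
    """Estimate outer-loop repetitions so one run lasts about target_seconds.

    Assumption: an n x n multiply costs about 2*n**3 operations and the
    processor retires roughly one operation per clock cycle.
    """
    seconds_per_multiply = (2 * n ** 3) / clock_hz
    return max(1, round(target_seconds / seconds_per_multiply))


def configure_source(text, n, loops):
    """Rewrite the (hypothetical) size and loop-count macros in the C++ source."""
    text = re.sub(r"^#define\s+MATRIX_SIZE\s+\d+",
                  f"#define MATRIX_SIZE {n}", text, flags=re.M)
    text = re.sub(r"^#define\s+LOOP_COUNT\s+\d+",
                  f"#define LOOP_COUNT {loops}", text, flags=re.M)
    return text


def per_iteration_time(total_seconds, loops):
    """Divide the total time by the loop count to get the time per multiply."""
    return total_seconds / loops


def run_benchmarks(src_path, csv_path, clock_hz, opt_levels=("-O0", "-O3")):
    """Configure, compile, run, and record each benchmark configuration."""
    with open(src_path) as f:
        original = f.read()
    with open(csv_path, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(["opt_level", "matrix_size", "loop_count", "seconds"])
        for opt in opt_levels:
            for n in matrix_sizes():
                loops = loop_count(n, clock_hz)
                with open(src_path, "w") as f:
                    f.write(configure_source(original, n, loops))
                # Compile, then run the new binary in a separate process.
                subprocess.run(["g++", opt, src_path, "-o", "./bench"], check=True)
                result = subprocess.run(["./bench"], capture_output=True,
                                        text=True, check=True)
                # Assumption: the binary prints only the total time in seconds.
                total = float(result.stdout.strip())
                writer.writerow([opt, n, loops, per_iteration_time(total, loops)])
```

A call such as run_benchmarks("matmul.cpp", "results.csv", clock_hz=2.0e9) would produce one row per optimization level and matrix size. Dividing by the loop count inside the control program, rather than inside the binary, is a design choice of this sketch only.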