Using $C_{cpu}$ and the corresponding number of assembly code instructions needed to multiply the matrices, $I_{assembly}$, the processor pipeline utilization or efficiency metric, cycles-per-instruction (CPI), can be evaluated as $\mathrm{CPI} = C_{cpu} / I_{assembly}$. CPI is an important processor metric for students analyzing their collected benchmark performance data across a range of matrix sizes and on different processors. From CPI they can determine and quantify the utilization of the processor's resources, and use their results to qualitatively infer hardware characteristics of the pipeline structure and the capabilities of individual execution resources. CPI values can be obtained as an output parameter by compiling and executing a benchmark using a sophisticated programming tool such as Intel's Parallel Studio XE [7], but the exercise of using measured benchmark data and a simple spreadsheet analysis to evaluate CPI appeared to reinforce architectural concepts in students' minds.
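The spreadsheet calculation described above can be sketched in a few lines of Python. This is only an illustration of the ratio of CPU cycles to assembly instructions. The function name, the dictionary layout, and all numeric values are hypothetical placeholders, not measurements from the benchmark.

```python
def cycles_per_instruction(c_cpu: float, i_assembly: float) -> float:
    """Return CPI as CPU cycles divided by assembly instructions executed."""
    if i_assembly <= 0:
        raise ValueError("instruction count must be positive")
    return c_cpu / i_assembly


# Placeholder values for illustration only (NOT data from the paper):
# matrix size N -> (CPU cycles, assembly instructions executed)
hypothetical_runs = {
    64: (3_000_000, 2_000_000),
    128: (30_000_000, 16_000_000),
}

# One CPI value per matrix size, as a spreadsheet column would hold.
cpi_by_size = {
    n: cycles_per_instruction(cycles, instructions)
    for n, (cycles, instructions) in hypothetical_runs.items()
}

for n, cpi in cpi_by_size.items():
    print(f"N={n}: CPI = {cpi:.3f}")
```

Tabulating CPI per matrix size in this way makes it easy to compare how the ratio changes with problem size and from one processor to another.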