Figure 2 contains the assembly code listing for line 52 of the matrix multiplication C++ code, compiled using the GNU GCC compiler at its full optimization setting. It shows that the line translates into eight (8) lines of assembly, or machine code. The total number of assembly instructions, $I_{assembly}$, that would be executed to compute the matrix multiplication is then eight (8) times the number of execution iterations of line 52, or $I_{assembly} = 8N^3$, where $N$ is the matrix dimension. Using system time stamp calls in the C++ program, the total elapsed CPU execution time can be captured and recorded. With this execution time measurement and knowledge of the processor's clock frequency, the total number of CPU clock cycles that occurred during execution, $C_{CPU}$, can be determined by dividing the measured execution time by the processor's clock period in seconds (equivalently, multiplying it by the clock frequency) [6]. Using $C_{CPU}$ and the corresponding number of assembly code instructions needed to multiply the matrices, $I_{assembly}$, the processor pipeline utilization or efficiency metric, cycles per instruction (CPI), can be evaluated as $CPI = C_{CPU} / I_{assembly}$.
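
The measurement procedure described above can be sketched in C++ roughly as follows. This is a minimal illustration, not the author's program: the function names (`multiply`, `instruction_count`, `cpu_cycles`, `cycles_per_instruction`, `measure_cpi`) are hypothetical, the clock frequency is assumed to be supplied by the caller, and `std::chrono::steady_clock` (wall-clock time) stands in for whatever time stamp call the original code uses. The figure of eight instructions per iteration is taken from the text.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Naive N x N multiply on row-major matrices. The innermost statement plays
// the role of "line 52" in the text (the author's actual code is not shown).
void multiply(const std::vector<double>& a, const std::vector<double>& b,
              std::vector<double>& c, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            double sum = 0.0;
            for (std::size_t k = 0; k < n; ++k) {
                sum += a[i * n + k] * b[k * n + j];  // executed N^3 times
            }
            c[i * n + j] = sum;
        }
    }
}

// I_assembly = instr_per_iter * N^3 (8 instructions per iteration in the text).
double instruction_count(std::size_t n, double instr_per_iter = 8.0) {
    const double nd = static_cast<double>(n);
    return instr_per_iter * nd * nd * nd;
}

// C_CPU = elapsed time [s] * clock frequency [Hz] = elapsed time / clock period.
double cpu_cycles(double elapsed_seconds, double clock_hz) {
    return elapsed_seconds * clock_hz;
}

// CPI = C_CPU / I_assembly.
double cycles_per_instruction(double cycles, double instructions) {
    assert(instructions > 0.0);
    return cycles / instructions;
}

// Times one multiplication and returns the estimated CPI.
// clock_hz is an assumption supplied by the caller, not measured here.
double measure_cpi(std::size_t n, double clock_hz) {
    std::vector<double> a(n * n, 1.0), b(n * n, 2.0), c(n * n, 0.0);

    const auto start = std::chrono::steady_clock::now();
    multiply(a, b, c, n);
    const auto stop = std::chrono::steady_clock::now();
    const double elapsed = std::chrono::duration<double>(stop - start).count();

    // Read the result through a volatile to discourage the optimizer from
    // discarding the multiplication.
    volatile double sink = c[0];
    (void)sink;

    return cycles_per_instruction(cpu_cycles(elapsed, clock_hz),
                                  instruction_count(n));
}
```

For example, calling `measure_cpi(512, 3.0e9)` would estimate the CPI of a 512 x 512 multiplication on a processor assumed to run at 3.0 GHz; both values are illustrative only. For small matrices the timer resolution dominates, so the estimate is only meaningful when the multiplication runs long enough to be timed reliably.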