The matrix multiplication algorithm involves a large number of basic arithmetic operations, even for relatively small matrix dimensions. There are therefore enough arithmetic operations to fill the pipeline and keep it operating near 100% utilization. In this case, the CPI should be unity (1), since one instruction is completed and retired every processor clock cycle. This holds only so long as each functional element in the pipeline can complete its assigned task in a single clock cycle.
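
To make the argument concrete, the following minimal sketch counts the arithmetic operations in a naive n x n matrix multiplication and estimates the CPI (cycles per instruction) of an idealized pipeline. It rests on several simplifying assumptions that are not from the text: each arithmetic operation is treated as one instruction, loads, stores, and loop overhead are ignored, there are no stalls, and the five-stage pipeline depth is purely illustrative. The function names `matmul_op_count` and `ideal_pipeline_cpi` are hypothetical.

```python
def matmul_op_count(n: int) -> int:
    """Arithmetic operations in a naive n x n matrix multiplication.

    Each of the n * n output elements needs n multiplications
    and n - 1 additions.
    """
    multiplies = n * n * n
    additions = n * n * (n - 1)
    return multiplies + additions


def ideal_pipeline_cpi(instructions: int, stages: int) -> float:
    """CPI of an idealized pipeline (assumed: no stalls, one cycle per stage)."""
    # The first instruction takes `stages` cycles to retire; every later
    # instruction retires one cycle after its predecessor.
    cycles = stages + instructions - 1
    return cycles / instructions


if __name__ == "__main__":
    # Assumed 5-stage pipeline; the depth is illustrative only.
    for n in (4, 16, 64):
        ops = matmul_op_count(n)
        cpi = ideal_pipeline_cpi(ops, stages=5)
        print(f"n={n:3d}  operations={ops:7d}  CPI={cpi:.6f}")
```

Under these assumptions, even a 4 x 4 multiplication yields 112 arithmetic operations, and the pipeline fill cost is amortized quickly, so the computed CPI approaches 1 as the matrix dimension grows.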