Lab 3: Dynamic Scheduling
The execution model on which this and all following labs are based is shown in Figure 3. The full model includes branch prediction, a reorder buffer, and five functional units, each with its own set of reservation stations. (The functional units consist of an integer unit, a load-store unit, and three floating-point units: an adder, a multiplier, and a divider.) The CPU employs dynamic scheduling and register renaming as found in Tomasulo's algorithm.
Each instruction spends at least one cycle in each of these four states: fetch, issue, execution, and result distribution via the common data bus (CDB). Two further states lie between issue and execution, and an instruction spends cycles in them only if it is delayed. The two states represent different reasons for waiting. In the first, the instruction waits in its reservation station for one or more of its source operands. In the second, the operands are available but the instruction waits for the functional unit or the CDB, because one of them is not free at that point. (Only one instruction can begin to execute on each functional unit in each cycle, and the CDB can distribute a fixed number of results each cycle. The model assumes that the CDB is reserved by every instruction before its execution begins.) Having separate states for the different causes of delay makes it easy to determine the extent to which each degrades performance.
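The split between the two wait states can be illustrated with a minimal Python sketch. It is not the lab simulator: the function names, the three per-instruction timestamps, and the sample trace are all hypothetical. It also assumes that an undelayed instruction begins execution in the cycle immediately after it is issued.

```python
def classify_wait_cycles(issue_cycle, operands_ready_cycle, exec_start_cycle):
    """Split the cycles between issue and execution into the two wait states.

    All three arguments are hypothetical per-instruction timestamps:
      issue_cycle          -- cycle in which the instruction is issued
      operands_ready_cycle -- first cycle in which every source operand is available
      exec_start_cycle     -- cycle in which execution begins
    Assumption: with no delay, exec_start_cycle == issue_cycle + 1.
    """
    waits = {"wait_operands": 0, "wait_resource": 0}
    # Every cycle strictly between issue and the start of execution is a stall.
    for cycle in range(issue_cycle + 1, exec_start_cycle):
        if cycle < operands_ready_cycle:
            # At least one source operand is still missing.
            waits["wait_operands"] += 1
        else:
            # Operands are present, so the functional unit or CDB is the bottleneck.
            waits["wait_resource"] += 1
    return waits


def total_wait_cycles(timings):
    """Sum both wait states over (issue, operands_ready, exec_start) tuples."""
    totals = {"wait_operands": 0, "wait_resource": 0}
    for issue, ready, start in timings:
        for state, count in classify_wait_cycles(issue, ready, start).items():
            totals[state] += count
    return totals


# Made-up trace of three instructions, for illustration only.
trace = [
    (2, 3, 3),  # no delay: executes in the cycle after issue
    (3, 6, 6),  # cycles 4 and 5 are spent waiting for operands
    (4, 5, 8),  # cycles 5, 6 and 7 are spent waiting for a unit or the CDB
]
print(total_wait_cycles(trace))
```

Keeping one counter per wait state is what lets the two causes of delay be compared directly: here operand waits and resource waits are reported as separate totals rather than as a single stall count.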
With their Lab 3 simulator, students investigate the sensitivity of performance to the number of reservation stations, the limitations of having a single CDB, and the overall improvement in performance relative to the simple pipeline in the previous lab.