locations. A single, isolated floating-point operation is not sped up by a pipeline.
The speedup is achieved when a vector of operands is presented to the ALU. The
control unit cycles the data through the ALU until the entire vector is processed.
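As a rough illustration of why a lone operation gains nothing while a vector of operands does, the following Python sketch counts cycles under simplifying assumptions that are not taken from the text: a pipeline of k stages, one cycle per stage, and no stalls. The function names and the stage count used in the usage note are hypothetical.

```python
def unpipelined_cycles(n: int, k: int) -> int:
    """Cycles for n operations when each one passes through all k stages
    before the next operation may start."""
    return n * k


def pipelined_cycles(n: int, k: int) -> int:
    """Cycles for n operations in a k-stage pipeline: k cycles until the
    first result appears, then one further result per cycle."""
    if n == 0:
        return 0
    return k + (n - 1)


def speedup(n: int, k: int) -> float:
    """Ratio of unpipelined to pipelined cycle counts (n must be >= 1)."""
    return unpipelined_cycles(n, k) / pipelined_cycles(n, k)
```

Under these assumptions with k = 4, a single operation takes 4 cycles either way (speedup 1.0), whereas a 64-element vector takes 67 cycles instead of 256; the speedup approaches k as the vector grows.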
The pipeline operation can be further enhanced if the vector elements are
available in registers rather than from main memory. This is in fact suggested by
Figure 17.16a. The elements of each vector operand are loaded as a block into a vector
register, which is simply a large bank of identical registers. The result is also
placed in a vector register. Thus, most operations involve only the use of registers,
and only the load and store operations at the beginning and end of a vector operation
require access to memory.
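The register-based scheme can be sketched with a toy model. This is a minimal illustration, not the organization of any particular machine: the class `VectorMachine`, the register names `V1` to `V3`, the vector length of 8, and the memory layout are all assumptions made for the example. The counter shows that memory is touched only by the block load and store, while the arithmetic works entirely on vector registers.

```python
VECTOR_LENGTH = 8  # hypothetical number of elements per vector register


class VectorMachine:
    """Toy model: main memory plus a set of named vector registers."""

    def __init__(self, memory):
        self.memory = list(memory)
        self.vregs = {}           # register name -> list of elements
        self.memory_accesses = 0  # element-level reads and writes to memory

    def vload(self, reg, addr, n=VECTOR_LENGTH):
        """Load n consecutive memory elements as a block into a vector register."""
        self.vregs[reg] = self.memory[addr:addr + n]
        self.memory_accesses += n

    def vadd(self, dst, a, b):
        """Element-wise add of two vector registers; no memory access."""
        self.vregs[dst] = [x + y for x, y in zip(self.vregs[a], self.vregs[b])]

    def vstore(self, reg, addr):
        """Store a vector register back to memory as a block."""
        data = self.vregs[reg]
        self.memory[addr:addr + len(data)] = data
        self.memory_accesses += len(data)


# Memory layout (assumed): A at 0..7, B at 8..15, C at 16..23.
m = VectorMachine(list(range(8)) + [10] * 8 + [0] * 8)
m.vload("V1", 0)          # A -> V1
m.vload("V2", 8)          # B -> V2
m.vadd("V3", "V1", "V2")  # V3 = V1 + V2, registers only
m.vstore("V3", 16)        # V3 -> C
```

In this model the 24 memory accesses come from the two loads and the one store; any further register-to-register operations on `V1`, `V2`, or `V3` would add none.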
The mechanism illustrated in Figure 17.17 could be referred to as pipelining
within an operation. That is, we have a single arithmetic operation (e.g., C = A + B)
that is to be applied to vector operands, and pipelining allows multiple vector elements
to be processed in parallel. This mechanism can be augmented with pipelining
across operations. In this latter case, there is a sequence of arithmetic vector operations,
and instruction pipelining is used to speed up processing. One approach to