Superscalar machines can attain the same performance as a machine with vector hardware.
Consider the operations performed when a vector machine executes a vector load chained into a
vector add, with one element loaded and added per cycle. For each element, the vector machine
in effect performs four operations: a load, a floating-point add, a fixed-point add to generate
the next load address, and a
compare and branch to check whether the last vector element has been loaded and added. A
superscalar machine that can issue a fixed-point operation, a floating-point operation, a load,
and a branch all in one cycle achieves the same effective parallelism.
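
The four operations can be made concrete with a scalar loop. The following is only a minimal sketch in C, not code from any particular machine or compiler: the function name load_and_add is hypothetical, and the vector add is simplified to a running sum so that each iteration contains exactly one load and one floating-point add. The comments mark which statement corresponds to each of the four operations.

```c
#include <stddef.h>

/* Hypothetical illustration: sum the n doubles starting at src.
 * Each loop iteration contains the four operations named in the text. */
double load_and_add(const double *src, size_t n)
{
    double sum = 0.0;
    const double *end = src + n;   /* address just past the last element */

    while (src != end) {           /* (4) compare and branch: last element done? */
        double x = *src;           /* (1) load */
        sum += x;                  /* (2) floating-point add */
        src++;                     /* (3) fixed-point add: next load address */
    }
    return sum;
}
```

If the hardware can issue all four of these in the same cycle, and the loop is scheduled so that operations from neighboring iterations overlap, the loop can in principle retire one element per cycle, which matches the chained vector case described above. Whether a given compiler and machine actually achieve this depends on scheduling and on load and add latencies, which this sketch does not model.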