Of course, many effective compilation techniques have been developed for highly parallel code.
Vectorizing compilers, parallelizing compilers, trace scheduling, loop unrolling, and loop
jamming can all increase the accessible parallelism within code. Finally, for moderately parallel
code, techniques like loop unrolling and trace scheduling can speed up non-vectorizable
applications when running on superscalar or superpipelined machines.
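
As a rough source-level illustration of two of the transformations named above, the sketch below shows loop unrolling and loop jamming written out by hand in C. The function names and the unroll factor of 4 are hypothetical choices made for this example; a compiler would normally apply such transformations internally and might choose a different factor or form. The sketch assumes integer data with no overflow (so the partial sums can be reassociated safely) and assumes that the two arrays passed to the jammed loop do not overlap.

```c
#include <stddef.h>

/* Baseline reduction: every add depends on the previous value of sum,
 * so the adds form one serial dependence chain. */
long sum_rolled(const int *a, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i];
    return sum;
}

/* Loop unrolled by 4 (factor chosen for illustration only).
 * The four partial sums are independent of one another, so a
 * superscalar or superpipelined machine can overlap the adds. */
long sum_unrolled4(const int *a, size_t n)
{
    long s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }

    long sum = s0 + s1 + s2 + s3;

    /* Remainder loop handles the last n % 4 elements. */
    for (; i < n; i++)
        sum += a[i];
    return sum;
}

/* Two separate loops over the same index range. */
void scale_then_offset_separate(int *x, int *y, size_t n)
{
    for (size_t i = 0; i < n; i++)
        x[i] = 2 * x[i];
    for (size_t i = 0; i < n; i++)
        y[i] = y[i] + 1;
}

/* Loop jamming: the two bodies are merged into one loop, which halves
 * the loop overhead and places two independent operations in the same
 * iteration. Assumes x and y do not overlap. */
void scale_and_offset_jammed(int *x, int *y, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        x[i] = 2 * x[i];
        y[i] = y[i] + 1;
    }
}
```

Both pairs of functions compute the same results; the transformed versions differ only in how much independent work each iteration exposes. For floating-point data, the reassociation in the unrolled reduction could change rounding, which is one reason a compiler may decline to perform it without explicit permission.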