Threads can be in the “Wait” state (as opposed to “Ready”) because of an
ITLB miss, an ICACHE miss, or a full Instruction Buffer. The
“Least-Recently-Fetched” algorithm selects one of the “Ready” threads,
and the next instruction is fetched for that thread.
Fig. 7 shows the Integer/Load/Store pipeline and illustrates
how different threads can occupy different pipeline stages in
a given cycle. In other words, threads are interleaved between
pipeline stages with very few restrictions. The Load/Store
and Floating Point units are shared between all eight threads.
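
The “Least-Recently-Fetched” selection described above can be illustrated with a minimal software sketch. It is a model for intuition only, not the hardware implementation: the names `Thread`, `select_least_recently_fetched`, and `simulate` are hypothetical, the per-thread cycle stamp is an assumed bookkeeping device, and breaking ties by lowest thread ID is an assumption that the text does not specify.

```python
from dataclasses import dataclass
from typing import List, Optional

READY = "Ready"
WAIT = "Wait"


@dataclass
class Thread:
    tid: int
    state: str = READY
    # Cycle in which this thread last had an instruction fetched.
    # -1 means "never fetched" (assumed bookkeeping, not from the text).
    last_fetch_cycle: int = -1


def select_least_recently_fetched(threads: List[Thread]) -> Optional[Thread]:
    """Return the Ready thread whose last fetch is oldest, or None if none is Ready."""
    ready = [t for t in threads if t.state == READY]
    if not ready:
        return None
    # Oldest fetch first; ties broken by lowest thread ID (an assumption).
    return min(ready, key=lambda t: (t.last_fetch_cycle, t.tid))


def simulate(threads: List[Thread], cycles: int) -> List[Optional[int]]:
    """Fetch for one thread per cycle and return the sequence of chosen thread IDs."""
    order: List[Optional[int]] = []
    for cycle in range(cycles):
        chosen = select_least_recently_fetched(threads)
        if chosen is None:
            order.append(None)  # no Ready thread: nothing is fetched this cycle
            continue
        chosen.last_fetch_cycle = cycle
        order.append(chosen.tid)
    return order


if __name__ == "__main__":
    threads = [Thread(tid=i) for i in range(8)]
    threads[2].state = WAIT  # e.g. thread 2 is stalled on an ICACHE miss
    print(simulate(threads, 8))
```

In this model a thread in the “Wait” state is simply skipped, so the remaining “Ready” threads are served in rotation until the stalled thread becomes “Ready” again, at which point it has the oldest fetch stamp and is selected next.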