Recall from Section 4 that increasing the number of admitted
streams increases receive-side mean packet delay, since a large
number of streams per independent stack means that the fast-path
demultiplexing optimizations are rarely realized. By associating
only a small number of streams with each independent stack,
stream-scaled CLP should lower per-packet processing times. In
addition, the fundamental concurrency restriction under CLP is sequential
processing on individual stacks, so increasing the number
of stacks raises the level of concurrency, which should decrease
packet queueing delay.
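
The two expected effects can be illustrated with a toy queueing model.
This is only a sketch under assumptions that are not part of the
measured system: each stack is treated as an independent M/M/1 queue,
streams are spread evenly over stacks, the demultiplexing fast path is
modeled as a one-entry cache that hits with probability 1/n when n
streams are interleaved uniformly on a stack, and the function name,
parameters, and timing values are hypothetical.

```python
def mean_packet_delay(streams, stacks, rate_per_stream, t_fast, t_slow):
    """Mean per-packet delay (queueing + processing) on one stack.

    Toy model, not the measured system:
      - streams are divided evenly among stacks
      - the fast path is hit with probability 1/n for n streams per stack
        (assumed one-entry demultiplexing cache, uniform interleaving)
      - each stack is approximated as an M/M/1 queue
    Times are in seconds, rates in packets per second.
    """
    per_stack = streams / stacks                 # streams per stack
    hit = min(1.0, 1.0 / per_stack)              # assumed fast-path hit ratio
    service = hit * t_fast + (1.0 - hit) * t_slow  # mean processing time
    arrival = per_stack * rate_per_stream        # packet rate seen by one stack
    rho = arrival * service                      # utilization of one stack
    if rho >= 1.0:
        return float("inf")                      # stack is overloaded
    queueing = rho * service / (1.0 - rho)       # M/M/1 mean wait in queue
    return queueing + service


if __name__ == "__main__":
    # Hypothetical workload: 16 streams, 100 packets/s each,
    # 100 us fast-path and 200 us slow-path processing times.
    for k in (1, 2, 4, 8, 16):
        d = mean_packet_delay(16, k, 100.0, 100e-6, 200e-6)
        print(f"stacks={k:2d}  mean delay={d * 1e6:.1f} us")
```

In this model, adding stacks lowers both terms: fewer streams per stack
raise the assumed fast-path hit ratio, which shortens processing time,
and lower per-stack utilization shortens the queueing term.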