Techniques for avoiding the high memory overheads found on
many modern shared-memory multiprocessors are of increasing
importance in the development of high-performance multiprocessor
protocol implementations. One such technique is processor-cache
affinity scheduling, which can significantly lower packet
latency and substantially increase protocol processing throughput
[20]. In this paper, we evaluate several aspects of the effectiveness
of affinity-based scheduling in multiprocessor network
protocol processing, under packet-level and connection-level parallelization
approaches. Specifically, we evaluate the performance
of the scheduling technique 1) when a large number of streams are
concurrently supported, 2) when processing includes copying of
uncached packet data, 3) as applied to send-side protocol processing,
and 4) in the presence of stream burstiness and source locality,
two well-known properties of network traffic. We find that
affinity-based scheduling performs well under these conditions,
emphasizing its robustness and general effectiveness in multiprocessor
network processing. In addition, we explore a technique
that improves the caching behavior and available packet-level
concurrency under connection-level parallelism, and find performance
improves dramatically.