Research method
In this section, we review the combination of simulation, analytic
and experimental techniques that we use to evaluate affinity-based
scheduling of parallel networking. We use a multiprocessor
simulation model that closely follows the behavior of Figure 1.
An analytic model of packet processing time, which we derive
from established analytic results developed by other researchers
[25, 27], is used to capture the displacement of the cached protocol
state by the background workload. This model is formulated
to reflect the specific cache architecture and organization of our
Silicon Graphics machine. Finally, we conduct a set of multiprocessor
experiments with our parallel x-kernel implementation,
designed to measure packet processing times under specific conditions
of cache state. The analytic model is parameterized with
these experimentally-measured values.
The benefit of the overall approach is that it enables us to explore,
in a controlled manner, the performance of affinity-based
scheduling when the host concurrently executes a background
workload of non-protocol activity. By using timing measurements
as the basis for the model of packet processing time, we demonstrate
a method which side-steps the need to identify the actual
protocol footprint (i.e., the set of cache lines referenced by the
protocol thread when it executes) and the difficulties inherent in
capturing memory traces from a large multiprocessor application,
such as the parallelized x-kernel.
Below, we summarize the salient aspects of the approach (see
[20] for details). To facilitate presentation, we focus throughout
this section on receive-side processing. The same general methodology
is employed in obtaining the send-side results (Section 6).