In an HPC environment, the application usually occupies all of the CPUs in a node, so it competes for CPU time with system services and daemons. Job launchers usually pin each application process to a single CPU. This prevents the kernel from migrating application processes away from their CPUs, which improves locality and reduces load-balancing overhead. System services, however, are usually not pinned, so the kernel often migrates them onto application CPUs for the sake of load balancing. Application processes are then involuntarily preempted to run system services, which introduces delays into the parallel application. This can be remedied by confining system services to a dedicated set of CPUs and running the application on the remaining CPUs. The obvious drawback of this approach is that the application no longer uses all of the computing power in the machine. Nevertheless, Petrini et al. showed that using one fewer CPU per node on a 2048-node system built from 4-CPU nodes reduced application time to solution, despite the 25% reduction in processing power [15]. This trade-off is even more relevant now, given the advent of many-core processors and the limited memory bandwidth available to applications.
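The partitioning described above can be illustrated with a minimal sketch, assuming a Linux node and Python's `os.sched_setaffinity`. The helper names `partition_cpus`, `app_cpu_for_rank`, and `pin_current_process` are hypothetical and chosen for illustration only; they are not taken from any job launcher. The sketch reserves the lowest-numbered CPUs for system services and pins an application process to one of the remaining CPUs.

```python
import os


def partition_cpus(cpus, n_system=1):
    """Split CPU ids into (system_cpus, app_cpus).

    Assumption: the lowest-numbered CPUs are reserved for system services.
    """
    ordered = sorted(cpus)
    if not 0 < n_system < len(ordered):
        raise ValueError("need at least one CPU on each side of the split")
    return set(ordered[:n_system]), set(ordered[n_system:])


def app_cpu_for_rank(app_cpus, local_rank):
    """Pick a single CPU for the application process with this node-local rank."""
    ordered = sorted(app_cpus)
    return ordered[local_rank % len(ordered)]


def pin_current_process(cpu):
    """Pin the calling process to one CPU (Linux only; pid 0 means 'self')."""
    os.sched_setaffinity(0, {cpu})


if __name__ == "__main__" and hasattr(os, "sched_setaffinity"):
    available = os.sched_getaffinity(0)
    if len(available) > 1:
        system_cpus, app_cpus = partition_cpus(available, n_system=1)
        pin_current_process(app_cpu_for_rank(app_cpus, local_rank=0))
        print("reserved for system services:", sorted(system_cpus))
        print("application process pinned to:", sorted(os.sched_getaffinity(0)))
```

Note that this sketch covers only the application side. Keeping system services off the application CPUs requires a separate mechanism, such as Linux cpusets, which is not shown here.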