Once we have pruned our graph, we assign hairs to processors for parallelization. To optimize performance, we attempt to minimize the data exchanged between processors using a greedy graph clustering algorithm. We initially assign an equal number of hairs to each processor, which gives an initial clustering of the graph. The goal is to reduce the communication cost, computed as the number of edges that connect hairs in different processor groups, while maintaining equal workloads. To do so, we greedily swap hairs between processors whenever a swap reduces the communication cost, iterating until we reach a local minimum or exceed a maximum number of swaps (see Figure 13, right). Because each step exchanges one hair for another, the number of hairs per processor stays fixed. The final clustering allows us to send less information between processors than if we had simply used all contact pairs, leading to more efficient parallelization. Note that any algorithm could be used for constructing the hair adjacency graph, as long as it produces good communication patterns between processors.
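The greedy swap procedure described above can be illustrated with a minimal Python sketch. This is not our implementation: the function names `cut_size` and `greedy_swap_partition`, the `max_swaps` parameter, the round-robin initial assignment, and the order in which candidate pairs are scanned are all assumptions made for illustration. The sketch also recomputes the full communication cost for every candidate swap, which is simple but slow; a practical version would likely update the cost incrementally.

```python
from itertools import combinations


def cut_size(edges, part):
    """Communication cost: number of edges joining hairs on different processors."""
    return sum(1 for u, v in edges if part[u] != part[v])


def greedy_swap_partition(num_hairs, edges, num_procs, max_swaps=1000):
    """Greedy balanced clustering by pairwise swaps (illustrative sketch).

    edges is a list of (hair, hair) index pairs from the pruned graph.
    Returns (part, cost), where part[h] is the processor assigned to hair h.
    """
    # Assumed initial assignment: round-robin, so group sizes differ by at most one.
    part = [h % num_procs for h in range(num_hairs)]
    cost = cut_size(edges, part)
    swaps = 0
    improved = True
    while improved and swaps < max_swaps:
        improved = False
        for a, b in combinations(range(num_hairs), 2):
            if part[a] == part[b]:
                continue  # swapping within one processor changes nothing
            # Tentatively exchange the two hairs; group sizes are unchanged.
            part[a], part[b] = part[b], part[a]
            new_cost = cut_size(edges, part)
            if new_cost < cost:
                # Keep the swap only if it strictly lowers the cost.
                cost = new_cost
                swaps += 1
                improved = True
                if swaps >= max_swaps:
                    break
            else:
                # Otherwise undo it.
                part[a], part[b] = part[b], part[a]
    return part, cost
```

As a usage example, consider six hairs forming two tightly connected triples joined by a single contact edge, split across two processors: `greedy_swap_partition(6, [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)], 2)`. The round-robin start cuts five edges, and the swap loop ends with each triple on its own processor and only the bridging edge crossing between them. In general the procedure only reaches a local minimum, so the result depends on the initial assignment and the scan order.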