We first evaluated two algorithms that differed only in whether a threshold was used to decide when to run the validity-checking program. When a threshold was in effect, we calculated the weight of each chromosome and checked the validity of the solution only if the weight did not exceed the threshold. Without a threshold, every chromosome generated was evaluated for both validity and weight. In all other respects both algorithms used the GA as described above, including the domain-expert operators, checking for duplicates, opportunistic application of the “fine tuning” operators, etc. For all of these experiments we limited the computation to a fixed number of full fitness evaluations, because these evaluations dominate the time required.
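The difference between the two variants can be illustrated with a minimal Python sketch. It is not the code used in the experiments: the names `Evaluator`, `weight_fn`, `validity_fn`, `budget`, and `threshold` are hypothetical, and the sketch assumes that a "full fitness evaluation" is one in which the validity checker is actually run, so that only those evaluations count against the budget.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple

Chromosome = Sequence[int]  # assumed encoding, for illustration only


@dataclass
class Evaluator:
    """Evaluates chromosomes and counts full (validity-checked) evaluations."""

    weight_fn: Callable[[Chromosome], float]   # hypothetical cheap weight computation
    validity_fn: Callable[[Chromosome], bool]  # hypothetical expensive validity checker
    budget: int                                # fixed number of full evaluations allowed
    threshold: Optional[float] = None          # None selects the variant without a threshold
    full_evaluations: int = 0

    def exhausted(self) -> bool:
        """True once the budget of full evaluations has been used up."""
        return self.full_evaluations >= self.budget

    def evaluate(self, chromosome: Chromosome) -> Tuple[float, Optional[bool]]:
        """Return (weight, validity); validity is None if the check was skipped."""
        weight = self.weight_fn(chromosome)
        # With a threshold in effect, skip the validity check when the
        # weight exceeds the threshold.
        if self.threshold is not None and weight > self.threshold:
            return weight, None
        # Otherwise run the validity checker and count a full evaluation
        # (assumption: only these count against the budget).
        self.full_evaluations += 1
        return weight, self.validity_fn(chromosome)
```

Under these assumptions, constructing `Evaluator(..., threshold=None)` gives the variant in which every chromosome is checked for validity, while passing a numeric `threshold` gives the gated variant. A GA loop would call `exhausted()` to decide when to stop. How the threshold value is chosen or updated is not specified here and is left to the caller.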