In our experiments, the initial training set of COAL consists
of 300 randomly generated design configurations that have been labeled by cycle-accurate
simulations on the processor simulator. The number of iterations (t in Algorithm
1) in which SSL+AL is carried out is set to 100, which implies that 100 extra
unlabeled design configurations, selected in the AL process, will be labeled by cycle-accurate
processor simulations; 200 unlabeled design configurations will be labeled by
the regression trees in the cotraining SSL process. The entire unlabeled data set (U
in Algorithm 1) consists of 100K randomly generated unlabeled design configurations,
which are prepared for both the SSL and AL processes. For SSL, the size of the pool containing
unlabeled design configurations (p in Algorithm 1) is set to 100, as in [Guo et al.
2011]. Moreover, M1 and M2 (the minimum numbers of examples in each leaf of the two
trees) are set to 4 and 10, respectively, to obtain two diverse M5P regression
trees. To test the performance of COAL, an additional 100 distinct design
configurations are simulated as the testing data. We did not finely tune the parameters
because our task differs considerably from common machine learning problems owing to
its high simulation cost. For example, simulating a design configuration on a SPEC
benchmark with the ref input may consume hundreds of hours.
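
The labeling budget implied by these settings can be checked with a small sketch. The class name `CoalSetup` and its field names are hypothetical and introduced only for illustration. The sketch also assumes that one configuration is queried by AL per iteration and that each of the two trees labels one configuration per iteration in SSL; these per-iteration rates are not stated above, but they are consistent with the totals of 100 and 200.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoalSetup:
    """Hypothetical container for the experimental settings described in the text."""

    # Values stated in the text.
    initial_labeled: int = 300      # initial training set, labeled by simulation
    iterations: int = 100           # t in Algorithm 1
    unlabeled_total: int = 100_000  # size of U in Algorithm 1
    pool_size: int = 100            # p in Algorithm 1
    min_leaf_m1: int = 4            # M1, minimum examples per leaf, first tree
    min_leaf_m2: int = 10           # M2, minimum examples per leaf, second tree
    test_size: int = 100            # simulated testing configurations

    # Assumptions, chosen so that the totals match the text (100 and 200).
    num_trees: int = 2
    al_queries_per_iteration: int = 1
    ssl_labels_per_tree_per_iteration: int = 1

    @property
    def al_simulated(self) -> int:
        """Configurations selected by AL and labeled by simulation."""
        return self.iterations * self.al_queries_per_iteration

    @property
    def ssl_labeled(self) -> int:
        """Configurations labeled by the regression trees (no simulation)."""
        return (self.iterations * self.num_trees
                * self.ssl_labels_per_tree_per_iteration)

    @property
    def training_simulations(self) -> int:
        """Simulations spent on training data: initial set plus AL queries."""
        return self.initial_labeled + self.al_simulated

    @property
    def total_simulations(self) -> int:
        """All simulations, including the held-out testing data."""
        return self.training_simulations + self.test_size


setup = CoalSetup()
print("AL-simulated configurations:", setup.al_simulated)
print("SSL-labeled configurations: ", setup.ssl_labeled)
print("Training simulations:       ", setup.training_simulations)
print("Total simulations:          ", setup.total_simulations)
```

Under these assumptions, the training data costs 300 + 100 = 400 simulations and the testing data another 100, while the 200 configurations labeled in the SSL process require no simulation at all.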