4) Performance
Extensive experiments produced a large number of results; the
best-performing models are summarized in Table II. Note that
our test set was collected entirely separately from the training set. The models’ performance therefore differs substantially
from what would be measured on a hold-out set in the more
common data mining sense, and should not be compared
directly with the performance levels reported in other studies, which
usually assume no covariate shift.
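
The distinction between a hold-out set and a separately collected test set can be illustrated with a minimal sketch on synthetic data. Nothing below comes from our experiments: the labelling rule, the sampling ranges, the threshold classifier, and all function names are hypothetical and chosen only to make the effect visible. The labelling rule is identical for both test sets, so only the input distribution changes, which is the covariate shift setting.

```python
import random


def label(x):
    # Same labelling rule for every data set: P(y | x) never changes.
    return 1 if abs(x) > 1.0 else 0


def sample(n, low, high, rng):
    # Draw n inputs uniformly from [low, high] and label them.
    return [(x, label(x)) for x in (rng.uniform(low, high) for _ in range(n))]


def fit_threshold(data):
    # Toy model: predict 1 when x exceeds a threshold placed midway
    # between the largest negative and the smallest positive example.
    neg = [x for x, y in data if y == 0]
    pos = [x for x, y in data if y == 1]
    return (max(neg) + min(pos)) / 2.0


def accuracy(threshold, data):
    correct = sum(1 for x, y in data if (1 if x > threshold else 0) == y)
    return correct / len(data)


rng = random.Random(0)

# Training data and hold-out set come from the same pool (inputs in [0, 3]).
pool = sample(400, 0.0, 3.0, rng)
train, holdout = pool[:300], pool[300:]

# Separately collected test set: same labelling rule, inputs in [-3, 0].
separate = sample(200, -3.0, 0.0, rng)

threshold = fit_threshold(train)
holdout_acc = accuracy(threshold, holdout)
separate_acc = accuracy(threshold, separate)

print(f"hold-out accuracy:             {holdout_acc:.2f}")
print(f"separately collected accuracy: {separate_acc:.2f}")
```

In this toy setting the hold-out accuracy is close to perfect, while the accuracy on the separately collected set is far lower, even though the model and the labelling rule are unchanged. This is why a figure obtained on one kind of test set says little about the other.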