In this paper, we apply CDW and its variants to a collection of classification tasks, and present evidence that the optimal amount of locality for feature weighting varies between the different tasks. For nine of the eleven data sets used, at least one of the CDW weighting schemes significantly improved classification accuracy. (In the other two, none of the fluctuation in results was statistically significant.) With k = 1, the most local technique tested produced the most accurate results for seven of the nine tasks for which results varied significantly, but showed significantly lower accuracies for the remaining two tasks. Given this variability, we conclude that it is advantageous to have a family of related techniques (like the CDW family): the best method for each task can be selected via cross-validation on the training cases, or, in many cases, can be chosen from relatively simple properties of the task (e.g., the presence of irrelevant features).
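To make the selection step concrete, the sketch below chooses among candidate feature-weighting schemes by cross-validated accuracy on the training cases only. The weight functions (uniform_weights, variance_weights) and the helper select_scheme are hypothetical stand-ins rather than the CDW variants themselves, and scikit-learn is assumed; only the selection-by-cross-validation procedure is illustrated.

    # Minimal sketch: pick a weighting scheme via cross-validation on training data.
    # The schemes here are placeholders, not the actual CDW weight computations.
    import numpy as np
    from sklearn.model_selection import cross_val_score
    from sklearn.neighbors import KNeighborsClassifier

    def uniform_weights(X, y):
        # Baseline: all features weighted equally.
        return np.ones(X.shape[1])

    def variance_weights(X, y):
        # Hypothetical example scheme: weight features by their variance.
        return X.var(axis=0) + 1e-12

    def select_scheme(X_train, y_train, schemes, k=1, folds=10):
        # Return the name of the scheme with the best cross-validated accuracy,
        # estimated on the training cases only.
        best_name, best_acc = None, -np.inf
        for name, scheme in schemes.items():
            w = scheme(X_train, y_train)
            clf = KNeighborsClassifier(n_neighbors=k)
            # Apply the feature weights by rescaling the feature columns.
            acc = cross_val_score(clf, X_train * w, y_train, cv=folds).mean()
            if acc > best_acc:
                best_name, best_acc = name, acc
        return best_name

    # Usage (hypothetical data):
    # schemes = {"uniform": uniform_weights, "variance": variance_weights}
    # best = select_scheme(X_train, y_train, schemes, k=1)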
