In the past, nearest neighbor algorithms for learning from examples have worked best in domains in
which all features had numeric values. In such domains, the examples can be treated as points and distance metrics
can use standard definitions. In symbolic domains, a more sophisticated treatment of the feature space is required.
We introduce a nearest neighbor algorithm for learning in domains with symbolic features. Our algorithm calculates
distance tables that allow it to produce real-valued distances between instances, and attaches weights to the instances
to further modify the structure of feature space. We show that this technique produces excellent classification
accuracy on three problems that have been studied by machine learning researchers: predicting protein secondary
structure, identifying DNA promoter sequences, and pronouncing English text. Direct experimental comparisons
show that our nearest neighbor algorithm is comparable or superior to other learning algorithms in all
three domains. In addition, our algorithm has advantages in training speed, simplicity, and perspicuity. We conclude
that experimental evidence favors the use and continued development of nearest neighbor algorithms for
domains such as the ones studied here.
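The core idea the abstract describes, real-valued distances between symbolic instances computed from learned distance tables, can be sketched as follows. This is a simplified value-difference construction, where the distance between two values of a feature is the total difference in their class-conditional frequencies; the function names and exact formula are illustrative assumptions, not the paper's definition, and the per-instance weights mentioned above are omitted for brevity.

```python
from collections import defaultdict

def value_difference_tables(examples, labels):
    """Build one distance table per feature from class-conditional frequencies.

    A simplified value-difference metric (illustrative, not the paper's exact
    formula): d_f(v1, v2) = sum over classes c of |P(c | f=v1) - P(c | f=v2)|.
    """
    n_features = len(examples[0])
    classes = set(labels)
    tables = []
    for f in range(n_features):
        # counts[v][c] = number of training examples with feature f == v, class c
        counts = defaultdict(lambda: defaultdict(int))
        for x, y in zip(examples, labels):
            counts[x[f]][y] += 1
        dist = {}
        for v1 in counts:
            n1 = sum(counts[v1].values())
            for v2 in counts:
                n2 = sum(counts[v2].values())
                dist[v1, v2] = sum(
                    abs(counts[v1][c] / n1 - counts[v2][c] / n2)
                    for c in classes
                )
        tables.append(dist)
    return tables

def distance(tables, a, b):
    # Real-valued distance between two symbolic instances: sum the
    # per-feature table entries (unseen value pairs default to 1.0).
    return sum(t.get((x, y), 1.0) for t, x, y in zip(tables, a, b))
```

On a toy training set, two values that predict the same class distribution end up at distance zero, while values that predict different classes are pushed apart, which is how the tables impose a metric structure on an otherwise unordered symbolic feature space.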