The Glass Dataset
The glass dataset glass.arff from the U.S. Forensic Science Service contains data
on six types of glass. Glass is described by its refractive index and the chemical
elements that it contains; the aim is to classify different types of glass based
on these features. This dataset is taken from the UCI datasets, which have been
collected by the University of California at Irvine and are freely available on
the Web. They are often used as a benchmark for comparing data mining
algorithms.
Find the dataset glass.arff and load it into the Explorer interface. For your own
information, answer the following exercises, which review material covered in the
previous section.
Exercise 17.2.1. How many attributes are there in the dataset? What are
their names? What is the class attribute?
Run the classification algorithm IBk (weka.classifiers.lazy.IBk). Use
cross-validation to test its performance, leaving the number of folds at the
default value of 10. Recall that you can examine the classifier options in the
Generic Object Editor window that pops up when you click the text beside the
Choose button. The default value of the KNN field is 1; this sets the number
of neighboring instances to use when classifying.
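To make the roles of the KNN setting and the number of folds concrete, here is a minimal sketch in plain Java of nearest-neighbor classification evaluated by cross-validation. It is not Weka's IBk implementation and does not attempt to reproduce how Weka assigns instances to folds. The class name NearestNeighborCrossValidation, its methods, the random seed, and the toy two-cluster data are all assumptions made for illustration; the glass data itself is not used.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Random;

// Illustrative sketch only: a hypothetical class, not part of Weka.
public class NearestNeighborCrossValidation {

    /** Squared Euclidean distance between two feature vectors of equal length. */
    public static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Majority vote among the k nearest training instances; ties go to the lowest class index. */
    public static int classify(double[][] trainX, int[] trainY, double[] query,
                               int k, int numClasses) {
        Integer[] order = new Integer[trainX.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        // Sort training indices by distance to the query, nearest first.
        Arrays.sort(order,
                Comparator.comparingDouble((Integer i) -> squaredDistance(trainX[i], query)));
        int[] votes = new int[numClasses];
        for (int i = 0; i < Math.min(k, order.length); i++) {
            votes[trainY[order[i]]]++;
        }
        int best = 0;
        for (int c = 1; c < numClasses; c++) {
            if (votes[c] > votes[best]) {
                best = c;
            }
        }
        return best;
    }

    /** Returns the fraction of instances classified correctly over all folds. */
    public static double crossValidate(double[][] x, int[] y, int k, int folds,
                                       int numClasses, long seed) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < x.length; i++) {
            indices.add(i);
        }
        Collections.shuffle(indices, new Random(seed));
        int correct = 0;
        for (int fold = 0; fold < folds; fold++) {
            List<double[]> trainX = new ArrayList<>();
            List<Integer> trainY = new ArrayList<>();
            List<Integer> test = new ArrayList<>();
            // Every folds-th shuffled position is held out; the rest is training data.
            for (int pos = 0; pos < indices.size(); pos++) {
                int idx = indices.get(pos);
                if (pos % folds == fold) {
                    test.add(idx);
                } else {
                    trainX.add(x[idx]);
                    trainY.add(y[idx]);
                }
            }
            double[][] tx = trainX.toArray(new double[0][]);
            int[] ty = trainY.stream().mapToInt(Integer::intValue).toArray();
            for (int idx : test) {
                if (classify(tx, ty, x[idx], k, numClasses) == y[idx]) {
                    correct++;
                }
            }
        }
        return (double) correct / x.length;
    }

    public static void main(String[] args) {
        // Toy data (an assumption, not the glass data): two well-separated clusters.
        double[][] x = new double[20][];
        int[] y = new int[20];
        for (int i = 0; i < 10; i++) {
            x[i] = new double[] {i * 0.1, i * 0.05};
            y[i] = 0;
            x[10 + i] = new double[] {10 + i * 0.1, 10 - i * 0.05};
            y[10 + i] = 1;
        }
        // k = 1 and 10 folds mirror the default KNN and folds settings described above.
        double accuracy = crossValidate(x, y, 1, 10, 2, 42L);
        System.out.println("Accuracy: " + accuracy);
    }
}
```

Each instance is held out exactly once, so the returned accuracy is computed over the whole dataset. Raising k from 1 makes each prediction a vote among several neighbors rather than a copy of the single closest instance's class.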