Hayes-Roth Data Set
An example of a multivariate data type classification problem using Neuroph
by Sava Necin, Faculty of Organisation Sciences, University of Belgrade
an experiment for Intelligent Systems course
Introduction
In this example we will test Neuroph 2.4 with the Hayes-Roth Data Set, which can be found here. Several architectures will be tried out, and we will determine which ones represent a good solution to the problem and which ones do not.
First, here is some useful information about our Hayes-Roth Data Set:
Data Set Characteristics: Multivariate
Number of Instances: 160
Attribute Characteristics: Categorical
Number of Attributes: 5
Associated Tasks: Classification
Introducing the problem
This database contains 5 numeric-valued attributes. Only a subset of 3 is used during testing (the latter 3).
Some instances could be placed in either category 0 or 1. I've followed the authors' suggestion, placing them in each category with equal probability. I've replaced the actual values of the attributes (e.g., hobby has values chess, sports and stamps) with numeric values. I think this is how the authors did this when testing the categorization models described in the paper. I find this unfair. While the subjects were able to bring background knowledge to bear on the attribute values and their relationships, the algorithms were provided with no such knowledge. I'm uncertain whether the 2 distractor attributes (name and hobby) are presented to the authors' algorithms during testing. However, it is clear that only the age, educational status, and marital status attributes are given during the human subjects' transfer tests.
Attribute Information:
-- 1. name: distinct for each instance and represented numerically
-- 2. hobby: nominal values ranging between 1 and 3
-- 3. age: nominal values ranging between 1 and 4
-- 4. educational level: nominal values ranging between 1 and 4
-- 5. marital status: nominal values ranging between 1 and 4
-- 6. class: nominal value between 1 and 3
For this experiment to work, we had to transform our data set into binary format (0, 1). We replaced each attribute value with a suitable binary combination.
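The text does not say which binary combination was used for each value, so the following is only a minimal sketch under one common assumption: one-hot encoding, where a nominal value v out of k possible values becomes k bits with a single 1 at position v. The class name HayesRothEncoder and its methods are hypothetical and are not part of Neuroph. Which attributes are passed in (for example, whether the name attribute is dropped) is also left to the caller.

```java
// Hypothetical helper, not part of Neuroph: one-hot encoding of nominal values.
final class HayesRothEncoder {

    // Encodes a 1-based nominal value as a bit vector of length `cardinality`.
    // Example: value 2 of 3 possible values becomes {0, 1, 0}.
    static int[] oneHot(int value, int cardinality) {
        if (value < 1 || value > cardinality) {
            throw new IllegalArgumentException(
                "value " + value + " is outside 1.." + cardinality);
        }
        int[] bits = new int[cardinality];
        bits[value - 1] = 1;
        return bits;
    }

    // Encodes several nominal attributes and concatenates their bit vectors.
    // cardinalities[i] is the number of possible values of attribute i.
    static int[] encodeRow(int[] values, int[] cardinalities) {
        if (values.length != cardinalities.length) {
            throw new IllegalArgumentException("length mismatch");
        }
        int total = 0;
        for (int c : cardinalities) {
            total += c;
        }
        int[] out = new int[total];
        int offset = 0;
        for (int i = 0; i < values.length; i++) {
            int[] bits = oneHot(values[i], cardinalities[i]);
            System.arraycopy(bits, 0, out, offset, bits.length);
            offset += bits.length;
        }
        return out;
    }
}
```

Under this assumption, a row with hobby 2, age 1, educational level 1 and marital status 2 would be encoded with encodeRow(new int[]{2, 1, 1, 2}, new int[]{3, 4, 4, 4}), giving 15 input bits, and the class value would be encoded separately with oneHot(classValue, 3).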
In this example we will be using 80% of data for training the network and 20% of data for testing it.
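As a rough illustration of the 80/20 split, the sketch below shuffles the rows with a fixed seed and cuts the list at 80%. Whether the original experiment shuffled the data before splitting is not stated, so the shuffling, the seed parameter and the class name TrainTestSplit are assumptions made for this example only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of Neuroph: splits rows into training and test parts.
final class TrainTestSplit {

    // Returns a two-element list: index 0 is the training part, index 1 the test part.
    // The input list is copied, so the caller's list is left unchanged.
    static <T> List<List<T>> split(List<T> rows, double trainFraction, long seed) {
        if (trainFraction < 0.0 || trainFraction > 1.0) {
            throw new IllegalArgumentException("trainFraction must be in [0, 1]");
        }
        List<T> shuffled = new ArrayList<>(rows);
        // A fixed seed makes the split reproducible between runs.
        Collections.shuffle(shuffled, new Random(seed));
        int trainSize = (int) Math.round(shuffled.size() * trainFraction);
        List<T> train = new ArrayList<>(shuffled.subList(0, trainSize));
        List<T> test = new ArrayList<>(shuffled.subList(trainSize, shuffled.size()));
        return Arrays.asList(train, test);
    }
}
```

With the 160 instances of this data set and a training fraction of 0.8, this yields 128 rows for training and 32 rows for testing.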
Before you start reading about our experiment, we suggest that you first get more familiar with Neuroph Studio and the Multi Layer Perceptron. You can do that by clicking on the links below:
Neuroph Studio Getting started
Multi Layer Perceptron
Network design
Here you can see the structure of our network, with its inputs, outputs, and the hidden neurons in the middle layer.