Visualizing 1R
The purpose of the boundary visualizer is to show the predictions of a given model
for every possible combination of attribute values—that is, for every point in the
two-dimensional space. The points are color-coded according to the prediction the
model generates. We will use this to investigate the decision boundaries that different
classifiers generate for the reduced iris dataset.
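The idea of showing a prediction for every point in the two-dimensional space can be sketched in a few lines of Python. This is only an illustration, not the boundary visualizer's own code: the function prediction_grid, the stand-in model toy_rule, and its thresholds are all hypothetical, and the attribute ranges are made up.

```python
def prediction_grid(predict, x_min, x_max, y_min, y_max, nx, ny):
    """Evaluate predict(x, y) on an nx-by-ny grid (nx, ny >= 2).

    Returns ny rows ("scan lines"), each holding nx predicted labels.
    """
    rows = []
    for j in range(ny):
        y = y_min + (y_max - y_min) * j / (ny - 1)
        row = []
        for i in range(nx):
            x = x_min + (x_max - x_min) * i / (nx - 1)
            row.append(predict(x, y))
        rows.append(row)
    return rows


def toy_rule(x, y):
    """Hypothetical stand-in model that tests only the x attribute.

    The thresholds are invented for illustration; they are not
    taken from any rule learned on the iris data.
    """
    if x < 2.5:
        return "a"
    if x < 4.8:
        return "b"
    return "c"


grid = prediction_grid(toy_rule, 1.0, 7.0, 0.0, 2.5, nx=13, ny=4)
for row in grid:
    print("".join(row))  # one character per grid point
```

Because the stand-in model tests a single attribute, every scan line is identical and the regions appear as vertical bands. A plotting tool would map each label to a color instead of printing a character.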
Start with the 1R rule learner. Use the Choose button of the boundary visualizer
to select weka.classifiers.rules.OneR. Make sure you tick Plot training data; otherwise,
only the predictions will be plotted. Then click the Start button. The program
starts plotting predictions in successive scan lines. Click the Stop button once the
plot has stabilized—as soon as you like, in this case—and the training data will be
superimposed on the boundary visualization.
Exercise 17.3.1. Explain the plot based on what you know about 1R. (Hint:
Use the Explorer interface to look at the rule set that 1R generates for this
data.)
Exercise 17.3.2. Study the effect of the minBucketSize parameter on the
classifier by regenerating the plot with values of 1, and then 20, and then some
critical values in between. Describe what you see, and explain it. (Hint: You
could speed things up by using the Explorer interface to look at the rule sets.)
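To see why a minimum bucket size changes the number of intervals, it may help to experiment with a simplified sketch of how 1R can discretize one numeric attribute. This is an assumption-laden illustration, not Weka's OneR implementation: the function one_r_buckets, its tie-breaking, and the toy data are hypothetical, so the intervals Weka reports for the iris data may differ.

```python
from collections import Counter


def one_r_buckets(values, labels, min_bucket):
    """Simplified 1R-style discretization of one numeric attribute.

    Returns a list of (upper_bound, majority_class) pairs; the last
    interval has upper_bound None (it extends to infinity).
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    buckets = []  # (start, end_exclusive, majority_class)
    i = 0
    while i < n:
        start = i
        counts = Counter()
        # Grow the bucket until its majority class has min_bucket members.
        while i < n:
            counts[pairs[i][1]] += 1
            i += 1
            if max(counts.values()) >= min_bucket:
                break
        majority = counts.most_common(1)[0][0]
        # Extend while the next instance has the majority class, and
        # never split between two instances with the same value.
        while i < n and (pairs[i][1] == majority
                         or pairs[i][0] == pairs[i - 1][0]):
            counts[pairs[i][1]] += 1
            i += 1
        majority = counts.most_common(1)[0][0]
        buckets.append((start, i, majority))

    # Merge adjacent buckets that predict the same class.
    merged = []
    for start, end, cls in buckets:
        if merged and merged[-1][2] == cls:
            merged[-1] = (merged[-1][0], end, cls)
        else:
            merged.append((start, end, cls))

    # Place each breakpoint halfway between neighboring buckets.
    rule = []
    for k, (start, end, cls) in enumerate(merged):
        if k + 1 < len(merged):
            upper = (pairs[end - 1][0] + pairs[end][0]) / 2
        else:
            upper = None
        rule.append((upper, cls))
    return rule


# Made-up data: twelve instances, three classes, some overlap.
values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
labels = list("aaababbbcbcc")

for size in (1, 3, 12):
    print(size, len(one_r_buckets(values, labels, size)))
# prints:
# 1 7
# 3 3
# 12 1
```

On this toy data, a bucket size of 1 follows every class change, a moderate size absorbs the stray instances, and a very large size collapses everything into a single interval that predicts the overall majority class.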
Now answer the following questions by thinking about the internal workings of
1R. (Hint: It will probably be fastest to use the Explorer interface to look at the
rule sets.)
Exercise 17.3.3. You saw earlier that when visualizing 1R the plot always has
three regions. But why aren’t there more for small bucket sizes (e.g., 1)? Use
what you know about 1R to explain this apparent anomaly.
Exercise 17.3.4. Can you set minBucketSize to a value that results in fewer than
three regions? What is the smallest possible number of regions? What is the
smallest value for minBucketSize that gives this number of regions? Explain
the result based on what you know about the iris data.
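When comparing settings, it can be convenient to count regions mechanically rather than by eye. The sketch below counts the contiguous runs of identical predictions along one scan line; the function name count_regions and the sample scan lines are hypothetical, and the approach assumes the model tests a single attribute, so that every scan line shows the same regions.

```python
from itertools import groupby


def count_regions(scan_line):
    """Count maximal runs of identical labels along one scan line."""
    return sum(1 for _ in groupby(scan_line))


# Made-up scan lines, one label per plotted point.
three_bands = ["a"] * 3 + ["b"] * 5 + ["c"] * 5
one_band = ["b"] * 13

print(count_regions(three_bands))  # 3
print(count_regions(one_band))     # 1
```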
