Apriori has some further parameters. If significanceLevel is set to a value between
0 and 1, the association rules are filtered based on a χ² test with the chosen significance
level. However, applying a significance test in this context is problematic
because of the multiple comparison problem: if a test is performed hundreds of times
for hundreds of association rules, it is likely that some apparently significant effects
will be found just by chance; that is, an association seems to be statistically significant
when really it is not. Also, the χ² test is inaccurate for small sample sizes (in this context,
small support values).
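To make the multiple comparison problem concrete, the following minimal sketch computes the chance of at least one spurious "significant" result when many tests are run at the same level. It assumes the tests are independent, which is a simplification (tests on overlapping association rules are not), and the function name is purely illustrative rather than part of Weka.

```python
def familywise_false_positive_rate(alpha: float, num_tests: int) -> float:
    """Chance of at least one false positive among num_tests tests.

    Assumes every null hypothesis is true and the tests are independent;
    both are simplifying assumptions made for illustration.
    """
    return 1.0 - (1.0 - alpha) ** num_tests


if __name__ == "__main__":
    # Show how quickly the chance of a spurious finding grows with the
    # number of rules tested at a significance level of 0.05.
    for num_tests in (1, 10, 100, 500):
        rate = familywise_false_positive_rate(0.05, num_tests)
        print(f"{num_tests:4d} tests: {rate:.4f}")
```

Under these assumptions, testing a few hundred rules at the 0.05 level makes at least one false positive almost certain, which is why filtering rules by significanceLevel should be interpreted with caution.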
There are alternative measures for ranking rules. As well as confidence, Apriori
supports lift, leverage, and conviction, which can be selected using metricType. More
information is available by clicking More in the Generic Object Editor window.
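As a rough illustration of how the four metrics differ, the sketch below computes them for a rule A ⇒ B from raw counts, using the commonly cited definitions. We believe these match what Weka reports, but the More button remains the authoritative source. The function name and the counts in the usage example are hypothetical and are not taken from the weather data.

```python
import math


def rule_metrics(n: int, count_a: int, count_b: int, count_ab: int) -> dict:
    """Compute ranking metrics for a rule A => B from instance counts.

    n        -- total number of instances
    count_a  -- instances matching the antecedent A
    count_b  -- instances matching the consequent B
    count_ab -- instances matching both A and B
    """
    p_a = count_a / n
    p_b = count_b / n
    p_ab = count_ab / n
    p_a_not_b = (count_a - count_ab) / n

    confidence = p_ab / p_a
    # Lift: confidence relative to the baseline frequency of B.
    lift = confidence / p_b
    # Leverage: how far the joint frequency is above independence.
    leverage = p_ab - p_a * p_b
    # Conviction: P(A)P(not B) / P(A, not B); infinite when the rule
    # never fails on the data.
    if p_a_not_b == 0:
        conviction = math.inf
    else:
        conviction = p_a * (1.0 - p_b) / p_a_not_b

    return {
        "confidence": confidence,
        "lift": lift,
        "leverage": leverage,
        "conviction": conviction,
    }


if __name__ == "__main__":
    # Hypothetical counts: 100 instances, A in 40, B in 50, both in 30.
    for name, value in rule_metrics(100, 40, 50, 30).items():
        print(f"{name:>10}: {value:.3f}")
```

Because the metrics weigh the baseline frequency of the consequent differently, the same set of rules can be ordered differently depending on which metric is selected.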
Exercise 17.6.3. Run Apriori on the weather data with each of the four
rule-ranking metrics, and default settings otherwise. What is the top-ranked
rule that is output for each metric?
