Mining a Real-World Dataset
Now consider a real-world dataset, vote.arff, which gives the votes of 435 U.S.
congressmen on 16 key issues gathered in the mid-1980s, and also includes their
party affiliation as a binary attribute. This is a purely nominal dataset with some
missing values (corresponding to abstentions). It is normally treated as a classification
problem, the task being to predict party affiliation based on voting patterns.
However, association-rule mining can also be applied to this data to seek interesting
associations. More information on the data appears in the comments in the
ARFF file.
Exercise 17.6.4. Run Apriori on this data with default settings. Comment on
the rules that are generated. Several of them are quite similar. How are their
support and confidence values related?
Exercise 17.6.5. It is interesting to see that none of the rules in the default
output involve Class = republican. Why do you think that is?
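As a reference point for these exercises, the following minimal sketch shows how support and confidence can be computed by hand. The records, the attribute names (bill_a, bill_b, party), and the helper functions are all hypothetical; they are not taken from vote.arff, and the code is not Weka's implementation. Support is counted here as a number of instances, and a missing value never matches an item.

```python
# Toy records, invented for illustration. Each dict is one instance;
# an absent key stands for a missing value (e.g., an abstention).
records = [
    {"bill_a": "y", "bill_b": "n", "party": "democrat"},
    {"bill_a": "y", "bill_b": "n", "party": "democrat"},
    {"bill_a": "y", "bill_b": "y", "party": "republican"},
    {"bill_a": "n", "bill_b": "y", "party": "republican"},
    {"bill_a": "y", "party": "democrat"},  # bill_b is missing
]


def covers(record, items):
    """True if the record contains every attribute=value pair in items."""
    return all(record.get(attr) == value for attr, value in items.items())


def support(records, items):
    """Number of records that contain all the attribute=value pairs."""
    return sum(covers(r, items) for r in records)


def confidence(records, antecedent, consequent):
    """support(antecedent and consequent) divided by support(antecedent)."""
    both = support(records, {**antecedent, **consequent})
    return both / support(records, antecedent)


# Rule: bill_a = y  ==>  party = democrat
rule_support = support(records, {"bill_a": "y", "party": "democrat"})
rule_conf = confidence(records, {"bill_a": "y"}, {"party": "democrat"})
print(rule_support, rule_conf)
```

Because confidence is a ratio of two support counts, rules built from the same item set share one numerator and differ only in the support of their antecedents.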
Market Basket Analysis
In Section 1.3 (page 26) we introduced market basket analysis—analyzing customer
purchasing habits by seeking associations in the items they buy when visiting a store.
To do market basket analysis in Weka, each transaction is coded as an instance
whose attributes represent the items in the store. Each attribute has only one
value: If a particular transaction does not contain the item (i.e., the customer did
not buy it), this is coded as a missing value.
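The encoding described above can be sketched as follows. This is only an illustration under stated assumptions: the function name transactions_to_arff, the relation name, the item names, and the choice of t as the single attribute value are all hypothetical, and the example baskets are invented rather than drawn from any real data. In ARFF, a missing value is written as a question mark.

```python
def transactions_to_arff(transactions, relation="baskets"):
    """Encode transactions as ARFF text.

    Each item becomes a nominal attribute with the single value 't'.
    An item absent from a transaction is written as '?' (missing).
    Assumes item names contain no spaces or commas (no quoting is done).
    """
    # One attribute per distinct item, in a fixed (sorted) order.
    items = sorted({item for basket in transactions for item in basket})

    lines = [f"@relation {relation}", ""]
    lines += [f"@attribute {item} {{t}}" for item in items]
    lines += ["", "@data"]

    # One data row per transaction: 't' if bought, '?' otherwise.
    for basket in transactions:
        lines.append(",".join("t" if item in basket else "?" for item in items))
    return "\n".join(lines)


# Hypothetical baskets, for illustration only.
baskets = [{"bread", "milk"}, {"milk"}, {"bread", "cheese"}]
arff_text = transactions_to_arff(baskets)
print(arff_text)
```

Coding absence as a missing value rather than as a second value such as "not bought" means that rules can only be formed about items that were purchased.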
Your job is to mine supermarket checkout data for associations. The data in
supermarket.arff was collected from an actual New Zealand supermarket. Take a
look at this file using a text editor to verify that you understand the structure. The