2.5 Association Rule Mining
Association Rule Mining focuses on finding rules that will predict the occurrence of
an item based on the occurrences of other items in a transaction. The fact that two
items are found to be related means co-occurrence but not causality. Note that this
technique should not be confused with rule-based classifiers presented in Sec. 2.3.3.
We define an itemset as a collection of one or more items (e.g. (Milk, Beer,
Diaper)). A k-itemset is an itemset that contains k items. The number of transactions
that contain a given itemset is known as its support count (e.g. (Milk, Beer, Diaper) =
131), and the support of the itemset is the fraction of transactions that contain it
(e.g. (Milk, Beer, Diaper) = 0.12). A frequent itemset is an itemset whose support is
greater than or equal to a minsup threshold. An association rule is an expression of
the form X ⇒ Y, where X and Y are disjoint itemsets (e.g. (Milk, Diaper) ⇒ (Beer)).
The support of the association rule is the fraction of transactions that contain both
X and Y. The confidence of the rule measures how often the items in Y appear in
transactions that contain X, that is, the support count of X ∪ Y divided by the
support count of X.
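The definitions of support count, support, and confidence can be illustrated with a minimal Python sketch. The five toy transactions and the function names (support_count, support, confidence) are assumptions made for this example only; they do not reproduce the counts quoted in the text.

```python
# Toy transaction database (illustrative data, not taken from the text).
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]


def support_count(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t)


def support(itemset, transactions):
    """Fraction of transactions that contain `itemset`."""
    return support_count(itemset, transactions) / len(transactions)


def confidence(x, y, transactions):
    """Support count of X union Y divided by the support count of X."""
    x_count = support_count(x, transactions)
    if x_count == 0:
        # The rule is undefined when X never occurs; 0.0 is a choice made here.
        return 0.0
    return support_count(set(x) | set(y), transactions) / x_count


x, y = {"Milk", "Diaper"}, {"Beer"}
print(support_count(x | y, transactions))  # 2 transactions contain all three items
print(support(x | y, transactions))        # 2 / 5 = 0.4
print(confidence(x, y, transactions))      # 2 / 3, since 3 transactions contain X
```

In this toy data the rule (Milk, Diaper) ⇒ (Beer) has support 0.4 and confidence 2/3: three transactions contain both Milk and Diaper, and two of those also contain Beer.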
Given a set of transactions T, the goal of association rule mining is to find
all rules having support ≥ minsup threshold and confidence ≥ minconf threshold.
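The mining goal stated above can be sketched as a brute-force search: enumerate every itemset, keep the frequent ones, and split each into every possible X ⇒ Y. This is only an illustration under assumed names (mine_rules, minsup, minconf) and assumed toy data; it is exponential in the number of items, and practical algorithms such as Apriori prune the candidate space instead of enumerating it in full.

```python
from itertools import combinations


def mine_rules(transactions, minsup, minconf):
    """Return (X, Y, support, confidence) for every rule meeting both thresholds."""
    n = len(transactions)
    items = sorted(set().union(*transactions))

    def sup(itemset):
        # Fraction of transactions that contain the whole itemset.
        return sum(1 for t in transactions if itemset <= t) / n

    rules = []
    for k in range(2, len(items) + 1):
        for candidate in combinations(items, k):
            itemset = frozenset(candidate)
            s = sup(itemset)
            if s == 0 or s < minsup:
                continue  # not frequent, so no rule is built from it
            # Split the frequent itemset into every non-empty X and Y = itemset - X.
            for r in range(1, k):
                for lhs in combinations(candidate, r):
                    x = frozenset(lhs)
                    y = itemset - x
                    conf = s / sup(x)  # sup(x) >= s > 0, so this is safe
                    if conf >= minconf:
                        rules.append((x, y, s, conf))
    return rules


# Toy data, assumed for the example only.
transactions = [{"A", "B"}, {"A", "B", "C"}, {"A", "C"}, {"B", "C"}]
rules = mine_rules(transactions, minsup=0.5, minconf=0.6)
for x, y, s, conf in rules:
    print(sorted(x), "=>", sorted(y), f"support={s:.2f}", f"confidence={conf:.2f}")
```

With these four transactions every pair of items has support 0.5 and every single item has support 0.75, so each of the six one-to-one rules has confidence 2/3 and passes the thresholds, while the 3-itemset (A, B, C) has support 0.25 and is discarded.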