This algorithm consists of two parts [11, 12]. The first part
finds the frequent itemsets; the second part derives the rules
from them. The frequent itemsets are found as follows:
Step 1: Scan all transactions and find all items whose support is
above s%. Let this set of frequent items be L1.
Step 2: Build potential sets of k items from Lk-1 by joining pairs
of itemsets in Lk-1 whose first k-2 items are the same. The k-2
common items and the one remaining item from each of the two
itemsets are combined to form a k-itemset. The set of such
potentially frequent k-itemsets is the candidate set Ck. (For k=2,
the potential frequent pairs are built by combining each item in
L1 with every other item in L1. The set so generated is the
candidate set C2.)
Step 3: Scan all transactions and find all k-itemsets in Ck that
are frequent. The frequent set so obtained is Lk. Steps 2 and 3
are repeated for increasing k until no frequent k-itemsets are
found.
In other words, the first pass of the Apriori algorithm simply
counts item occurrences to determine the large (frequent)
1-itemsets. A subsequent pass, say pass k, consists of two phases.
First, the large itemsets Lk-1 found in the (k-1)th pass are used
to generate the candidate itemsets Ck, using the apriori-gen
function. Next, the database is scanned and the support of the
candidates in Ck is counted. For fast counting, the candidates in
Ck that are contained in a given transaction t must be determined
efficiently [11, 12].
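Steps 1 to 3 can be sketched in Python as follows. This is a minimal illustration rather than the implementation from [11, 12]: the function names generate_candidates and frequent_itemsets are hypothetical, itemsets are assumed to be sorted tuples of mutually comparable items (for example strings), the support threshold is treated as inclusive, and candidates are counted by a plain linear scan instead of a fast lookup structure.

```python
from itertools import combinations


def generate_candidates(prev_frequent, k):
    """Step 2: join frequent (k-1)-itemsets that share their first k-2 items."""
    prev = sorted(prev_frequent)
    candidates = set()
    for a, b in combinations(prev, 2):
        if a[:k - 2] == b[:k - 2]:
            # Common prefix plus the one remaining item of each itemset.
            candidates.add(tuple(sorted(set(a) | set(b))))
    return candidates


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for itemsets with support >= min_support.

    min_support is a fraction of the number of transactions (e.g. 0.5).
    """
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    # Step 1: count single items and keep the frequent ones (L1).
    counts = {}
    for t in transactions:
        for item in t:
            counts[(item,)] = counts.get((item,), 0) + 1
    current = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}

    result = dict(current)
    k = 2
    while current:
        # Step 2: candidate set Ck built from L(k-1).
        candidates = generate_candidates(current.keys(), k)
        # Step 3: scan the transactions and count each candidate.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if t.issuperset(c):
                    counts[c] += 1
        current = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        result.update(current)
        k += 1
    return result
```

Calling frequent_itemsets(list_of_item_sets, 0.5) returns a dictionary that maps each frequent itemset to its support. The loop stops at the first k for which no candidate is frequent.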
For finding rules, the following straightforward algorithm is
used. Take a frequent itemset, say l, and find each non-empty
proper subset a. For every such subset a, output a rule of the
form a → (l - a) if support(l) / support(a) satisfies the minimum
confidence.
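The rule-generation step can be sketched as below. This is an illustrative sketch under stated assumptions: generate_rules is a hypothetical name, the supports argument is assumed to map sorted item tuples to their support and to contain every subset of each listed itemset (which holds for a complete set of frequent itemsets), and the confidence threshold is treated as inclusive.

```python
from itertools import combinations


def generate_rules(supports, min_confidence):
    """Return a list of (antecedent, consequent, confidence) tuples."""
    rules = []
    for itemset, support_l in supports.items():
        if len(itemset) < 2:
            continue  # a rule needs a non-empty antecedent and consequent
        # Enumerate every non-empty proper subset a of the itemset l.
        for size in range(1, len(itemset)):
            for antecedent in combinations(itemset, size):
                consequent = tuple(i for i in itemset if i not in antecedent)
                # confidence(a -> l - a) = support(l) / support(a)
                confidence = support_l / supports[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, consequent, confidence))
    return rules
```

Because an itemset with m items has 2^m - 2 such subsets, this direct enumeration is only practical for fairly small frequent itemsets.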
C. Frequent Pattern growth
FP-Growth is a two-step approach that allows frequent itemsets to
be discovered without candidate itemset generation.
Step 1: Build a compact data structure called the FP-tree.
Step 2: Extract the frequent itemsets directly from the FP-tree.
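The FP-tree is constructed using two passes over the dataset.
Pass 1: scan the data, count the support of each item, discard the
infrequent items, and order the frequent items by decreasing
support.
Pass 2: read the transactions one at a time and insert their
ordered frequent items into the tree, so that transactions with a
common prefix share a path. This compresses a large database into
a compact Frequent Pattern tree (FP-tree) structure, on which
efficient FP-tree based frequent pattern mining is then performed.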
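The two-pass construction of the FP-tree can be sketched as follows. This is a minimal sketch, not a reference implementation: the names FPNode and build_fp_tree are hypothetical, the threshold min_count is assumed to be an absolute, inclusive occurrence count, and ties between equally frequent items are broken by item order, which is one of several valid choices.

```python
from collections import Counter


class FPNode:
    """One node of the FP-tree: an item, a count, and links to its children."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_count):
    """Return (root, header); header maps each frequent item to its tree nodes."""
    # Pass 1: count item occurrences and keep only the frequent items.
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in counts.items() if c >= min_count}

    root = FPNode(None)
    header = {item: [] for item in frequent}

    # Pass 2: insert each transaction with its frequent items ordered by
    # decreasing count, so that common prefixes share one path in the tree.
    for t in transactions:
        items = sorted((i for i in set(t) if i in frequent),
                       key=lambda i: (-frequent[i], i))
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, parent=node)
                node.children[item] = child
                header[item].append(child)
            child.count += 1
            node = child
    return root, header
```

The header table gives direct access to every node that carries a given item. Following the parent links from those nodes yields the prefix paths that the mining step works on.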