Parameters
criterion (selection) This parameter specifies the criterion which is used for the selection of
rules.
• confidence The confidence of a rule is defined as conf(X implies Y) = supp(X ∪ Y)/supp(X).
Be careful when reading the expression: here supp(X ∪ Y) means “support for occurrences
of transactions where X and Y both appear”, not “support for occurrences of
transactions where either X or Y appears”. Confidence ranges from 0 to 1. Confidence
is an estimate of Pr(Y | X), the probability of observing Y given X. The support
supp(X) of an itemset X is defined as the proportion of transactions in the data set
which contain the itemset.
• lift The lift of a rule is defined as lift(X implies Y) = supp(X ∪ Y)/(supp(X) × supp(Y)),
or the ratio of the observed support to that expected if X and Y were independent.
Lift can also be defined as lift(X implies Y) = conf(X implies Y)/supp(Y). Lift measures
how far X and Y are from independence. It ranges from 0 to positive infinity. Values
close to 1 imply that X and Y are independent and the rule is not interesting.
• conviction Conviction is sensitive to rule direction, i.e. conv(X implies Y) is not the same
as conv(Y implies X). Conviction is loosely inspired by the logical definition of implication
and attempts to measure the degree of implication of a rule. Conviction is
defined as conv(X implies Y) = (1 - supp(Y))/(1 - conf(X implies Y)).
• gain When this option is selected, the gain is calculated using the gain theta parameter.
• laplace When this option is selected, the Laplace value is calculated using the laplace k
parameter.
• ps When this option is selected, the ps criterion is used for rule selection.
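The definitions of support, confidence, lift and conviction given above can be checked by hand on a small data set. The following is a minimal Python sketch, not the operator's implementation: the transactions and the function names (supp, confidence, lift, conviction) are hypothetical and chosen only for illustration. Exact fractions are used so the results are easy to verify.

```python
from fractions import Fraction

# Hypothetical toy data: each transaction is a set of items.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread"},
]

def supp(itemset, transactions):
    """Proportion of transactions that contain every item of `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return Fraction(hits, len(transactions))

def confidence(x, y, transactions):
    """conf(X implies Y) = supp(X and Y together) / supp(X).

    Assumes X occurs in at least one transaction; otherwise this divides by zero.
    """
    return supp(set(x) | set(y), transactions) / supp(x, transactions)

def lift(x, y, transactions):
    """lift(X implies Y) = conf(X implies Y) / supp(Y)."""
    return confidence(x, y, transactions) / supp(y, transactions)

def conviction(x, y, transactions):
    """conv(X implies Y) = (1 - supp(Y)) / (1 - conf(X implies Y)).

    Returning infinity when the confidence is exactly 1 is a choice made for
    this sketch, to avoid dividing by zero.
    """
    conf = confidence(x, y, transactions)
    if conf == 1:
        return float("inf")
    return (1 - supp(y, transactions)) / (1 - conf)

x, y = {"bread"}, {"milk"}
print(confidence(x, y, transactions))  # 1/2
print(lift(x, y, transactions))        # 5/6
print(conviction(x, y, transactions))  # 4/5
print(conviction(y, x, transactions))  # 3/5
```

In this toy data the lift is the same in both directions, while conviction differs (4/5 for bread implies milk, 3/5 for milk implies bread), which illustrates its sensitivity to rule direction.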
min confidence (real) This parameter specifies the minimum confidence of the rules.
min criterion value (real) This parameter specifies the minimum value of the rules for the
selected criterion.
gain theta (real) This parameter specifies the parameter Theta which is used in the Gain calculation.
laplace k (real) This parameter specifies the parameter k which is used in the Laplace function
calculation.
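As an illustration of how a minimum threshold narrows down a rule set, the following Python sketch keeps only the rules whose value for the selected criterion reaches a given minimum. It is not the operator's code: the rule records, the select_rules helper and the inclusive comparison (at least the minimum) are assumptions made for this example.

```python
# Hypothetical rule records: (premise, conclusion, {criterion name: value}).
rules = [
    ("bread", "milk", {"confidence": 0.50, "lift": 0.83}),
    ("milk", "bread", {"confidence": 0.67, "lift": 0.83}),
    ("butter", "bread", {"confidence": 1.00, "lift": 1.25}),
]

def select_rules(rules, criterion, min_value):
    """Keep rules whose value for `criterion` is at least `min_value`.

    The inclusive comparison (>=) is an assumption of this sketch.
    """
    return [rule for rule in rules if rule[2][criterion] >= min_value]

# Select by confidence with a minimum of 0.6.
for premise, conclusion, values in select_rules(rules, "confidence", 0.6):
    print(premise, "implies", conclusion, values)
```

With these hypothetical values, selecting by confidence with a minimum of 0.6 keeps the second and third rules, while selecting by lift with a minimum of 1.0 keeps only the third.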
Tutorial Processes