2.2 Data Mining
Data mining is the process of extracting valuable information from large amounts of data [Hand et al. 2001]. The main difference between statistical analysis and data mining lies in the goal. Statistical analysis is typically used to test prior hypotheses or to confirm existing knowledge about a known relationship [Moss and Atre 2003], whereas data mining aims to find unexpected relationships, patterns, and interdependencies hidden in the data [Wang et al. 2002]. Unlike traditional experiments, which are designed to verify a priori hypotheses through statistical analysis, data mining uses the data itself to uncover such structure. Its advantages are that hidden relationships can be discovered, predictive rules can be generated, and interesting hypotheses can be found [Hedberg 1995; Gargano and Ragged 1999].