Many real-world data mining tasks involve continuous
attributes. Data discretization is the process of converting the values of a
continuous attribute into a finite set of intervals and associating a specific
discrete value with each interval. The only restriction on the discrete values
assigned to the intervals is that they must induce
an ordering on the discretized attribute domain. Data
discretization significantly improves the quality of discovered
knowledge and also reduces the running time of various data
mining tasks such as association rule discovery, classification,
and prediction [6]. Good discretization can lead to new and more
accurate knowledge. Poor discretization, on the other hand, causes
unnecessary loss of information or, in some cases, produces false
information with disastrous consequences. A wide
variety of discretization methods exists, ranging from naive,
unsupervised methods such as equal-width and
equal-frequency binning to supervised methods such as Minimum
Description Length (MDL) discretization and algorithms based on
Pearson's X² or Wilks' G² statistics [6, 7].
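The two unsupervised schemes mentioned above can be sketched in a few lines of Python. The function names and the sample data below are illustrative only, not taken from the cited algorithms: equal-width binning splits the value range into k intervals of equal width, while equal-frequency binning puts (roughly) the same number of observations into each interval.

```python
def equal_width_bins(values, k):
    """Split the value range into k intervals of equal width and
    return the bin index (0..k-1) for each value."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k or 1  # guard against a constant attribute
    return [min(int((v - lo) / width), k - 1) for v in values]

def equal_frequency_bins(values, k):
    """Assign values to k bins so that each bin holds roughly the
    same number of observations (rank-based assignment)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    n, bins = len(values), [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = min(rank * k // n, k - 1)
    return bins

# A skewed attribute: equal-width puts most mass in the extreme bins,
# while equal-frequency balances the bin counts.
data = [1.0, 1.2, 1.3, 1.5, 8.0, 9.5, 9.7, 10.0]
print(equal_width_bins(data, 3))       # → [0, 0, 0, 0, 2, 2, 2, 2]
print(equal_frequency_bins(data, 4))   # → [0, 0, 1, 1, 2, 2, 3, 3]
```

The contrast on skewed data is exactly why such naive methods are called unsupervised: neither scheme consults a class label, so neither can place cut points where they would best separate classes, which is what the supervised MDL- and statistics-based methods attempt.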