Description
This operator discretizes the selected numerical attributes into nominal attributes. The number of bins parameter specifies the required number of bins. Alternatively, if the use sqrt of examples parameter is set to true, the number of bins is calculated as the square root of the number of examples with non-missing values (calculated separately for each attribute). Discretization is performed by equal frequency binning, i.e. the thresholds of all bins are selected so that all bins contain (almost) the same number of numerical values. Each numerical value is assigned to the bin whose range covers it. Each range is named automatically; the naming format can be changed using the range name type parameter. Values falling in the range of a bin are replaced by the name of that range.
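The idea of equal frequency binning can be illustrated outside RapidMiner with a minimal Python sketch. It is not the operator's implementation. The function names, the rounding of the square root, the placement of thresholds halfway between neighbouring sorted values, and the "rangeN" labels are all assumptions made for illustration. Missing values are represented by None and are left unbinned.

```python
import math
from bisect import bisect_left


def equal_frequency_thresholds(values, number_of_bins):
    """Return the cut points that split the non-missing values into
    (almost) equally populated bins. Hypothetical helper."""
    present = sorted(v for v in values if v is not None)
    n = len(present)
    thresholds = []
    for i in range(1, number_of_bins):
        idx = i * n // number_of_bins  # first sorted position of bin i
        if 0 < idx < n:
            # Assumption: cut halfway between the neighbouring sorted values.
            thresholds.append((present[idx - 1] + present[idx]) / 2)
    return thresholds


def discretize_by_frequency(values, number_of_bins=2, use_sqrt_of_examples=False):
    """Replace each numerical value by the name of the range it falls in."""
    if use_sqrt_of_examples:
        present_count = sum(v is not None for v in values)
        # Assumption: the square root is rounded to the nearest integer.
        number_of_bins = max(1, round(math.sqrt(present_count)))
    thresholds = equal_frequency_thresholds(values, number_of_bins)
    labels = []
    for v in values:
        if v is None:
            labels.append(None)  # missing values stay missing
        else:
            # Count the thresholds strictly below v to find its bin.
            labels.append(f"range{bisect_left(thresholds, v) + 1}")
    return labels


# Nine non-missing values, so use_sqrt_of_examples yields three bins.
ages = [23, 45, 31, None, 52, 38, 27, 61, 49, 35]
labels = discretize_by_frequency(ages, use_sqrt_of_examples=True)
print(labels)
```

In this sketch, tied values that straddle a cut point all land in the same bin, so bins can be uneven when the attribute contains many duplicates.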
Other discretization operators are also available in RapidMiner. The Discretize By Frequency operator creates bins in such a way that the number of unique values in all bins is (almost) equal. In contrast, the Discretize By Binning operator creates bins in such a way that the range of all bins is (almost) equal.
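The contrast between the two strategies shows up clearly on skewed data. The following Python sketch is illustrative only and does not reproduce either operator. The helper names and the sample data are hypothetical, every value is treated as a separate example, and thresholds are computed in the simplest way that demonstrates the idea.

```python
from bisect import bisect_left


def equal_width_thresholds(values, number_of_bins):
    """Cut points that give every bin the same range (equal width)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / number_of_bins
    return [lo + i * width for i in range(1, number_of_bins)]


def equal_frequency_thresholds(values, number_of_bins):
    """Cut points that give every bin (almost) the same number of values."""
    ordered = sorted(values)
    n = len(ordered)
    cuts = []
    for i in range(1, number_of_bins):
        idx = i * n // number_of_bins
        cuts.append((ordered[idx - 1] + ordered[idx]) / 2)
    return cuts


def bin_counts(values, thresholds):
    """Count how many values fall into each bin."""
    counts = [0] * (len(thresholds) + 1)
    for v in values:
        counts[bisect_left(thresholds, v)] += 1
    return counts


# Hypothetical skewed attribute: one large outlier.
incomes = [1, 2, 3, 4, 5, 6, 7, 8, 100]
width_counts = bin_counts(incomes, equal_width_thresholds(incomes, 3))
freq_counts = bin_counts(incomes, equal_frequency_thresholds(incomes, 3))
print(width_counts)  # [8, 0, 1]
print(freq_counts)   # [3, 3, 3]
```

With equal-width bins the outlier stretches the ranges so that almost all values fall into the first bin, while equal-frequency bins stay evenly populated at the cost of very different range widths.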