DATA PRE-PROCESSING
Data pre-processing is one of the most important steps in
KDD. Its aim is to make the chosen dataset as ‘clean’ as
possible for the later mining step [7, 8]. In this experiment,
the pre-processing steps are data cleaning, data integration
and data transformation.
Data cleaning tasks include filling in missing values,
identifying outliers, smoothing out noisy data and correcting
inconsistent data. In this dataset, missing data occur when a
student is new, so no values have been recorded yet.
The next step is data integration: data from multiple source
files are combined into a coherent database.
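The integration step could be sketched as follows. This is a minimal illustration only, assuming the source files are CSV files that share a key column; the column name `student_id`, the sample rows and the function `integrate` are hypothetical and do not come from the experiment.

```python
import csv
import io


def integrate(sources, key="student_id"):
    """Merge rows from several sources into one record per key value."""
    merged = {}
    for rows in sources:
        for row in rows:
            # Create the record on first sight, then add this source's fields.
            record = merged.setdefault(row[key], {})
            record.update(row)
    return list(merged.values())


# Two hypothetical source files, given inline so the sketch is self-contained.
profile_csv = "student_id,name\n1,Ana\n2,Budi\n"
grades_csv = "student_id,gpa\n1,3.4\n"

profiles = csv.DictReader(io.StringIO(profile_csv))
grades = csv.DictReader(io.StringIO(grades_csv))
database = integrate([profiles, grades])
```

In this sketch, student 2 has no row in the second file, so the merged record for that student simply lacks the `gpa` field.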
After that, the data are transformed: every missing value is
replaced by a question mark ‘?’.
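The replacement of missing values by ‘?’ could look like the sketch below. It assumes that a missing value shows up either as an absent field or as an empty string; the field names, the sample records and the function `mark_missing` are hypothetical.

```python
MISSING = "?"


def mark_missing(records, fields):
    """Return copies of the records with absent or empty fields set to '?'."""
    cleaned = []
    for record in records:
        row = {}
        for field in fields:
            value = record.get(field)
            if value is None or str(value).strip() == "":
                row[field] = MISSING
            else:
                row[field] = value
        cleaned.append(row)
    return cleaned


records = [
    {"student_id": "1", "gpa": "3.4"},
    {"student_id": "2", "gpa": ""},  # new student, no value recorded
    {"student_id": "3"},             # field absent entirely
]
transformed = mark_missing(records, ["student_id", "gpa"])
```

Returning new rows rather than modifying the input keeps the original records available for comparison.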