Data Cleaning:
To check the data in the data set so that it is complete and correct
Types of problems for cleaning
Missing Value
Noisy data
Data Cleaning: Missing Value
A missing value is a value that is absent from a record in the data set
Resolving missing values:
(1) ignore the tuple / record
(2) fill in manually (by an expert, or with a constant)
(3) fill in automatically (with a statistical value)
>> attribute mean / attribute mode
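Example: automatic filling, option (3) above, sketched in Python. This is only a minimal illustration, not a prescribed implementation. The function name fill_missing, the use of None to mark a missing value, and the sample data are all assumptions made for this example.

```python
from statistics import mean, mode

def fill_missing(values, strategy="mean"):
    """Replace None entries with the attribute mean or mode of the known values.

    Assumption: a missing value is represented by None.
    """
    known = [v for v in values if v is not None]
    if not known:
        # Nothing to compute a statistic from; return the data unchanged.
        return list(values)
    fill = mean(known) if strategy == "mean" else mode(known)
    return [fill if v is None else v for v in values]

# Hypothetical sample attributes
ages = [20, None, 30, 40, None]
colors = ["red", "blue", None, "red"]

filled_ages = fill_missing(ages, "mean")      # numeric attribute: use the mean
filled_colors = fill_missing(colors, "mode")  # nominal attribute: use the mode
```

The mean suits numeric attributes; the mode suits nominal (categorical) attributes, where a mean is not defined.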
Data Cleaning: Noisy data
Noisy data is incorrect or unwanted data in the data set (errors / outliers)
Resolving noisy data: data smoothing techniques
>> Binning technique (smoothing by bin means / by bin boundaries)
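Example: the binning technique, sketched in Python. The notes do not say how the bins are formed, so this sketch assumes equal-frequency bins over the sorted values. The function names and the sample prices are hypothetical.

```python
def equal_frequency_bins(values, bin_size):
    """Sort the values and split them into consecutive bins of bin_size items."""
    ordered = sorted(values)
    return [ordered[i:i + bin_size] for i in range(0, len(ordered), bin_size)]

def smooth_by_means(bins):
    """Replace every value in a bin with the mean of that bin."""
    return [[sum(b) / len(b)] * len(b) for b in bins]

def smooth_by_boundaries(bins):
    """Replace every value with the closer of its bin's minimum or maximum."""
    smoothed = []
    for b in bins:
        low, high = b[0], b[-1]  # bins are sorted, so these are the boundaries
        smoothed.append([low if v - low <= high - v else high for v in b])
    return smoothed

# Hypothetical sample data
prices = [4, 8, 15, 21, 21, 24, 25, 28, 34]
bins = equal_frequency_bins(prices, 3)
by_means = smooth_by_means(bins)
by_boundaries = smooth_by_boundaries(bins)
```

With these prices the bins are (4, 8, 15), (21, 21, 24), (25, 28, 34). Smoothing by means gives 9, 22 and 29 for the three bins; smoothing by boundaries gives (4, 4, 15), (21, 21, 24), (25, 25, 34).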
Data Transformation:
To convert the data in the data set into an appropriate format
Types of data transformation
Generalization
Normalization
Discretization
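Example: the three transformation types, sketched in Python. The notes only name the types, so the concrete choices here are assumptions: generalization is shown as a lookup in a concept hierarchy (city to country), normalization as min-max scaling to [0, 1], and discretization as mapping a number to a labelled interval. All names and sample values are hypothetical.

```python
def generalize(value, hierarchy):
    """Replace a low-level value with its higher-level concept, if one is known."""
    return hierarchy.get(value, value)

def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Rescale values linearly so the smallest maps to new_min, the largest to new_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are equal; avoid dividing by zero.
        return [new_min for _ in values]
    return [new_min + (v - lo) * (new_max - new_min) / (hi - lo) for v in values]

def discretize(value, cut_points, labels):
    """Map a numeric value to the label of the interval it falls in.

    Expects one more label than cut points; cut_points must be ascending.
    """
    for cut, label in zip(cut_points, labels):
        if value < cut:
            return label
    return labels[-1]

# Hypothetical sample data
city_to_country = {"Bangkok": "Thailand", "Chiang Mai": "Thailand", "Tokyo": "Japan"}
country = generalize("Chiang Mai", city_to_country)
scaled = min_max_normalize([10, 20, 30])
age_groups = [discretize(a, [18, 60], ["young", "adult", "senior"]) for a in (15, 35, 70)]
```

Here "Chiang Mai" is generalized to "Thailand", the values 10, 20, 30 are normalized to 0.0, 0.5, 1.0, and the ages 15, 35, 70 are discretized to young, adult, senior.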