Tasks in data preprocessing
Data cleaning: fill in missing values, smooth noisy data, identify or remove outliers, and resolve inconsistencies.
Data integration: combining data from multiple databases, data cubes, or files.
Data transformation: normalization and aggregation.
Data reduction: obtaining a representation that is smaller in volume yet produces the same or similar analytical results.
Data discretization: a form of data reduction that replaces numerical attribute values with nominal ones, such as interval labels.
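
Three of the tasks above (cleaning, transformation, discretization) can be sketched in a few lines of plain Python. This is a minimal illustration, not a prescribed method: the function names, the `ages` sample, the choice of median imputation, min-max scaling, and equal-width binning are all assumptions made for the example.

```python
from statistics import median

def fill_missing(values):
    """Data cleaning: replace missing entries (None) with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]

def min_max_normalize(values):
    """Data transformation: rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def discretize(values, labels=("low", "medium", "high")):
    """Data discretization: equal-width binning of numbers into nominal labels."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / len(labels)
    result = []
    for v in values:
        if width == 0:
            index = 0
        else:
            # Clamp so the maximum value falls into the last bin.
            index = min(int((v - lo) / width), len(labels) - 1)
        result.append(labels[index])
    return result

# Hypothetical attribute with one missing value.
ages = [20, None, 30, 40, 60]

cleaned = fill_missing(ages)             # missing value replaced by the median
normalized = min_max_normalize(cleaned)  # numeric values rescaled to [0, 1]
binned = discretize(cleaned)             # numeric values replaced by nominal labels

print(cleaned)
print(normalized)
print(binned)
```

Other choices are equally valid for each step, for example mean imputation, z-score normalization, or equal-frequency binning; which one fits depends on the data and the analysis that follows.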