This report follows the CRISP-DM methodology in applying classification
models to the problem of identifying individuals whose salary exceeds a specified threshold
based on demographic information such as age, level of education and current
employment type. The processes involved in the exploration, preparation, modelling and
evaluation of the datasets are described. Topics such as the application of statistical
analysis to suggest attribute usefulness, feature reduction, outlier detection, missing value
management, data bias and data transformation are discussed. The process used to
compare the relative performance of the proposed classifiers is reviewed. The report
also reviews how the predictive capabilities of the proposed models support a business
objective of targeting customers, including the use of lift analysis to indicate the
likely level of return on investment and overall profitability.
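
As a brief illustration of the lift analysis mentioned above, the following Python sketch computes cumulative lift: the rate of positive cases among the highest-scored fraction of customers, divided by the rate across all customers. The function name `cumulative_lift` and the toy scores and labels are assumptions made purely for illustration; they are not taken from the report's datasets or models.

```python
def cumulative_lift(scores, labels, fraction):
    """Return the lift achieved by targeting the top `fraction` of customers.

    scores   -- model scores, higher meaning more likely to be positive
    labels   -- true outcomes, 1 for positive and 0 for negative
    fraction -- share of customers targeted, in the range (0, 1]
    """
    if len(scores) != len(labels) or not scores:
        raise ValueError("scores and labels must be non-empty and equal in length")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in the range (0, 1]")

    base_rate = sum(labels) / len(labels)
    if base_rate == 0:
        raise ValueError("lift is undefined when there are no positive cases")

    # Rank customers from highest to lowest score and keep the top slice.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, round(len(ranked) * fraction))
    top_rate = sum(label for _, label in ranked[:n_top]) / n_top

    return top_rate / base_rate


# Hypothetical scores and outcomes for ten customers (illustrative only).
scores = [0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10, 0.05]
labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

for fraction in (0.2, 0.5, 1.0):
    print(f"top {fraction:.0%}: lift = {cumulative_lift(scores, labels, fraction):.2f}")
```

A lift above 1 for a given fraction suggests that targeting that share of customers by model score would reach proportionally more positive cases than selecting customers at random; targeting everyone always gives a lift of 1.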