In this study, we use a customer dataset drawn from the data warehouse of a Chinese commercial bank.
Specifically, the dataset includes 50,000 customer records covering January 2011 to June 2012. We take the data from
January 2011 to December 2011 as the training data, and the data from January 2012 to June 2012 as the test data.
After preprocessing the dataset, including filling in missing values and removing outliers, 46,406 valid records
remain. A customer who cancels his or her account during the observation period is defined as a churner. This
yields 421 churners (0.91%) and 45,985 non-churners (99.09%). The ratio of non-churners to churners is 109.23,
so the customer dataset is seriously imbalanced.
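The class proportions and the imbalance ratio quoted above follow directly from the two class counts. A minimal sketch of that arithmetic is given below; the counts are taken from the text, while the variable names are illustrative assumptions and not part of the original study.

```python
# Class counts as reported in the text; variable names are illustrative.
churners = 421
non_churners = 45_985

# Valid records remaining after preprocessing.
total = churners + non_churners

# Share of churners among all valid records.
churn_rate = churners / total

# Number of non-churners for every churner.
imbalance_ratio = non_churners / churners

print(f"valid records: {total}")
print(f"churners: {churn_rate:.2%}")
print(f"non-churners: {1 - churn_rate:.2%}")
print(f"non-churners per churner: {imbalance_ratio:.2f}")
```

With fewer than one churner per hundred customers, a classifier that labels every customer as a non-churner would already reach about 99% accuracy, which is why accuracy alone is a weak measure on data this imbalanced.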
We take the basic attribute indicators and the business indicators of customers as input variables. The basic
attribute indicators include age, sex, education, income, occupation, service stars, the asset-to-liabilities
ratio, etc. The business indicators include deposit accounts, deposit balance, the number of deposits,
the amount of consumption, and so on.
