3.4. Data mining implementation process
Late-paying customers comprise 7% of all customers. If a decision tree is used directly for analysis, prediction analysis will be difficult, due to the small percentage of late-paying customers.
And, in fact, if a decision tree is used directly, the result has one single node, the root node, and the prediction accuracy for normal customers is 93%.
This accuracy is high, but it fails to meet the model prediction requirements since the model prediction goal is
to find customers who may make delayed payments.
As shown in Fig. 2, clustering and decision trees are used to perform data mining in the second phase to analyze user behavior and find target customers.