With reference to the distribution of the three function fields describing user call behavior in the representative cluster (TOTALAMOUNT, MAXAMOUNT and AVGAMOUNT), the call amounts in cluster ‘‘[6]6’’ were compared with those of all users using two function fields. Users whose telecom fee was lower than NT$200 each month accounted for 97% of the cluster's users, and users whose maximum and average amounts per call were lower than NT$10 comprised 80%.
This percentage was higher than that for all users. When the call amount was higher, the percentage of users in the cluster was lower than the percentage of the total users.
Thus, it can be deduced that the users in cluster ‘‘[6]6’’ were normal users, and this cluster can therefore be excluded from the analysis of late payment behavior.
After customers with the characteristics of cluster ‘‘[6]6’’ were eliminated, the percentage of late-paying customers increased.
In fact, it then comprised 15% of all users.
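The arithmetic behind this increase can be illustrated with a short sketch: removing a cluster that contains few or no late payers shrinks the denominator while leaving the number of late payers almost unchanged. The counts below are hypothetical, chosen only so that the resulting share lands at 15%; they are not the study's data, and the function names are illustrative.

```python
def late_share(n_late: int, n_total: int) -> float:
    """Fraction of users who paid late."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return n_late / n_total


def late_share_after_exclusion(n_late: int, n_total: int,
                               cluster_late: int, cluster_total: int) -> float:
    """Late-payer share once one cluster is removed from the population."""
    return late_share(n_late - cluster_late, n_total - cluster_total)


# Hypothetical counts (assumptions for illustration only).
n_total, n_late = 10_000, 900
cluster_total, cluster_late = 4_000, 0  # a cluster consisting of normal users

before = late_share(n_late, n_total)
after = late_share_after_exclusion(n_late, n_total, cluster_late, cluster_total)
print(f"before: {before:.1%}, after: {after:.1%}")
```

With these assumed counts the late-payer share rises from 9% to 15%, which is the kind of concentration effect that makes the remaining data more useful for the prediction step.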
For users in cluster ‘‘[4]3’’, the percentage of users whose bill was lower than NT$200 each month, and the percentage of users whose maximum and average amounts per call were lower than NT$10, were both lower than the percentage for all users.
When the telephone fee was high, the percentage of users in the cluster was higher than the percentage for all users. The users in cluster ‘‘[4]3’’ displayed behavior opposite to that of users in cluster ‘‘[6]6’’, so the users in cluster ‘‘[4]3’’ were retained for further analysis.
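The cluster-versus-population comparison described above can be sketched as follows. This is a minimal illustration, not the procedure used in the study: the helper names and the sample values are hypothetical, and only the NT$200 threshold on the monthly total (the TOTALAMOUNT field) is taken from the text.

```python
from typing import Dict, List


def share_below(values: List[float], threshold: float) -> float:
    """Fraction of values strictly below the threshold."""
    if not values:
        raise ValueError("values must be non-empty")
    return sum(v < threshold for v in values) / len(values)


def compare_cluster(all_values: List[float], cluster_values: List[float],
                    threshold: float) -> Dict[str, object]:
    """Compare a cluster's low-amount share with that of all users."""
    all_share = share_below(all_values, threshold)
    cluster_share = share_below(cluster_values, threshold)
    return {
        "all": all_share,
        "cluster": cluster_share,
        # True when low amounts are over-represented in the cluster
        "low_heavy": cluster_share > all_share,
    }


# Hypothetical monthly totals in NT$ (assumptions for illustration only).
all_users = [50, 120, 180, 90, 450, 800, 150, 1200, 60, 300]
low_cluster = [50, 120, 90, 60, 150]      # behaves like cluster [6]6
high_cluster = [450, 800, 1200, 300, 180]  # behaves like cluster [4]3

low_result = compare_cluster(all_users, low_cluster, threshold=200)
high_result = compare_cluster(all_users, high_cluster, threshold=200)
print(low_result)
print(high_result)
```

Under this sketch, a cluster whose low-amount share exceeds that of all users would be treated as a candidate for exclusion, while a cluster showing the opposite pattern would be retained for further analysis.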
In the second phase, a decision tree was directly used to predict and analyze the rest of the data. In this study, ten useful rules were derived. Two rules are described as follows: