him as reliable. This kind of error is more dangerous and more
costly than the previous one, so the model with the lowest Type I
error rate is considered the best model. From Table V we
observe that models M3 and M4 have the lowest Type I error
rate, followed by models M5, M6 and M2; on the other hand,
models M1 and M7 have the highest Type I error rate. It
seems that these two models have greater difficulty
in predicting non-reliable clients than reliable ones.
The previous misclassification rates are obtained with a
cut-off of 0.5; however, changing this threshold might modify
these results and allow the decision maker to catch a greater
number of good or bad applicants. Hand [24] proposed the use of
graphical tools as evaluation criteria in place of scalar
criteria. In this paper we use the ROC (receiver operating
characteristic) curve to evaluate our seven models. The ROC
curve shows how the errors change as the threshold varies: it
plots the rate of correctly classified positive instances
against the rate of misclassified negative instances, which
allows finding a compromise between specificity and sensitivity.
Figure 2 shows the ROC curves of our models. The X axis
represents each model's 1 − specificity (i.e. its Type II
error rate) and the Y axis represents its sensitivity (i.e.
1 − Type I error rate). According to Liu and Schumann [25], a
model whose ROC curve follows the 45° line would be useless: it
would classify the same proportion of non-worthy and worthy
applicants into the non-worthy class
at each value of the threshold. Figure 2 shows that the curves
of the seven models are convex and lie above the first bisector,
which leads us to conclude that our models are valid and not
useless. In Figure 2 we also observe that the curves of models
M3 and M4 lie considerably higher than those of the other
models, which confirms our intuition about their performance.
Models M1, M7 and M2 have the lowest AUC (area under the
curve); from the point of view of the balance between false
positives and false negatives, these models perform poorly.
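
The comparison by AUC can be illustrated with a short sketch that integrates a ROC curve with the trapezoidal rule. The function name `auc` and the two example curves are hypothetical and chosen only to show that a curve following the 45° line has an area of 0.5, while a curve lying above it has a larger area; they are not the curves of models M1 to M7.

```python
# Hypothetical helper: area under a ROC curve by the trapezoidal rule.
def auc(points):
    """points: iterable of (1 - specificity, sensitivity) pairs."""
    pts = sorted(points)  # order by X, then by Y for vertical segments
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Illustrative curves only (not taken from Figure 2).
diagonal = [(0.0, 0.0), (0.5, 0.5), (1.0, 1.0)]    # follows the 45-degree line
better = [(0.0, 0.0), (0.25, 0.75), (1.0, 1.0)]    # lies above the diagonal

print(auc(diagonal))  # 0.5
print(auc(better))    # 0.75
```

Under this reading, a model whose AUC is close to 0.5 is hardly better than the useless diagonal classifier, which is why a lower AUC is taken as a sign of weaker performance.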