This paper presents an analysis of a real-world dataset with respect to social discrimination. In particular, we studied discrimination-aware classification by testing on an actual dataset from Statistics Netherlands, which maintains demographic, economic, and crime information of all Dutch citizens. Our results show that using a standard discrimination-ignorant classifier exacerbates the discrimination problem by increasing the difference in the probability of being a crime suspect between people from minority groups and people from non-minority groups. Furthermore, people from the minority groups are more likely to be incorrectly classified as crime suspects when such methods are used. These results highlight the importance of discrimination-aware classifiers in practice. Among the three discrimination-aware techniques evaluated, we find that modifying the decision threshold of a Naive Bayes classifier produces good discrimination control, and that data preprocessing methods (massaging and/or reweighing) reduce discrimination for both the Naive Bayes classifier and Decision Trees. For Decision Trees, however, the reduction in discrimination is much smaller, and we therefore do not advocate the preprocessing methods with Decision Trees for discrimination-aware decision making. Possible explanations are that the preprocessing methods use a different classifier (Naive Bayes) to rank the objects, or that Decision Trees already result in less discrimination. Investigating these explanations is an interesting direction for future work.
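To make the reweighing idea concrete, the following is a minimal sketch of one common formulation, in which each (group, label) combination receives the weight expected count under independence divided by observed count. This section does not spell out the exact procedure used, so the formula, the function name `reweighing_weights`, and the toy data are illustrative assumptions rather than the implementation evaluated here.

```python
from collections import Counter

def reweighing_weights(groups, labels):
    """Return a weight for every observed (group, label) pair.

    Assumed formulation (not taken from this section):
        weight(g, c) = P(g) * P(c) / P(g, c)
    so that group membership and class label become independent
    in the weighted data.
    """
    n = len(labels)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    joint_counts = Counter(zip(groups, labels))
    return {
        (g, c): (group_counts[g] * label_counts[c]) / (n * joint_counts[(g, c)])
        for (g, c) in joint_counts
    }

# Hypothetical toy data: 1 = labelled as suspect, 0 = not.
groups = ["minority"] * 4 + ["non-minority"] * 4
labels = [1, 1, 1, 0, 1, 0, 0, 0]

weights = reweighing_weights(groups, labels)
# Over-represented pairs such as ("minority", 1) get a weight below 1,
# under-represented pairs such as ("minority", 0) get a weight above 1.
```

The resulting weights could then be passed as per-instance weights to any learner that supports them, which is what makes this a preprocessing step rather than a change to the classifier itself.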
For the Naive Bayes classifier, there is a large reduction in discrimination and we therefore recommend using this classifier on this data.
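The two quantities discussed above, the difference in the probability of being classified as a suspect and the difference in incorrect suspect classifications between groups, can be sketched as follows. The function names, group labels, and toy data are hypothetical and chosen only for illustration; they are not drawn from the Statistics Netherlands dataset.

```python
def positive_rate(predictions, groups, group):
    """Fraction of people in `group` who are predicted as suspects (1)."""
    selected = [p for p, g in zip(predictions, groups) if g == group]
    if not selected:
        raise ValueError(f"no instances for group {group!r}")
    return sum(selected) / len(selected)

def false_positive_rate(predictions, truth, groups, group):
    """Fraction of true non-suspects in `group` who are predicted as suspects."""
    selected = [p for p, t, g in zip(predictions, truth, groups)
                if g == group and t == 0]
    if not selected:
        raise ValueError(f"no true negatives for group {group!r}")
    return sum(selected) / len(selected)

def discrimination_score(predictions, groups,
                         minority="minority", majority="non-minority"):
    """Difference in predicted-suspect probability between the two groups."""
    return (positive_rate(predictions, groups, minority)
            - positive_rate(predictions, groups, majority))

# Hypothetical toy data: 1 = suspect, 0 = not a suspect.
groups = ["minority"] * 4 + ["non-minority"] * 4
truth = [1, 0, 0, 1, 0, 0, 0, 1]
predictions = [1, 1, 0, 1, 0, 0, 0, 1]

score = discrimination_score(predictions, groups)
fpr_gap = (false_positive_rate(predictions, truth, groups, "minority")
           - false_positive_rate(predictions, truth, groups, "non-minority"))
```

A score of zero would mean both groups are classified as suspects at the same rate; a positive value means the minority group is classified as suspect more often. The false-positive gap measures the second effect reported above, namely that errors fall more heavily on the minority group.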