USING DATA MINING TO PREDICT
Although the educational level of the Portuguese population
has improved in the last decades, the statistics
keep Portugal at Europe’s tail end due to its high student
failure rates. In particular, lack of success in the
core classes of Mathematics and the Portuguese language
is extremely serious. On the other hand, the
fields of Business Intelligence (BI)/Data Mining (DM),
which aim at extracting high-level knowledge from raw
data, offer interesting automated tools that can aid the
education domain. The present work aims to model
student achievement in secondary education using
BI/DM techniques. Recent real-world data (e.g.
student grades, demographic, social and school related
features) was collected from school reports and questionnaires.
The two core classes (i.e. Mathematics and
Portuguese) were modeled under binary/five-level classification
and regression tasks. Also, four DM models
(i.e. Decision Trees, Random Forest, Neural Networks
and Support Vector Machines) and three input
selections (e.g. with and without previous grades) were
tested. The results show that a good predictive accuracy
can be achieved, provided that the first and/or second
school period grades are available. Although student
achievement is highly influenced by past evaluations, an
explanatory analysis has shown that there are also other
relevant features (e.g. number of absences, parents’ job
and education, alcohol consumption). As a direct outcome
of this research, more efficient student prediction
tools can be developed, improving the quality of education
and enhancing school resource management.
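
The task setup described above (binary and five-level classification of a grade, plus three input selections with and without previous grades) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the column names (g1, g2, g3), the 0-20 grade scale, the pass mark of 10, the five-level cut points and the exact composition of the three input selections are all assumptions that the text above does not specify. For the regression task, the numeric grade would simply be used as the target unchanged.

```python
from typing import Dict, List, Sequence

# Hypothetical column names (assumption): g1/g2 are the first and second
# school period grades, g3 is the final grade used as the target.
PERIOD_GRADES = ["g1", "g2"]


def input_selections(columns: List[str], target: str = "g3") -> Dict[str, List[str]]:
    """Build three input setups (assumed composition): all features,
    without the second period grade, and without any previous grade."""
    features = [c for c in columns if c != target]
    return {
        "A_all": features,
        "B_no_second_period": [c for c in features if c != "g2"],
        "C_no_previous_grades": [c for c in features if c not in PERIOD_GRADES],
    }


def binary_label(grade: int, pass_mark: int = 10) -> str:
    """Binary task: pass/fail. The pass mark of 10 is an assumption."""
    return "pass" if grade >= pass_mark else "fail"


def five_level_label(grade: int, cuts: Sequence[int] = (10, 12, 14, 16)) -> str:
    """Five-level task: map a grade to levels V (lowest) to I (highest).
    The cut points are assumed for illustration."""
    levels = ["V", "IV", "III", "II", "I"]
    # Count how many cut points the grade reaches; that count is the level index.
    return levels[sum(grade >= c for c in cuts)]


if __name__ == "__main__":
    cols = ["absences", "g1", "g2", "g3"]
    for name, feats in input_selections(cols).items():
        print(name, feats)
    print(binary_label(9), five_level_label(13))
```

Comparing a model's accuracy across the three setups would show how much predictive power comes from the previous period grades, as opposed to the remaining features.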