City: |
Eindhoven, North Brabant, Netherlands |
Organization: |
Eindhoven University of Technology |
Project Start Date: |
September 2000 |
Project End Date: |
January 2009 |
Reference: |
Dekker, G. W., Pechenizkiy, M., & Vleeshouwers, J. M. (2009). Predicting Students
Drop Out: A Case Study. Proceedings of the 2nd International Conference on
Educational Data Mining (EDM 2009), 41-50. |
Problem: |
The Electrical Engineering (EE) Department of Eindhoven University of Technology
had a dropout rate of 40% among freshmen. To lower this number, the EE department
wanted to identify successful and unsuccessful students at an early stage. A wide
range of factors relates directly to a student’s academic success, and helping
teachers, education personnel, and management understand these factors would help
them support students and reduce dropout. The existing solution was to have the
student counselor advise students on whether to continue their degree, depending
on their grades. Although this yielded decent results, it was deemed
unsatisfactory because of its subjective nature. Data mining was then proposed as
a way to find a more robust and objective process. |
Technical Solution: |
· OneR classifier
· Two decision tree algorithms compared: CART (SimpleCart) and C4.5 (J48)
· Bayesian classifier (BayesNet)
· Logistic model (SimpleLogistic)
· Rule-based learner (JRip)
· Random forest (RandomForest)
· Cost matrix (to increase accuracy of results) |
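To make the simplest entry in the list above concrete, here is a minimal sketch of the OneR idea in plain Python: for each attribute, build a one-level rule that maps every attribute value to its majority class, then keep the attribute whose rule makes the fewest training errors. This is an illustration only, not the implementation used in the study. The function names, the two toy attributes, and the toy rows are all hypothetical.

```python
from collections import Counter, defaultdict


def train_one_r(rows, labels):
    """Return (attribute_index, rule, default_label) for a OneR-style model."""
    # Fallback label for attribute values never seen during training.
    default_label = Counter(labels).most_common(1)[0][0]
    best = None
    for attr in range(len(rows[0])):
        # Count class labels per value of this attribute.
        counts = defaultdict(Counter)
        for row, label in zip(rows, labels):
            counts[row[attr]][label] += 1
        # One-level rule: each value predicts its majority class.
        rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        errors = sum(rule[row[attr]] != label for row, label in zip(rows, labels))
        if best is None or errors < best[0]:
            best = (errors, attr, rule)
    _, attr, rule = best
    return attr, rule, default_label


def predict_one_r(model, row):
    attr, rule, default_label = model
    return rule.get(row[attr], default_label)


# Hypothetical toy data: (first-exam result, attendance level).
rows = [("pass", "high"), ("pass", "low"), ("fail", "high"), ("fail", "low")]
labels = ["successful", "successful", "unsuccessful", "unsuccessful"]

model = train_one_r(rows, labels)
print(model[0], predict_one_r(model, ("fail", "high")))
```

A single-attribute rule like this is mainly useful as a baseline: if more elaborate models barely beat it, most of the predictive signal sits in one attribute.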
Datasets Used: |
|
Outcome: |
Before the data modelling, the solution was to give every enrolled student study
advice in December, based on the student’s grades and other results. The
department’s student counselor examined the academic data and advised each
student on whether to continue the program. According to the department, this
advice was generally accurate, but no clear accuracy figure was released.
The models provided useful predictions of successful and unsuccessful students,
with accuracies between 75% and 80%. Using a cost matrix helped deal with
misclassification but did not bring significant improvement. It did, however,
reveal the main classification issue: LinAlgAB held a zero wherever no value had
been entered, even though those students had not necessarily failed. |
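As a rough illustration of how a cost matrix can be used, the sketch below scores a set of predictions by total misclassification cost and picks the label with the lowest expected cost for a given predicted probability. It is not the study's setup: the cost values, the function names, and the probabilities are assumptions chosen only for the example.

```python
# Hypothetical costs, keyed by (actual, predicted). Here, wrongly telling a
# successful student they are unsuccessful is assumed to be 3x as costly
# as the opposite mistake.
COST = {
    ("successful", "successful"): 0,
    ("successful", "unsuccessful"): 3,
    ("unsuccessful", "successful"): 1,
    ("unsuccessful", "unsuccessful"): 0,
}


def total_cost(actual, predicted, cost=COST):
    """Sum the cost of every (actual, predicted) pair."""
    return sum(cost[(a, p)] for a, p in zip(actual, predicted))


def min_cost_label(prob_successful, cost=COST):
    """Return the prediction with the lowest expected cost."""
    probs = {
        "successful": prob_successful,
        "unsuccessful": 1.0 - prob_successful,
    }
    expected = {
        pred: sum(probs[actual] * cost[(actual, pred)] for actual in probs)
        for pred in probs
    }
    return min(expected, key=expected.get)


actual = ["successful", "unsuccessful", "unsuccessful"]
predicted = ["unsuccessful", "successful", "unsuccessful"]
print(total_cost(actual, predicted))
print(min_cost_label(0.4), min_cost_label(0.2))
```

With these assumed costs, the decision threshold moves from a probability of 0.5 down to 0.25, so a student is only labelled unsuccessful when the model is fairly sure. Shifting the threshold changes which errors are made, not how well the model separates the classes, which fits the report that the cost matrix brought no significant improvement.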
Issues that arose: |
- During the experiments, it was found that there was not much room for enhancement
- Almost all misclassified students did not have a database entry for LinAlgAB
(no entry is automatically mapped to zero)
- A negative classification can only be given after 3 years, and there is no
guarantee that a student who does not get his/her diploma after 3 years will be
unsuccessful
- Mistakes due to the classification measure |
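The LinAlgAB issue above comes from collapsing "no entry" and "a grade of zero" into the same value. One common way to avoid this, sketched below under assumptions, is to store an explicit indicator next to the grade so a model can tell the two cases apart. The raw record format (a string or `None`) and the function name are hypothetical and do not come from the study's database.

```python
def encode_grade(raw):
    """Return (grade, has_entry) instead of mapping a missing entry to 0 alone."""
    if raw is None or raw == "":
        # No database entry: keep a placeholder grade but flag it as missing.
        return (0.0, False)
    return (float(raw), True)


# Hypothetical raw LinAlgAB values for three students.
raw_values = [None, "0", "7.5"]
encoded = [encode_grade(v) for v in raw_values]
print(encoded)
```

The first two students both get a grade of 0.0, but only the second one has `has_entry` set to `True`, so a genuine zero stays distinguishable from a missing record.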
Status: |
Terminated |
Entered by: |
October 30, 2020: Danny Zhao, 1001533655, shibo.zhao@mail.utoronto.ca |
CEM1002, Civil Engineering, University of Toronto
Contact: msf@eil.utoronto.ca