Predicting at-risk students in a higher educational institution in Ghana for early intervention using machine learning
Date
2023
Authors
Tahiru, Fati
Abstract
Learning analytics (LA) uses data and evidence to suggest a better learning approach that suits a
particular student. This data and evidence are gathered from students’ online engagement with
systems such as Blackboard, Moodle, Sakai, eLibrary platforms, and other e-learning platforms.
LA continues to gain much attention as digitization of the learning environment is advancing. It
allows educators to analyze and interpret data correctly, setting in motion strategies that offer
points of leverage and performance for and among students. The use of predictive systems and
Early Warning Systems (EWS) in education has addressed the issue of student dropouts and
suggested interventions for improving students’ performance. High dropout rates in education
continue to be a global challenge; however, EWS provide a solution to curb the menace in
education in various developed nations, such as the United States, Australia, and the United
Kingdom. Developing countries face similar problems of dropouts in the educational sector, but
not much research has been undertaken in LA to address the interventions needed to remedy the
situation. Some studies have designed models predicting student failure and success, student
attrition, student performance and final grades. Most of these studies have focused only on
virtual learning environment (VLE) datasets. Nonetheless, this study uses student “activity
logs”, “student courses”, “demographics”, and “student assessments” to design a predictive
model to identify at-risk students (ARS) who may not graduate. The purpose of this study is to use
LA and Machine Learning (ML) to analyse the characteristics and behaviours of students in
order to identify those who may need support to improve their academic performance. The study
adopted the systematic literature review (SLR) approach to determine which emerging ML
tools/techniques have been applied successfully in designing predictive systems in education.
The SLR enabled the study to identify ML methods and the features that have been used in the
domain of predictive systems in education. The study used an integrated 5-step LA process and
ML workflow to predict which students are likely to drop out. Using the OULAD dataset, the
findings indicated that non-graduated students tended not to revise the learning materials
early, ahead of the final exams. Although it was noted that both graduated and non-graduated
students access the learning materials simultaneously, variations were recorded in the habits of
assignment submission and revision patterns. Graduated students recorded higher clicks for
accessing VLE activities than non-graduated students, which signifies that the graduated students
interacted more with course activities than non-graduated students. The study also compared
different ML algorithms and determined the method that achieved the best predictive accuracy
that could be adapted in higher educational institutions. The evaluation of the models concluded
that the ensemble machine-learning methods outperformed the traditional methods. The Random
Forest ensemble learning algorithm outperformed GB, CatBoost, KNN, LG and NB on
accuracy, precision, recall and F1 score. The study identified important features such as “date-of-assignment-submission”, “sum_clicks-of-activities”, “score on the assessment”, “date-of-registration”, “studied-credits”, and “date-the-student-unregistered” for predicting student dropout in higher educational institutions (HEIs). The model
was trained with the important features to predict ARS and achieved an accuracy of 92% in less
time than using all the features. The research indicated that implementing LA and ML techniques
can effectively identify students at risk of withdrawing from higher education. In view of this,
the study concluded that targeted interventions can be developed to mitigate the risk of students
dropping out of school through improved learning outcomes.
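
The abstract compares models on accuracy, precision, recall and F1 score. As a minimal sketch of how those four metrics relate for a binary at-risk classifier, the Python below computes them from true and predicted labels. The function name `evaluate`, the `Metrics` container, and the label values are hypothetical and purely illustrative; they are not taken from the thesis or from the OULAD dataset.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def evaluate(y_true, y_pred, positive=1):
    """Compute binary classification metrics for the `positive` class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one label is required")

    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts with "at-risk" treated as the positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return Metrics(accuracy, precision, recall, f1)


# Hypothetical labels (assumption): 1 = at-risk (did not graduate), 0 = graduated.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
m = evaluate(y_true, y_pred)
print(m)
```

In an early-warning setting, recall on the at-risk class is often the figure to watch, since a missed at-risk student receives no intervention, whereas a false alarm only costs an unnecessary check-in.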
Description
A thesis submitted in fulfilment of the requirement for the degree of Doctor of Philosophy (PhD) in Information Technology (IT) at Durban University of Technology, Durban, South Africa, 2023.
Keywords
Machine learning, Higher education institutions, Students at risk
DOI
https://doi.org/10.51415/10321/5293