Early prediction of students at risk in a virtual learning environment using ensemble machine learning techniques

dc.contributor.advisor: Singh, Alveen
dc.contributor.author: Soobramoney, Ranjin [en_US]
dc.date.accessioned: 2022-06-15T12:32:28Z
dc.date.available: 2022-06-15T12:32:28Z
dc.date.issued: 2021-12-13
dc.description: Submitted in fulfillment of the requirements for the Degree of Masters of Information and Communication Technology, Durban University of Technology, Durban, South Africa, 2021. [en_US]
dc.description.abstract: Students at risk (SAR) are students considered to have a higher probability of failing academically or dropping out of an academic programme. The literature reveals that SAR is a global problem at Higher Education Institutions (HEIs). A high failure rate not only harms the reputation of an HEI but, if left unchecked, can be detrimental to the institution itself. The problem of identifying SAR is pervasive and persistent. However, early identification of SAR allows for timely and focused interventions, thereby reducing the problem. HEIs have used various techniques to identify SAR, one of which is the traditional statistical approach. A key challenge with this technique, however, is that it often requires a large amount of manual analysis of the data to predict SAR, which in turn makes early prediction of SAR more computationally challenging. To overcome some of the challenges of the traditional statistical approach, machine learning-based techniques have been proffered to predict SAR. Since machine learning (ML) models are based on the input data rather than the underlying problem, they are expected to have better predictive capabilities than traditional statistical models. Several ML-based techniques have been applied to predict SAR with varying degrees of success. This study proposes the use of ensemble ML techniques for early and accurate prediction of SAR using students' demographic and weekly online Virtual Learning Environment (VLE) data. Aggregating the predictions of a group of ML classifiers is expected to provide better generalization performance than any of the individual classifiers on its own. The use of ensemble ML techniques in this study is therefore expected to provide an improved solution to the problem of predicting SAR. To this end, this study focused on training forty different ML predictive models, one for each week of the semester, using twenty-five different ML classifiers.
Each model was trained using students' demographic data combined with data from their weekly interactions with a VLE. Based on the training results, four classifiers, namely AdaBoostClassifier, LGBMClassifier, RandomForestClassifier, and XGBClassifier, were selected as base learners for the ensemble classifier. Hyperparameter optimization was performed using Random Search on each of the four classifiers. These classifiers were then used to create a voting classifier ensemble for each of the forty weeks, with 10-fold cross-validation used to evaluate the predictive models. The results show that the voting classifier ensemble outperformed the individual classifiers overall across the forty weeks and can thus provide an improved solution to the problem of predicting SAR. [en_US]
dc.description.level: M [en_US]
dc.format.extent: 126 p [en_US]
dc.identifier.doi: https://doi.org/10.51415/10321/4072
dc.identifier.uri: https://hdl.handle.net/10321/4072
dc.language.iso: en [en_US]
dc.subject: Students at Risk [en_US]
dc.subject: Ensemble learning [en_US]
dc.subject: Lazypredict [en_US]
dc.subject: Machine Learning Algorithms [en_US]
dc.subject: Virtual Learning Environment [en_US]
dc.subject.lcsh: Computer-assisted instruction--South Africa [en_US]
dc.subject.lcsh: Academic achievement [en_US]
dc.subject.lcsh: Underprepared college students--South Africa [en_US]
dc.subject.lcsh: Web-based instruction [en_US]
dc.title: Early prediction of students at risk in a virtual learning environment using ensemble machine learning techniques [en_US]
dc.type: Thesis [en_US]
local.sdg: SDG04

Files

Original bundle

Name: Soobramoney_R_2021_Redacted.pdf
Size: 5.14 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 2.22 KB
Format: Item-specific license agreed upon to submission