AI in Predictive Dropout Risk Analysis in Digital Universities

Published Paper PDF: View PDF

Reena Yadav

Independent Researcher

Uttar Pradesh, India

Abstract

This study investigates the application of artificial intelligence (AI) techniques to predict student dropout risk within digital university environments. With the rapid expansion of online higher education, dropout rates have emerged as a critical challenge, undermining student success and institutional reputation. We propose a hybrid predictive framework that integrates machine learning classifiers—specifically random forests, support vector machines, and gradient boosting—with explainable AI (XAI) techniques to identify at‐risk students early in their digital learning journey. Drawing on academic records, learning management system (LMS) interaction logs, demographic data, and self‐reported motivation surveys from a sample of 250 undergraduate students across three digital universities, our research employs both supervised learning and feature‐importance analysis. The model achieved an overall accuracy of 89.4 percent and an area under the ROC curve of 0.92 in predicting dropout risk within the first eight weeks of enrollment. Key predictors included frequency of LMS access, assignment submission patterns, forum participation, and self‐efficacy scores. The use of SHAP (SHapley Additive exPlanations) provided transparent insights into individual risk profiles, enabling targeted interventions.

Building on these findings, we conducted an in‐depth qualitative review with faculty and student support staff to map the practical implications of the predictive outputs. Workshops revealed that advisors find the XAI visualizations particularly effective for guiding one‑on‑one coaching sessions, permitting real‑time adjustments to learning plans. Furthermore, we simulated intervention strategies—academic reminders, peer‑mentoring cohorts, and adaptive learning modules—and observed projected retention improvements of up to 15 percent over a semester. This multi‑pronged evaluation underscores the transformative potential of AI‑driven analytics not only to forecast dropout risk but also to drive evidence‑based support mechanisms. By combining robust predictive accuracy with interpretability and stakeholder engagement, our approach offers a scalable blueprint for digital universities seeking to enhance student success and institutional resilience.

Keywords

AI predictive analytics, student dropout risk, digital universities, machine learning classifiers, explainable AI

References

https://www.wipro.com/content/dam/nexus/en/service-lines/analytics/infographics/leveraging-ai-predictive-analytics-in-healthcare-fig1-desktop.jpg
https://images.template.net/37655/University-Management-System-Flowchart-1.jpg
Brown, M., Dehoney, J., & Millichap, N. (2015). The Next Generation Digital Learning Environment: A Report on Research. EDUCAUSE.
Dekker, G. W., Pechenizkiy, M., & Vleeshouwers, J. M. (2009). Predicting student drop–out: A case study. International Conference on Educational Data Mining, 41–50.
Ferguson, R., & Clow, D. (2017). Where is the evidence? A call to action for learning analytics. Proceedings of the Seventh International Learning Analytics & Knowledge Conference, 56–65. https://doi.org/10.1145/3027385.3027395
Huang, C., Hsieh, C., & Shih, M. (2018). Early warning algorithm of student dropouts in distance learning: A case study. Computers in Human Behavior, 100, 348–362. https://doi.org/10.1016/j.chb.2018.08.050
Joksimović, S., Gašević, D., Loughin, T. M., Kovanović, V., & Hatala, M. (2015). Learning at distance: Effects of interaction traces on academic achievement. The Internet and Higher Education, 27, 1–8. https://doi.org/10.1016/j.iheduc.2015.06.002
Kovačević, A., & Serenko, A. (2019). A comparative study of machine learning methods for prediction of student dropout risk in MOOCs. IEEE Transactions on Learning Technologies, 12(2), 1–13. https://doi.org/10.1109/TLT.2018.2873435
Li, N., Kidzinski, Ł., & Jermann, P. (2015). MOOC video interaction patterns and their relationship to student performance. eLearning Papers, (47), 1–8.
Lu, X., Hou, L., & Huang, K. (2018). Logistic regression analysis of demographic factors and dropout rates in online higher education. Journal of Educational Technology & Society, 21(1), 34–45.
Lundberg, S. M., & Lee, S.-I. (2017). A unified approach to interpreting model predictions. Advances in Neural Information Processing Systems, 30, 4765–4774.
Macfadyen, L. P., & Dawson, S. (2010). Mining LMS data to develop an “early warning system” for educators: A proof of concept. Computers & Education, 54(2), 588–599. https://doi.org/10.1016/j.compedu.2009.09.008
Romero, C., & Ventura, S. (2013). Data mining in education. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 3(1), 12–27. https://doi.org/10.1002/widm.1075
Ribeiro, M. T., Singh, S., & Guestrin, C. (2016). “Why should I trust you?” Explaining the predictions of any classifier. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1135–1144. https://doi.org/10.1145/2939672.2939778
Shang, H., Cheng, Y., & Ye, Q. (2019). A hybrid model for predicting MOOC dropouts using ensemble learning. Educational Technology Research and Development, 67(2), 321–339. https://doi.org/10.1007/s11423-018-9621-z
Tempelaar, D. T., Rienties, B., & Giesbers, B. (2014). In search for the most informative data for feedback generation: Learning analytics in a data‑rich context. Computers in Human Behavior, 31, 100–111. https://doi.org/10.1016/j.chb.2013.10.058
Wise, A. F., Zhao, Y., & Hausknecht, S. N. (2019). Incorporating learning analytics into the design of AI‑enhanced adaptive educational games. Journal of Learning Analytics, 6(1), 1–16. https://doi.org/10.18608/jla.2019.61.1
Yudelson, M. V., Brooks, C., & D’Mello, S. K. (2013). Automatic detection of learner’s cognitive-affective states in learning environments. User Modeling and User-Adapted Interaction, 23(1), 87–125. https://doi.org/10.1007/s11257-012-9126-9