Date of Award

Spring 2011

Document Type


Degree Name

Doctor of Science in Information Systems


Business and Information Systems

First Advisor

Amit Deokar

Second Advisor

Surendra Sarnikar

Third Advisor

Mark Hawkes

Fourth Advisor

Mark Moran


Student retention is a major issue in higher education, since it has an impact on students, institutions, and society. Studies have shown much higher dropout rates in online courses than in face-to-face courses. With the rapid growth in online enrollment, coupled with a higher dropout rate, more students are at risk of dropping out of online courses. Online course dropout needs to be addressed to improve institutional effectiveness and student success. Early identification of students who are at risk of dropping out is imperative for preventing dropout. Previous studies have focused on identifying students who are more likely to drop out using academic and demographic data obtained from Student Information Systems (SIS). Online courses are generally taught using a Course Management System (CMS), which can provide detailed data about student activity in the course. There is a need to develop models that can predict real-time dropout risk for each student while an online course is being taught. Using both SIS and CMS data, a predictive model can provide a more accurate, real-time estimate of dropout risk for each student while the course is in progress. The ability to predict real-time dropout risk supports early identification, which in turn allows for early intervention to help prevent dropout. The model developed in this dissertation study uses a combination of variables from the SIS to provide a baseline risk score for each student at the beginning of the course. Data from the CMS is then used, in combination with the baseline prediction, to provide a dynamic risk score as the course progresses. This study identifies and analyzes various SIS-based and CMS-based variables for predicting dropout risk in online courses, and evaluates various data mining techniques for their predictive accuracy and performance in building the predictive model and risk scores. The model leverages both SIS (historical) and CMS (time-variant) data to improve predictive accuracy.
The study also presents a recommender system framework, based on the predictive model, that generates alerts and recommendations for students, instructors, and staff to facilitate early and effective interventions. Finally, it identifies deployment challenges and suggests best practices for implementation.
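The two-stage idea described above (a baseline risk score from SIS data, updated into a dynamic risk score as CMS activity arrives) can be illustrated with a minimal sketch. This is not the dissertation's model: the logistic form, the feature names (`gpa`, `prior_withdrawals`, `credit_load`, `logins_per_week`, `missed_assignments`), the weights, and the alert threshold are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical weights for illustration only; they are not fitted to any
# data and are not taken from the dissertation.
SIS_WEIGHTS = {"gpa": -0.8, "prior_withdrawals": 0.6, "credit_load": 0.05}
SIS_BIAS = 0.5
CMS_WEIGHTS = {"logins_per_week": -0.3, "missed_assignments": 0.7}


def sigmoid(z: float) -> float:
    """Map a log-odds value to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def logit(p: float) -> float:
    """Map a probability in (0, 1) back to log-odds."""
    return math.log(p / (1.0 - p))


def baseline_risk(sis: dict) -> float:
    """Baseline dropout risk at course start, from SIS (historical) data."""
    z = SIS_BIAS + sum(w * sis[name] for name, w in SIS_WEIGHTS.items())
    return sigmoid(z)


def dynamic_risk(baseline: float, cms: dict) -> float:
    """Shift the baseline log-odds using CMS (time-variant) activity data."""
    z = logit(baseline) + sum(w * cms[name] for name, w in CMS_WEIGHTS.items())
    return sigmoid(z)


def needs_alert(risk: float, threshold: float = 0.5) -> bool:
    """Flag a student for intervention when risk reaches the threshold."""
    return risk >= threshold


# Example: one student, scored at course start and again mid-course.
student = {"gpa": 3.0, "prior_withdrawals": 1, "credit_load": 12}
start = baseline_risk(student)
inactive = dynamic_risk(start, {"logins_per_week": 1, "missed_assignments": 3})
active = dynamic_risk(start, {"logins_per_week": 5, "missed_assignments": 0})
```

In this sketch, low CMS activity and missed work raise the score above the baseline, while regular activity lowers it. A real system would fit the weights with a data mining technique evaluated on historical course data, as the study describes.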


The author wishes to keep online access to the full text of this dissertation restricted. A physical copy of the full-text edition may be requested through interlibrary loan.