Using Survival Analysis to Identify the Factors that Mitigate Attrition among Adult Learners with Low Literacy Skills in an ITS-based Literacy Program

Shi, Genghu; Peng, Shun; Greenberg, Daphne; Frijters, Jan; Graesser, Arthur

doi:10.5281/zenodo.15870298

Genghu Shi

Guangxi Normal University

Guilin, China

genghushi@foxmail.com

Shun Peng

Huanggang Normal University

Huanggang, China

ps_psy@163.com

Daphne Greenberg

Georgia State University

Atlanta, USA

dgreenberg@gsu.edu

Jan Frijters

Brock University

St. Catharines, Canada

jan.frijters@brocku.ca

Arthur C. Graesser

University of Memphis

Memphis, USA

art.graesser@gmail.com


ABSTRACT

Adult literacy in the U.S. remains a persistent challenge. Alarmingly, half of adults demonstrate literacy skills at or below basic proficiency levels. This deficiency significantly affects daily functioning, workplace success, and health outcomes, and it contributes to socioeconomic disparities. Intelligent tutoring systems (ITS) are a promising solution for improving adult learners’ literacy skills. In practice, however, we observed that learner attrition has taken a toll on the effectiveness of a literacy program we developed that includes an ITS. To identify the factors that may mitigate attrition among these learners, this study applied survival analysis to data from 205 adult learners in the U.S. and Canada who participated in our ITS-based adult literacy program. The results showed that a higher average number of lesson attempts, regardless of lesson completion, and receipt of public assistance were associated with a lower risk of dropping out of the program. The survival analysis also indicated that Black learners faced elevated attrition risks. The findings have important implications for optimizing ITS-based adult literacy programs: 1) short-term persistence, operationalized through strategies such as lesson repetition and review, may enhance long-term retention; 2) future programs must adopt culturally responsive frameworks tailored to the specific needs of Black learners, who demonstrated elevated attrition risks, to address systemic disparities in engagement and retention; 3) program administrators should collaborate with social services to connect learners with public assistance resources, thereby mitigating socioeconomic barriers that compete with learning time.

Keywords

Adult literacy, ITS, learner attrition, short-term persistence, public assistance

INTRODUCTION

The status of adult literacy in the United States is concerning, with significant portions of the population struggling with basic literacy skills. According to the Program for the International Assessment of Adult Competencies (PIAAC), half of U.S. adults score at or below Level 2 in literacy and numeracy, indicating the limited literacy proficiency of these adults [13]. These low literacy levels are particularly prevalent among individuals without a high school education [14]. Other research showed that, unfortunately, the performance of U.S. adults has not improved significantly since the 1990s [10]. In international comparisons, the U.S. lags behind many other countries in the first round of PIAAC (2012) [10]. Low literacy skills affect various aspects of daily life, including the ability to read, write, compute, and use technology, which are essential for functioning effectively in society [9]. In the workplace, these limitations hinder job opportunities and career advancement. In healthcare, over one-third of the U.S. population lack the skills needed to make appropriate health decisions, which contributes to poor health outcomes, reduces the use of healthcare services, and adds to the difficulties in managing chronic or acute illnesses.

Adult literacy learners represent a highly diverse population [5]. Socially and economically, adults with low literacy skills are more common among women, the elderly, racial/ethnic minorities, and those with low educational attainment and socioeconomic status. They also have different educational backgrounds, learning disabilities, first languages (English or other) as well as motivation for taking part in adult literacy courses [16]. Adult learners often face significant barriers to consistent attendance in traditional, in-person literacy programs. These barriers include unstable work schedules, transportation challenges, and childcare needs [1, 11, 15]. Consequently, traditional face-to-face programs struggle to accommodate the diverse needs and circumstances of this population.

Intelligent tutoring systems that deliver well-constructed literacy instruction online can potentially address the difficulties adult learners face and accommodate different kinds of needs. Intelligent tutoring systems are fundamentally grounded in theoretical frameworks derived from cognitive psychology, education, and the learning sciences [8]. These systems employ advanced algorithms to deliver personalized learning experiences by recommending tailored content, strategies, and learning pathways. Such recommendations are dynamically adjusted according to individual learners' current knowledge levels, specific needs, learning objectives, aptitudes, and, in some cases, even their personality traits. AutoTutor for Adult Reading Comprehension (AT-ARC hereafter) is one such intelligent tutoring system, which we developed to help adult learners with low literacy skills improve their deep reading comprehension in English [16]. The system is web-based and designed to support adult learners whose reading proficiency ranges from grade levels 3.0 to 8.0 or their equivalent.

AT-ARC Lessons

AT-ARC is an online intelligent tutoring system designed to enhance adult learners' reading comprehension skills. The system was deployed using a learning management system on the Internet to ensure public accessibility. The system employs two computer agents—a tutor agent (Cristina) and a peer agent (Jordan)—to deliver 30 lessons through trialogues, where the agents engage in conversations with the learner and each other [7]. Each lesson focuses on one or more reading skills grounded in a theoretical model of comprehension [16].

The 30 lessons are structured into instruction and practice sections. Learners begin with a mini-lecture introducing the targeted reading skill, followed by practice activities involving multiple-choice questions related to words, sentences, texts, or visual elements (e.g., text styles and images) [16]. The number of questions per lesson ranges from 6 to 30. In most lessons, if a learner provides an incorrect or incomplete answer, they receive hints from one of the agents, offering a second attempt with additional guidance. Completing a lesson typically takes 20 to 50 minutes. The lessons are organized into three categories based on the modality of the learning materials: (1) words and sentences, which focus on word decoding, identification, and syntax; (2) computer and internet, which cover skills such as filing job applications, sending emails, searching for information, and interacting on social media; and (3) stories and texts, which teach deep reading comprehension strategies for lengthy entertaining, informative, or persuasive texts. For further details, please see [16].

The Present Study

The literacy program led by our team consisted of two primary components: teacher-led instruction and an AT-ARC module. The AT-ARC sessions were aligned with the teacher-led instruction, ensuring that both components addressed foundational and advanced reading skills. Foundational skills encompassed morphology, word decoding, vocabulary, and syntax, while advanced skills focused on deeper comprehension strategies [6]. However, by closely examining our data, we found that a significant proportion of adult learners failed to complete all the AT-ARC lessons assigned to them by the teacher in each class. The attrition rates of adult learners posed significant challenges, limiting the program's effectiveness and reducing the potential benefits for adult learners themselves.

In the present study, we employed the Cox proportional hazards model, a widely used method in survival analysis, to examine the factors that may predict dropout among adult learners in the literacy program. Such an analysis was expected to deepen our understanding of the adult learners’ learning behaviors and characteristics, thereby informing refinements to future intelligent tutoring system (ITS)-based adult literacy programs. Survival analysis is a statistical approach designed to analyze time-to-event data, where the outcome of interest is the time until an event occurs [4]. In this study, the event of interest was defined as the dropout of adult learners from the AT-ARC intervention. Dropout, in this context, did not signify a complete cessation of participation in the program. Rather, it was operationalized as the occurrence of three or more consecutive absences from AT-ARC lessons, indicating a substantial gap in the learners' engagement and a critical loss of continuity in their development of reading comprehension skills. The time to dropout was operationalized as the number of AT-ARC lessons completed by the learners prior to the dropout. The potential factors under consideration included adult learners’ persistence, prior knowledge of reading comprehension, receipt of public assistance, age, first language, and others. The Cox proportional hazards model provided insights into which factors were associated with a significantly reduced hazard of dropout among adult learners.

METHODS

Participants

The 205 participants were recruited from literacy classes in Metro-Atlanta (n = 114) and Metro-Toronto (n = 91). The ages of participants ranged from 18 to 73 with a mean of 41.6 (SD = 13.2). The majority (78%) of the participants were female. About 60% of participants reported their race or ethnicity as Black, African American, Black Canadian, or Mixed-race with African or Black descent. For simplicity, we divided the participants into two ethnic groups: Black and Other. Participants' reading levels ranged from grade 1.5 to 11.3 (M = 3.47, SD = 1.51), as measured by the Woodcock–Johnson III Passage Comprehension subtest [14]. Among the participants, around 50% reported that their first language was English. About 57% reported that they acquired English at 1 to 4 years old, 7% at 5 to 10 years old, 9% at 11 to 15 years old, 10% at 16 to 20 years old, and 17% at 21 years old and above. We excluded the variable “Age of English Acquisition” from data analysis because the number of participants whose first language was English was almost the same as the number who acquired English at 1 to 5 years old, and the sample sizes were small in the other age categories. Additionally, 62% of participants received public assistance at some point. The public assistance included, but was not limited to, Food Stamps, Food Bank, Temporary Assistance for Needy Families (Ontario Works), Temporary Aid to Needy Families (TANF), National Child Benefit, Child Disability, Retirement or Income Support from Ontario Disability Support Program (ODSP), and others.

Procedures

Following a feasibility study, the study was conducted over three waves of a formal intervention, with minor adjustments implemented between data collection cycles to refine AT-ARC and the intervention protocol. The intervention spanned from December 2015 to June 2017, with each wave lasting approximately four months. All waves adhered to a consistent procedural framework.

Prior to the intervention, participants’ self-report data were collected, including their demographic information and their status regarding receipt of public economic assistance. Then they completed pretests to assess their baseline reading skills. During the intervention, participants were assigned to different adult literacy classes, which included two primary components: teacher-led instruction and an AT-ARC module. Each class consisted of approximately 15 adult learners, with each class following its own arrangement for the AT-ARC sessions. Consequently, participants in 14 classes may have completed varying numbers of AT-ARC lessons, ranging from 19 to 26 with a mean of 23.2. During each class, teachers assigned specific AutoTutor lessons to be completed on designated days. The AT-ARC sessions were generally aligned with the teacher-led component; however, when time constraints arose, the AutoTutor segment was completed at the beginning of the subsequent session. Each session lasted between 1.5 and 3 hours, with 2–3 sessions conducted per week.

Following the four-month intervention, participants completed posttests to evaluate their reading comprehension skills again. Across all three waves, the Woodcock-Johnson (WJ) III Passage Comprehension test [17] was administered during both pretests and posttests.

Measures

To assess participants’ reading skills in pre- and post-tests, the Woodcock-Johnson III (WJ-III) Passage Comprehension subtest [17] was administered by a trained human examiner. The test items consisted of short texts, each comprising one or two sentences with a missing word indicated by a blank. Participants were instructed to read each item silently and verbally provide the missing word to complete the sentence. Testing proceeded sequentially through the items in the assessment booklet until the participant provided incorrect responses to six consecutive items. The performance of adult learners on the WJ-III Passage Comprehension subtest was standardized and converted to grade-level equivalents, ranging from 0 to 12 [17]. The learning gains of the participants were measured by the difference between their grade levels of post- and pre-tests.

Upon closer examination of the data, we observed that adult learners frequently repeated individual AT-ARC lessons multiple times. This repetition occurred partly due to technical issues (e.g., system crashes) and partly because learners voluntarily chose, or were advised, to review the lessons for their own benefit. We hypothesized that the observed repetition behavior may reflect adult learners’ short-term persistence in the learning process. Thus, for each participant, we computed the average number of attempts per AT-ARC lesson across the lessons they took before dropping out of the AT-ARC intervention.
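As a minimal sketch of this measure, assume a hypothetical attempt log with one row per lesson attempt. The data frame and column names below are illustrative assumptions, not the study's actual data structure. Under that assumption, the average number of attempts per learner could be computed in R as follows:

```r
# Hypothetical attempt log: one row per attempt at an AT-ARC lesson
attempt_log <- data.frame(
  learner = c("A", "A", "A", "A", "B", "B"),
  lesson  = c(1, 1, 2, 3, 1, 2)
)

# Count attempts for each learner-lesson pair
attempt_log$n <- 1
per_lesson <- aggregate(n ~ learner + lesson, data = attempt_log, FUN = sum)

# Average number of attempts across the lessons each learner took
avg_attempts <- tapply(per_lesson$n, per_lesson$learner, mean)
print(avg_attempts)
```

In this toy log, learner A repeats lesson 1 once, so A's average exceeds 1, while learner B attempts each lesson exactly once.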

Dropout, in this study, did not signify a complete cessation of participation in the program. Rather, it was operationalized as the occurrence of three or more consecutive absences from AT-ARC lessons, indicating a substantial gap in the learners' engagement and a critical loss of continuity in their development of reading comprehension skills. Dropout is the event of interest in this survival analysis. In survival analysis terminology, a learner was classified as uncensored (coded as 1) if they dropped out of the AT-ARC intervention. Conversely, a learner was classified as censored (coded as 0) if they completed all assigned AT-ARC lessons by the end of their class.

Survival analysis involves the examination of data representing the duration from a well-defined time origin (start point) until the occurrence of a specific event (endpoint). In this study, the time origin was defined as the point at which adult learners began their first AT-ARC lesson. The time-to-event (dropout) was operationalized as the number of AT-ARC lessons completed by an adult learner prior to the dropout. This operationalization was appropriate because the literacy classes were conducted regularly, with two to three sessions per week. If an adult learner did not experience the dropout event, it was assumed that they completed all lessons assigned to them during the intervention period (see section 2.2).

Let $T_{1}, T_{2}, \ldots, T_{n}$ be the times to event (dropout) of $n$ independent participants, and let $C_{1}, C_{2}, \ldots, C_{n}$ be the corresponding maximum observation periods (the number of lessons assigned to each learner during the intervention).

The status of participant $i$, denoted $\gamma_{i}$, is an indicator (dichotomous) variable:

$\gamma_{i} = \left\{ \begin{matrix} 1, & T_{i} < C_{i} & \text{(uncensored)} \\ 0, & \text{otherwise} & \text{(censored)} \\ \end{matrix} \right.$

where the time to event $T_{i}$ is assumed to be independent of the censoring time $C_{i}$.
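The operationalization above can be sketched in R. This is a minimal illustration under stated assumptions: the function name dropout_status, the logical attendance vector, and the choice to record a censored learner's time as the number of lessons they attended are all hypothetical and are not taken from the study's code.

```r
# attended: logical vector over the lessons assigned to one learner, in order
# (TRUE = lesson completed, FALSE = absent). Hypothetical input format.
dropout_status <- function(attended, gap = 3) {
  runs <- rle(attended)
  # Runs of absences that are at least `gap` lessons long
  hit <- which(!runs$values & runs$lengths >= gap)
  if (length(hit) == 0) {
    # No qualifying gap: censored (event = 0); time recorded as lessons attended
    return(list(time = sum(attended), event = 0L))
  }
  # Position at which the first qualifying run of absences starts
  start <- sum(runs$lengths[seq_len(hit[1] - 1)]) + 1
  # Time to event: lessons completed before that run; event = 1 (uncensored)
  list(time = sum(attended[seq_len(start - 1)]), event = 1L)
}

# One isolated absence, then three consecutive absences after lesson 5
res <- dropout_status(c(TRUE, TRUE, FALSE, TRUE, TRUE, FALSE, FALSE, FALSE, TRUE))
print(res)
```

In this example the learner completes four lessons before the first run of three consecutive absences, so the sketch returns a time of 4 with event status 1.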

Additional measures included participants’ gender, ethnic identity, age, first language, age of English language acquisition, and receipt of public assistance. For analytical purposes, participants were categorized into two groups based on ethnic identity: Black and Other, as preliminary analyses indicated a lower risk of dropout among participants of Other ethnicity. To address confounding between age of English language acquisition and first language, we retained only the first language variable, classifying it as either English or Other. Public assistance receipt was coded binarily as “Yes” or “No”.

Data Analyses

Descriptive and Correlational Analysis

Descriptive analyses were conducted to provide an overview of the distribution of censored and uncensored participants across the categories of the categorical covariates (gender, ethnic identity, first language, and receipt of public assistance). We also calculated the average number of attempts, average age, and average grade level of censored and uncensored participants.

Next, we computed the correlation coefficients between number of attempts, age, grade level and time-to-event (dropout).

Welch’s t-Test

To examine the impact of dropout on adult learners’ learning gains, a one-tailed Welch two-sample t-test, which does not assume equal variances, was conducted to compare learning gains between censored and uncensored participants.
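The Results report this comparison as a one-tailed Welch two-sample t-test. A minimal R sketch of such a test follows, run on simulated learning gains. The group sizes and the simulated values are assumptions for illustration only and will not reproduce the reported statistics.

```r
set.seed(1)
# Simulated learning gains (grade-level differences); group sizes are hypothetical
gain_censored   <- rnorm(90, mean = 0.66, sd = 1.06)
gain_uncensored <- rnorm(80, mean = 0.39, sd = 1.19)

# One-tailed Welch test: H1 is that censored learners gain more.
# var.equal = FALSE requests the Welch (unequal-variance) version.
welch <- t.test(gain_censored, gain_uncensored,
                alternative = "greater", var.equal = FALSE)

print(welch$method)     # name of the test that was run
print(welch$parameter)  # Welch-Satterthwaite degrees of freedom (usually non-integer)
print(welch$p.value)
```

A fractional degrees-of-freedom value, like the t(157.7) reported in this paper, is characteristic of the Welch correction.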

Survival Analysis

To examine the factors influencing the time-to-event outcome, we employed the Cox proportional hazards model (CoxPH), a widely used method in survival analysis [2]. The CoxPH model is a semi-parametric approach that estimates hazard ratios (HR), which represent the relative risk of the event occurring at any given time based on predictor variables. This method allows for the inclusion of both continuous and categorical covariates, making it well-suited for exploring the effects of demographic, psychological, and contextual factors on the event of interest. By applying the CoxPH model, we aimed to identify significant predictors of the event (dropout) and assess their impact on the hazard rate, providing insights into the underlying mechanisms and informing targeted interventions.

The Cox proportional hazards model is expressed mathematically as:

$h(t,X) = h_{0}(t) \cdot exp(\beta_{1}X_{1} + \beta_{2}X_{2} + \cdots + \beta_{p}X_{p})$

Where:

$h(t,X)$ : The hazard function at time $t$ for an individual with covariate values $X = (X_{1}, X_{2}, \ldots, X_{p})$ . This represents the instantaneous risk of the event occurring at time $t$, given that the individual has survived up to that time.

$h_{0}(t)$ : The baseline hazard function, which describes the hazard when all covariates are zero (or at their reference levels). It is unspecified and left non-parametric in the Cox model.

$exp(\beta_{1} X_{1} + \beta_{2} X_{2} + \cdots + \beta_{p} X_{p})$ : The exponential term captures the effect of the covariates $X_{1},\ X_{2},\ldots,X_{p}$ on the hazard. Each $\beta_{i}$ represents the log hazard ratio associated with a one-unit increase in the covariate $X_{i}$ , holding other covariates constant.

$\beta_{1},\ \beta_{2},\ldots,\ \beta_{p}$ : The regression coefficients to be estimated from the data. These coefficients quantify the impact of each covariate on the hazard.

The key assumption of the CoxPH model is the proportional hazards assumption, which states that the hazard ratio between any two individuals is constant over time. This implies that the effect of covariates on the hazard is multiplicative and does not change with time.
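To see why the model implies a time-constant hazard ratio, consider two individuals with covariate vectors $X^{(a)}$ and $X^{(b)}$ (notation introduced here only for illustration). Dividing their hazards under the model above cancels the baseline hazard:

```latex
\frac{h(t, X^{(a)})}{h(t, X^{(b)})}
  = \frac{h_{0}(t) \cdot \exp\left(\beta_{1}X^{(a)}_{1} + \cdots + \beta_{p}X^{(a)}_{p}\right)}
         {h_{0}(t) \cdot \exp\left(\beta_{1}X^{(b)}_{1} + \cdots + \beta_{p}X^{(b)}_{p}\right)}
  = \exp\left(\sum_{j=1}^{p} \beta_{j}\left(X^{(a)}_{j} - X^{(b)}_{j}\right)\right)
```

The right-hand side does not depend on $t$. In particular, if the two individuals differ only by one unit in covariate $X_{j}$, the ratio reduces to $\exp(\beta_{j})$, which is the hazard ratio reported for that covariate.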

The CoxPH model was implemented in R, where the coxph function in the survival package is commonly used to fit the model and estimate the effects of covariates on survival outcomes [2]. The coxph function takes a formula with a survival object (created using the Surv() function) on the left-hand side and covariates on the right-hand side. The model in this study can be specified as

model <- coxph(Surv(time, event) ~ attempts + grade_level + gender + age + ethnicity + first_language + public_assistance, data = dataset)

It provides hazard ratios, confidence intervals, and p-values for the covariates, offering insights into their impact on survival. Additionally, the function supports advanced features like handling tied data, stratification, and time-dependent covariates, making it a powerful and versatile tool for survival analysis in R.

Next, we tested the proportional hazards assumption using the Schoenfeld residual test implemented by the cox.zph function in the survival package. The cox.zph function was specified as

schoenfeld_test <- cox.zph(model)

Finally, we utilized the ggforest function from the survminer package in R to visually represent the results of the Cox proportional hazards model. The ggforest function generated a forest plot, which effectively displays the hazard ratios, confidence intervals, and statistical significance of the covariates included in the CoxPH model, as estimated by the coxph function. This visualization aids in the interpretation of the model's findings by providing a clear and concise summary of the relationships between predictors and the outcome of interest (see section 3). The ggforest function was specified as

ggforest(model, data = dataset)
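The three calls above can be tried end to end on simulated data. The following is a sketch under stated assumptions, not the study's analysis: the data are randomly generated stand-ins, the underscore column names are illustrative, and the simulated effect of attempts is arbitrary. Only functions from the survival package are run; the ggforest call is left as a comment because it requires the survminer package.

```r
library(survival)

set.seed(42)
n <- 200
# Simulated stand-in data; column names are illustrative assumptions
dataset <- data.frame(
  attempts          = runif(n, 1, 3),
  grade_level       = runif(n, 1.5, 11.3),
  gender            = factor(sample(c("Female", "Male"), n, replace = TRUE)),
  age               = sample(18:73, n, replace = TRUE),
  ethnicity         = factor(sample(c("Black", "Other"), n, replace = TRUE)),
  first_language    = factor(sample(c("English", "Other"), n, replace = TRUE)),
  public_assistance = factor(sample(c("No", "Yes"), n, replace = TRUE))
)

# Simulated dropout times whose hazard decreases with more attempts
latent   <- ceiling(rexp(n, rate = 0.05 * exp(-0.6 * dataset$attempts)))
assigned <- sample(19:26, n, replace = TRUE)      # lessons assigned (censoring point)
dataset$time  <- pmin(latent, assigned)
dataset$event <- as.integer(latent < assigned)    # 1 = dropout, 0 = censored

model <- coxph(Surv(time, event) ~ attempts + grade_level + gender + age +
                 ethnicity + first_language + public_assistance, data = dataset)

# Hazard ratios with 95% confidence intervals
hr_table <- exp(cbind(HR = coef(model), confint(model)))
print(round(hr_table, 2))

# Schoenfeld residual test of the proportional hazards assumption
schoenfeld_test <- cox.zph(model)
print(schoenfeld_test$table)

# With survminer installed, a forest plot could be drawn with:
# survminer::ggforest(model, data = dataset)
```

Lesson counts are discrete, so tied event times occur; coxph handles them with its default tie-handling method.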

RESULTS

Descriptive Statistics and Correlations

Among the 205 participants, 99 completed all the AT-ARC lessons assigned during the intervention, while 106 were considered to have dropped out. In survival analysis terminology, the 99 participants who completed the intervention were classified as censored (coded as 0), and the 106 participants who dropped out were classified as events (uncensored, coded as 1).

Table 1 presents the descriptive statistics of the censored and uncensored participants. Because Attempts (the average number of attempts per AT-ARC lesson before dropout), WJ-III (prior knowledge in grade level), and Age are continuous variables, we calculated their means and standard deviations. Table 1 shows that, on average, censored participants repeated lessons more frequently (M = 1.51, SD = 0.53) than uncensored participants (M = 1.39, SD = 0.53), who were classified as having dropped out of the intervention. Censored participants (M = 3.54, SD = 1.54) had a slightly higher grade level than uncensored participants (M = 3.41, SD = 1.49). Censored participants (M = 42.8, SD = 12.6) were, on average, 2.4 years older than uncensored participants (M = 40.4, SD = 13.6).

Gender, Ethnicity, First Language (the initial language acquired by participants), and Public Assistance (whether participants received public aid) were categorical variables. We calculated the frequency of censored and uncensored participants within each category. Table 1 shows little difference between the two genders, with both groups having similar percentages of uncensored participants (Female: 51.3%, Male: 53.1%). Black participants (52.4%) exhibited a slightly higher dropout rate than participants from other ethnic groups (50.6%). Participants whose first language was not English exhibited a slightly higher dropout rate (53.9%) than participants whose first language was English (49.5%, i.e., 51 of 103 in Table 1). Participants who received public assistance had a lower dropout rate (44.6%) than those who did not receive public assistance (63.0%).

Since the censoring criteria varied across participants in different classes, calculating the mean time-to-event (the number of lessons participants completed prior to dropout) for categories of the categorical variables (Gender, Ethnicity, First Language, and Public Assistance) was not meaningful.

Therefore, we only calculated the correlations between time-to-event and the three continuous variables (Attempts, WJ-III, and Age) to explore their potential relationships. We found that, among the three continuous variables, only the number of attempts showed a significant correlation (r = 0.23, p=0.001) with time-to-event.

Table 1. Descriptive Statistics of Censored and Uncensored Participants
Variable		Censored	Uncensored
Attempts	Mean (SD)	1.51 (0.53)	1.39 (0.53)
WJ-III	Mean (SD)	3.54 (1.54)	3.41 (1.49)
Age	Mean (SD)	42.8 (12.6)	40.4 (13.6)
		Frequency
Gender	Female	76	80
Gender	Male	23	26
Ethnicity	Black	59	65
Ethnicity	Other	40	41
First Language	English	52	51
First Language	Other	47	55
Public Assistance	Yes	72	58
Public Assistance	No	27	46
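The dropout percentages discussed in the text can be recomputed from the frequencies in Table 1. The short R sketch below does so; the vector names are illustrative, and the counts are copied from the table.

```r
# Frequencies from Table 1 (censored = completed, uncensored = dropped out)
censored <- c(Female = 76, Male = 23, Black = 59, Other_ethnicity = 40,
              English = 52, Other_language = 47, PA_yes = 72, PA_no = 27)
uncensored <- c(Female = 80, Male = 26, Black = 65, Other_ethnicity = 41,
                English = 51, Other_language = 55, PA_yes = 58, PA_no = 46)

# Dropout rate (%) within each category
dropout_rate <- round(100 * uncensored / (censored + uncensored), 1)
print(dropout_rate)
```

Each pair of categories sums to the full sample of 205 participants (99 censored and 106 uncensored).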

To test the patterns suggested by the descriptive statistics and correlation analyses, inferential statistical analyses were conducted. We entered these variables into a Cox proportional hazards model, with a survival object constructed from the time-to-event and event status data as the dependent variable, to identify which factors are associated with the hazard of dropout. Before that, we performed a Welch two-sample t-test to examine the impact of dropout status on learning gains.

Impact of Dropout on Learning Gains

The results of the one-tailed Welch two-sample t-test showed that censored participants (M = 0.66, SD = 1.06) had a higher learning gain than uncensored participants (M = 0.39, SD = 1.19). The difference was significant at the 0.10 level but not at the conventional 0.05 level, t(157.7) = 1.59, p = 0.06, d = 0.25. Although the effect size is small and the result is only marginally significant, it suggests that dropout may be associated with lower learning gains.

Results of CoxPH Model

Based on the Schoenfeld test, the p-values for all individual covariates and the global test are greater than 0.05 (see Table 2). Therefore, we did not find significant evidence to reject the null hypothesis that the proportional hazards assumption holds for the Cox proportional hazards model with our dataset.

Table 2. Results of Schoenfeld Test for the Proportional Hazards Assumption
	χ²	df	p
Attempts	1.196	1	0.274
WJ-III	2.339	1	0.126
Gender	0.948	1	0.33
Age	0.846	1	0.358
Ethnicity	2.202	1	0.138
First Language	0.374	1	0.541
Public Assistance	2.107	1	0.147
GLOBAL	12.481	7	0.086

In fitting the Cox proportional hazards (CoxPH) model, 7 observations were excluded from the analysis due to missing data, resulting in a final sample size of 198 participants, with 105 observed events (dropouts). The overall model fit was significant (likelihood ratio test: χ² = 16.2, df = 7, p = 0.02). The concordance index was 0.63, indicating modest discriminative ability. The significant likelihood ratio test suggests that the included predictors, taken together, improve the prediction of dropout hazard relative to a model with no covariates.

Significant predictors included Attempts (p < 0.01), Ethnicity (p < 0.05) and Public Assistance (p < 0.05). Participants who repeated the AT-ARC lessons more frequently (more attempts) before dropout tended to have a lower hazard of dropping out of the intervention (HR = 0.55, 95% CI [0.35, 0.86]), indicating that each one-unit increase in the average number of attempts was associated with a 45% reduction in the hazard of dropout. Participants of "Other" ethnicities had a significantly lower hazard of dropout compared to Black participants (HR = 0.62, 95% CI [0.39, 0.98]), indicating that being of an ethnicity other than Black was associated with a 38% reduction in the hazard of dropout. Similarly, participants receiving public assistance had a significantly lower hazard of dropout compared to those not receiving public assistance (HR = 0.61, 95% CI [0.40, 0.94]), indicating a 39% reduction in the hazard of dropout when receiving public assistance.
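The percentage reductions quoted above follow directly from the hazard ratios, and approximate Wald p-values can be recovered from the reported confidence intervals. The R sketch below is a rough consistency check, assuming the intervals are standard Wald intervals on the log-hazard scale. Because the inputs are rounded to two decimals, the recovered values are approximate.

```r
# Reported hazard ratios and 95% confidence limits for the significant predictors
hr    <- c(Attempts = 0.55, Ethnicity_Other = 0.62, Public_Assistance_Yes = 0.61)
lower <- c(0.35, 0.39, 0.40)
upper <- c(0.86, 0.98, 0.94)

# Percentage reduction in hazard implied by each HR: (1 - HR) * 100
pct_reduction <- round(100 * (1 - hr))

# Log hazard ratio (the beta coefficient) and its approximate standard error,
# recovered from the width of the CI on the log scale
beta <- log(hr)
se   <- (log(upper) - log(lower)) / (2 * qnorm(0.975))

# Approximate two-sided Wald p-values
z <- beta / se
p <- 2 * pnorm(-abs(z))

print(pct_reduction)
print(round(p, 3))
```

Under these assumptions all three recovered p-values fall below 0.05, and the one for Attempts falls below 0.01, in line with the significance markers described in the note to Figure 1.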

Non-significant predictors included WJ-III scores (HR = 0.99, 95% CI [0.85, 1.15]), Gender (HR = 0.90, 95% CI [0.56, 1.43]), Age (HR = 0.99, 95% CI [0.98, 1.01]), and First Language (HR = 1.22, 95% CI [0.77, 1.93]).

These findings suggest that participants’ lesson repetition, ethnicity, and public assistance status are significantly associated with the hazard of dropout, while other variables such as prior knowledge, gender, age, and first language did not show significant associations.

Figure 1. Forest plot for the Cox proportional hazards model with Attempts, WJ-III, Gender, Age, Ethnicity, First Language, and Public Assistance as predictors of the hazard of dropout.

Note. WJ-III represents the grade levels of participants at pre-test. The values in the third column denote hazard ratios along with their corresponding 95% confidence intervals for the covariates. Significance levels are indicated as *p < .05 and **p < .01. The global p-value reported is from the likelihood ratio test rather than the log-rank test. AIC was not used.

DISCUSSION

Although 34 participants did not complete the posttests, a Welch two-sample t-test comparing censored and uncensored participants revealed a marginally significant effect of attrition on adult learners’ learning gains (p = .06), suggesting that dropout may have influenced their learning outcomes. It is plausible that, had all participants completed the posttests, the observed group difference might have reached conventional statistical significance and yielded a larger effect size, which would point to attrition-related bias in the observed effects. This finding provided the rationale for examining the factors that may influence dropout.

The findings from the Cox proportional hazards model revealed significant associations between certain predictors and the time-to-event outcome. Specifically, Attempts, Ethnicity and Public Assistance emerged as significant factors influencing the hazard of the dropout.

The variable Attempts represented adult learners’ average number of repetitions on AT-ARC lessons before dropout which may reflect adult learners’ short-term persistence in the learning process. Persistence can be defined as an individual's continued or repeated action toward a goal, even in the face of difficulties, interruptions, or setbacks [3, 12]. During the AT-ARC intervention, adult learners encountered challenges such as system crashes, difficulties in comprehending reading materials, mastering reading skills, and scheduling conflicts. Despite these obstacles, they demonstrated persistence by returning to their learning activities, underscoring their commitment to improving reading comprehension skills. The repetition of a single lesson may be partly attributed to technical issues (e.g., system crashes) and partly to learners’ voluntary decisions to review the material for their own benefit. This suggests that a higher frequency of repetitions reflects stronger short-term persistence in their learning efforts. As the results of the Cox proportional hazards (CoxPH) model suggested, short-term persistence may significantly reduce the hazard of dropout.

Participants identifying as "Other" ethnicities exhibited a 38% lower hazard of dropout compared to Black participants (HR = 0.62, 95% CI [0.39, 0.98]), indicating that Black participants were at a higher risk of attrition. This finding warrants further investigation to elucidate the underlying factors contributing to this disparity. Additionally, future literacy program designers should prioritize tailoring interventions to better address the specific needs and characteristics of Black learners to enhance engagement and retention.

Similarly, individuals receiving public assistance exhibited a 39% reduction in hazard compared to those not receiving assistance (HR = 0.61, 95% CI [0.40, 0.94]). This finding suggests that socioeconomic support systems can mitigate risks associated with adult learners’ dropout from literacy programs.

In contrast, the other variables, including WJ-III, Gender, Age, and First Language, did not show statistically significant associations with the hazard of dropout. Neither WJ-III scores nor age demonstrated a significant influence on dropout, indicating that prior literacy skills and age were not associated with the likelihood of attrition in this sample. Males had a slightly lower hazard of dropout than females, but this difference was far from significant. Adult learners whose first language was not English (“Other”) exhibited a higher hazard of dropout, but this difference did not reach statistical significance. The lack of significance for these variables may reflect limited variability in the sample or the possibility that these factors are less relevant to dropout in this context.

CONCLUSION

This study examined critical predictors of attrition among adult learners in an ITS-based adult literacy program, AT-ARC, motivated by the concern that attrition may substantially reduce adult learners’ learning gains. The CoxPH model revealed that short-term persistence (lesson repetition), ethnicity, and receipt of public assistance were significantly associated with the hazard of dropout among the adult learners. These findings offer important implications for refining future ITS-based adult literacy programs. First, fostering short-term persistence, for example by encouraging lesson repetition and review, may enhance long-term retention. This suggests that ITS platforms should integrate instructional design features (e.g., adaptive prompts for revision) and prioritize technical robustness to minimize disruptions and sustain engagement. Second, programs must adopt culturally responsive frameworks tailored to the specific needs of Black learners, who faced elevated attrition risks, to address systemic inequities and improve retention. Third, program administrators should collaborate with social services to connect learners with public assistance resources, thereby mitigating socioeconomic barriers that compete with learning time. Because these findings derive from exploratory data analysis, future experimental or longitudinal studies are needed to rigorously examine the causal mechanisms underlying these factors.

ACKNOWLEDGMENTS

The research reported here was supported by the National Science Foundation of China, through grant #62367001, the Institute of Education Sciences, U.S. Department of Education, through grants R305C120001 and R305A200413, and the National Science Foundation under the award The Learner Data Institute (award #1934745). The opinions expressed are those of the authors and do not represent the views of the National Science Foundation of China, the Institute, the U.S. Department of Education, or the National Science Foundation.

REFERENCES

[1] Alamprese, J. A., MacArthur, C. A., Price, C., & Knight, D. 2011. Effects of a structured decoding curriculum on adult literacy learners’ reading development. Journal of Research on Educational Effectiveness, 4, 154-172. DOI=https://doi.org/10.1080/19345747.2011.555294
[2] Andersen, P., & Gill, R. 1982. Cox's regression model for counting processes, a large sample study. Annals of Statistics, 10, 1100-1120. DOI=https://doi.org/10.1214/aos/1176345976
[3] Brandstätter, V., & Bernecker, K. 2022. Persistence and disengagement in personal goal pursuit. Annual Review of Psychology, 73(1), 271-299.
[4] Clark, T., Bradburn, M., Love, S. et al. 2003. Survival Analysis Part I: Basic concepts and first analyses. Br J Cancer, 89, 232-238. DOI=https://doi.org/10.1038/sj.bjc.6601118
[5] Elish-Piper, L. 2007. Defining adult literacy. In B. J. Guzzetti (Ed.) Literacy for the new millennium: Vol. 4. Adult literacy. Praeger, Westport, Connecticut. 3-16.
[6] Fang, Y., Lippert, A., Cai, Z., Chen, S., Frijters, J. C., Greenberg, D., & Graesser, A. C. 2022. Patterns of adults with low literacy skills interacting with an intelligent tutoring system. International Journal of Artificial Intelligence in Education, 32, 297-322. DOI=https://doi.org/10.1007/s40593-021-00266-y
[7] Graesser, A. C., Cai, Z., Morgan, B., & Wang, L. 2017. Assessment with computer agents that engage in conversational dialogues and trialogues with learners. Computers in Human Behavior, 76, 607-616. DOI=https://doi.org/10.1016/j.chb.2017.03.041
[8] Graesser, A. C., Conley, M. W., & Olney, A. 2012. Intelligent tutoring systems. APA educational psychology handbook, Vol 3: Application to learning and teaching. American Psychological Association, Washington, DC. 451-473.
[9] Greenberg, D. 2008. The challenges facing adult literacy programs. Community Literacy Journal, 3, 39-54.
[10] Greenberg, D., & Feinberg, I. Z. 2019. Adult literacy: a perspective from the United States. Z Erziehungswiss, 22, 105-121. DOI=https://doi.org/10.1007/s11618-018-0853-8
[11] Miller, B., Esposito, L., & McCardle, P. 2011. A public health approach to improving the lives of adult learners: Introduction to the special issue on adult literacy interventions. Journal of Research on Educational Effectiveness, 4, 87-100. DOI=https://doi.org/10.1080/19345747.2011.555287
[12] Moshontz, H., & Hoyle, R. H. 2021. Resisting, recognizing, and returning: A three-component model and review of persistence in episodic goals. Social and Personality Psychology Compass, 15(1), e12576.
[13] OECD. 2013. OECD skills outlook 2013: first results from the survey of adult skills. OECD Publishing, Paris.
[14] Rampey, B. D., Finnegan, R., Goodman, M., Mohadjer, L., Krenzke, T., Hogan, J., & Provasnik, S. 2016. Skills of U.S. unemployed, young, and older adults in sharper focus: results from the Program for the International Assessment of Adult Competencies (PIAAC) 2012/2014: first look (NCES 2016-039rev). U.S. Department of Education, National Center for Education Statistics, Washington, DC. http://nces.ed.gov/pubsearch. Accessed 20 Feb. 2025.
[15] Sabatini, J. P., Shore, J., Holtzman, S., & Scarborough, H. S. 2011. Relative effectiveness of reading intervention programs for adults with low literacy. Journal of Research on Educational Effectiveness, 4, 118-133. DOI=https://doi.org/10.1080/19345747.2011.555290
[16] Shi, G., Wang, L., Zhang, L., Shubeck, K., Peng, S., Hu, X., & Graesser, A. C. 2021. The Adaptive Features of an Intelligent Tutoring System for Adult Literacy. In International Conference on Human-Computer Interaction. Cham: Springer International Publishing. 592-603. DOI=https://doi.org/10.1007/978-3-030-77857-6_4
[17] Woodcock, R. W., McGrew, K. S., & Mather, N. 2001. Woodcock-Johnson tests of achievement III (WJ-III). Riverside Publishing, Rolling Meadows, IL.