ABSTRACT
We consider the equity and fairness of curricula derived from Knowledge Tracing models. We begin by defining a unifying notion of an equitable tutoring system as a system that achieves maximum possible knowledge in minimal time for each student interacting with it. Realizing perfect equity requires tutoring systems that can provide individualized curricula per student. In particular, we investigate the design of equitable tutoring systems that derive their curricula from Knowledge Tracing models. We first show that the classical Bayesian Knowledge Tracing (BKT) model and the curricula derived from it can fall short of achieving equitable tutoring. To overcome this issue, we then propose a novel model, Bayesian-Bayesian Knowledge Tracing (B2KT), that naturally allows online individualization. We demonstrate that curricula derived from our model are more effective and equitable than those derived from existing models. Furthermore, we highlight that improving models with a focus on the fairness of next-step predictions can be insufficient to develop equitable tutoring systems.
1. INTRODUCTION
In recent years, Massive Open Online Courses (MOOCs) and online educational platforms have gained significant importance. They offer the opportunity to provide education at scale and to make education accessible to a larger part of the world’s population. To facilitate learning in online education and enable customized learning paths for all students, intelligent tutoring systems can be employed while limiting the amount of manual work necessary for each student [11].
In that context, moving education from an offline setting to an online setting has the potential to promote Inclusion, Diversity, Equity, and Accessibility (IDEA). In particular, by reducing the personnel effort required for tutoring, there is the opportunity to include students with diverse backgrounds and skills, and, importantly, to support their learning equitably. To achieve this, an intelligent tutoring system must be able to adapt to the specific characteristics of each student.
While individualized tutoring has been studied in the community for many years, we consider individualization with a focus on equitable and fair tutoring in this paper. We start by providing a unifying definition of an equitable tutoring system. Our definition is based on the ethical principles of beneficence (“do the best”) and non-maleficence (“do not harm”), which are commonly adopted in bioethics and medical applications [1]. These principles dictate that we should provide tutoring which maximizes the achieved knowledge while minimizing a student’s effort. In particular, we focus on modifying Bayesian Knowledge Tracing (BKT) [2] to better realize these ethical principles. To this end, we propose the Bayesian-Bayesian Knowledge Tracing (B2KT) model and demonstrate its advantages for equitable tutoring in several experiments. Furthermore, we investigate the relation between the commonly considered AUC score and the derived tutoring policies, finding that even if a BKT model appears fair in terms of the AUC score, the derived tutoring policies can be inequitable.
In summary, we make the following contributions: (i) We propose a unifying definition of equitable tutoring motivated by ethical principles. (ii) We propose the B2KT model which allows for effective individualization and demonstrate its benefits concerning equitable tutoring. (iii) We highlight that focusing on equity in terms of AUC can be insufficient to ensure equitable tutoring in terms of our definition.
A longer version of this paper with additional experimental results and extended discussion is available [15].
2. RELATED WORK
Fairness in online education and BKT. Several works have considered fairness in data-driven educational systems and intelligent tutoring, e.g., [7, 4, 17, 8]. In [7], the authors discussed the fairness implications of using data-driven predictive models to support education. They identified sources of bias and discrimination in “the process of developing and deploying these systems”, and discussed high-level possibilities to improve the fairness of systems in the “action step”. In [8, 17], it was investigated how different data sources can provide helpful information to predict students’ success in education. Key insights were that different data sources can help to make better predictions but differ in whether they over- or underestimate students’ success [17], and that such predictions can exhibit gender and racial bias under some fairness measures, which can be partly alleviated through post-hoc adjustments [8]. In [4], fairness in the context of BKT was studied, and it was found that tutoring policies based on inaccurate BKT models can be inequitable when considering the difference in learning success for different subpopulations as a measure of unfairness. Related work also considers adopting a Bayesian perspective for realizing fair decision rules under model uncertainty [3] and fairness in the context of non-i.i.d. data [19].
Individualization in BKT. Several papers have studied individualization of BKT models per student, e.g., [9, 10, 18]. In [10], the prior-per-student model was introduced, which uses a student-specific parameter characterizing each student’s individual knowledge. In [18], individualization was achieved by defining student- and skill-specific parameters which are fitted through gradient descent.
Instructional policies. Key for achieving equity according to our definition are instructional policies which stop practicing a skill at the right time. This problem has for instance been considered in [6, 12]. Further related work has investigated approaches leveraging deep models for creating policies to quickly assess students’ knowledge [16] and using reinforcement learning for optimizing tutoring policies [14, 5].
3. BACKGROUND & NOTATION
Bayesian Knowledge Tracing. Bayesian Knowledge Tracing (BKT) [2] is a model characterizing the skill acquisition process of students. For a single skill, it can be understood as a standard hidden Markov model in which the binary (latent) state encodes the mastery of the skill, and the binary observations indicate whether a practice opportunity for the skill was solved correctly. Upon practicing a not yet mastered skill, the student acquires the skill with probability $p_{\mathrm{learn}}$. Once a skill is mastered, it remains mastered. If a student has mastered the skill practiced by an exercise, they solve this exercise correctly with probability $1 - p_{\mathrm{slip}}$. If a student has not mastered the skill, they guess the correct answer with probability $p_{\mathrm{guess}}$. At the beginning, a student has already mastered the skill with probability $p_{\mathrm{init}}$.
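To make the generative process described above concrete, the following minimal sketch samples a practice sequence for a single skill. The function name `simulate_bkt`, the argument names, and the parameter values in the usage lines are illustrative assumptions and are not taken from the paper.

```python
import random


def simulate_bkt(p_init, p_learn, p_guess, p_slip, n_steps, rng):
    """Sample (mastered, correct) pairs for one skill from a BKT model.

    All argument names are hypothetical labels for the four BKT
    probabilities: initial mastery, learning, guessing, and slipping.
    """
    mastered = rng.random() < p_init
    trace = []
    for _ in range(n_steps):
        # A mastered skill is answered correctly unless the student slips;
        # an unmastered skill is answered correctly only by guessing.
        p_correct = 1.0 - p_slip if mastered else p_guess
        correct = rng.random() < p_correct
        trace.append((mastered, correct))
        # Learning can only happen while the skill is not yet mastered;
        # once mastered, the skill is never forgotten.
        if not mastered:
            mastered = rng.random() < p_learn
    return trace


# Illustrative (assumed) parameter values.
example_trace = simulate_bkt(0.1, 0.2, 0.2, 0.1, n_steps=20, rng=random.Random(0))
```

Because mastery is absorbing, the first component of the sampled pairs can switch from `False` to `True` at most once.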
Notation. We consider the interaction of students with an intelligent tutoring system. The interaction history up to time $t$ is denoted as $\mathcal{H}_t = ((s_1, c_1), \ldots, (s_t, c_t))$, where $s_i \in \mathcal{S}$ is the skill practiced through an exercise at time $i$, $c_i \in \{0, 1\}$ is an indicator of whether the exercise was solved correctly, and $\mathcal{S}$ is the set of skills. In the context of BKT, we refer to the random variables (RVs) indicating whether skill $s$ is mastered at time $t$ as $Z^s_t$ and to the RVs indicating whether an exercise practicing that skill would be solved correctly as $C^s_t$. Sometimes we add another superscript $j$ to indicate the student the RVs correspond to. Upper-case terms like $Z^s_t$ denote RVs and their lower-case counterparts like $z^s_t$ denote particular instantiations.
4. EQUITABLE TUTORING
In this section, we provide a definition of equity in intelligent tutoring and discuss its operationalization.
4.1 Definition
We consider a tutoring setting in which a total of $n$ skills ought to be taught to a set of students $\mathcal{J}$ by an intelligent tutoring system employing a tutoring policy $\pi$. This policy maps histories consisting of observations of a student’s learning process to an exercise to be practiced next or to a stop action $a_{\mathrm{stop}}$, which ends the teaching process. Each student $j \in \mathcal{J}$ can have different learning characteristics. For student $j$, every tutoring policy $\pi$ has an expected stopping time $T^j(\pi)$, i.e., the expected time of executing the stop action, and an expected knowledge $K^j(\pi)$ acquired by the end of the teaching process, i.e., $K^j(\pi)$ is the expected number of mastered skills upon executing the stop action.
Our notion of equity is based on the ethical principles of beneficence and non-maleficence. We understand them to translate into the objective of maximizing a student’s knowledge using as little of the student’s resources as possible, i.e., performing a minimal number of exercises:
Definition 1. Consider a tutoring system employing a tutoring policy $\pi$. The policy $\pi$ is equitable for student $j$ iff
$$\pi \in \arg\min_{\pi'} T^j(\pi') \quad \text{subject to} \quad K^j(\pi') = n.$$
A tutoring system is equitable if its tutoring policy is equitable for all students $j \in \mathcal{J}$.
Thus, informally, a tutoring system is equitable if it can teach all skills in the minimal amount of time possible to any student. Note that our notion of equity is strongly related to that introduced in [4] (cf. discussion below). In the above definition, we implicitly assume that all students can master all skills.¹ Importantly, a tutoring system can only be equitable if it is adaptive to the students interacting with it. In particular, it has to individualize the assignment of exercises and needs to carefully select the stop action in order to achieve equity. The above definition describes an idealized notion of equity which in general cannot be achieved, as the tutoring system would have to follow the optimal policy for each student right from the beginning. Nevertheless, we can compare tutoring policies in the spirit of the above definition. In particular, given two tutoring policies $\pi_1$ and $\pi_2$ which both teach the same number of skills, we consider the policy $\pi_1$ to be more equitable than $\pi_2$ if for all students $j \in \mathcal{J}$ it holds that
$$T^j(\pi_1) \le T^j(\pi_2).$$
We note that our notion of equity is strongly related to that introduced in [4]. In [4], the authors “assume that an equitable outcome is when students from different demographics reach the same level of knowledge after receiving instruction”. The desideratum of achieving knowledge fast is later also added to their notion of equity whereas in our case it is a fundamental constituent. Furthermore, our interest extends to downstream implications of such a definition of equity, namely the individualization of knowledge tracing.
Theoretical Implications. Our definition of equity leads to the following (probably obvious) but important observation:
Observation 1. A tutoring system for a population of students with different learning characteristics can only be equitable if its tutoring policy is adaptive to the students.
Thus, we note that if the tutoring policy is derived deterministically from a non-adaptive, initially incorrect model of the students, the tutoring system will in general not be equitable. Achieving equity would require basing a policy on rich side information in order to employ an optimal tutoring policy for each student right from the beginning. But such rich side information might not be available.
4.2 Operationalization
Tutoring policies are often either simple fixed strategies or derived from a model, e.g., a BKT model, such that each knowledge component is repeatedly exercised until it is mastered with a certain probability. But tutoring policies based on incorrect or non-adaptive models can result in a student not acquiring all skills, or can prescribe more practice opportunities than necessary. Thus, the following two general directions are important for building equitable tutoring systems: (i) Using side information. Any available side information about a student should be used to individualize the underlying models. In the context of classical BKT models, the side information could be used to make an initial guess about the key parameters of the model (the initial mastery, learning, guess, and slip probabilities). (ii) Online adaptation. Even when using side information, a model is likely not perfectly individualized to all students. To further adjust the models in such cases, online adaptation of the models during interaction seems promising.
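The threshold-style policies mentioned above can be sketched as follows, assuming the textbook BKT mastery update (condition on the observed answer, then apply the learning step). The function names, the fixed answer sequence, and all parameter values are illustrative assumptions rather than the setup used in the experiments.

```python
def bkt_update(p_mastery, correct, p_learn, p_guess, p_slip):
    """One step of the textbook BKT update of the mastery probability.

    Assumes parameters strictly between 0 and 1, so the denominator is
    never zero.
    """
    if correct:
        num = p_mastery * (1.0 - p_slip)
        den = num + (1.0 - p_mastery) * p_guess
    else:
        num = p_mastery * p_slip
        den = num + (1.0 - p_mastery) * (1.0 - p_guess)
    posterior = num / den
    # After the attempt, an unmastered skill may be learned.
    return posterior + (1.0 - posterior) * p_learn


def threshold_policy(answers, p_init, p_learn, p_guess, p_slip, threshold=0.95):
    """Practice one skill until the model's mastery estimate reaches the threshold.

    Returns the number of exercises given and the final mastery estimate.
    """
    p_mastery = p_init
    steps = 0
    for correct in answers:
        if p_mastery >= threshold:
            break  # stop action for this skill
        p_mastery = bkt_update(p_mastery, correct, p_learn, p_guess, p_slip)
        steps += 1
    return steps, p_mastery


# Same observed answers, two different assumed learning rates.
answers = [True] * 20
steps_slow_model, _ = threshold_policy(answers, 0.1, 0.2, 0.2, 0.1)
steps_fast_model, _ = threshold_policy(answers, 0.1, 0.6, 0.2, 0.1)
```

For the same observed answers, the model that assumes faster learning stops earlier. This is the mechanism by which a mismatched model can stop before a slow learner has actually mastered the skill, or keep a fast learner practicing longer than needed.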
5. PROPOSED APPROACH: B2KT
In this section, we propose a Bayesian variant of the classical BKT model which enables online adaptation to a student’s parameters and from which individualized (and potentially more equitable) policies can be derived, cf. Figure 1.
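We assume that each student $j$ has their own learning dynamics, described by student-specific parameters $\theta^j$. If the learning dynamics can be described using a BKT model, $\theta^j$ comprises the initial mastery, learning, guess, and slip probabilities, i.e., $\theta^j = (p_{\mathrm{init}}, p_{\mathrm{learn}}, p_{\mathrm{guess}}, p_{\mathrm{slip}})$. We assume these learning dynamics to apply to the acquisition of all skills. In practice, we do not know these parameters and need to infer them. To this end, we take a Bayesian approach: we assume a set of possible parameters $\Theta$ such that $\theta^j \in \Theta$ and a prior distribution $P(\theta)$ over $\Theta$. Based on observations of a student’s practice exercises collected in the history $\mathcal{H}_t$, we can compute the probability that a student has mastered a specific skill and base tutoring policies thereon. As we do not know $\theta^j$, this requires marginalizing out the (unknown) parameters $\theta$. In this way, the different possible parameters and their influence on predicting the knowledge state get re-weighted according to the available data. In particular, we compute
$$P(Z^{s,j}_t = 1 \mid \mathcal{H}_t) = \sum_{\theta \in \Theta} P(Z^{s,j}_t = 1 \mid \mathcal{H}_t, \theta)\, P(\theta \mid \mathcal{H}_t),$$
where $Z^{s,j}_t$ is a random variable indicating whether skill $s$ is mastered at time $t$ by student $j$. For only a few possible parameters $\theta \in \Theta$, the above equation can be solved exactly by enumeration and by observing that both terms $P(Z^{s,j}_t = 1 \mid \mathcal{H}_t, \theta)$ and $P(\theta \mid \mathcal{H}_t)$ can be computed efficiently by the following recursion:
$$\alpha^s_1(z) = P(Z^s_1 = z \mid \theta)\, P(c^s_1 \mid Z^s_1 = z, \theta), \qquad \alpha^s_i(z) = P(c^s_i \mid Z^s_i = z, \theta) \sum_{z' \in \{0, 1\}} P(Z^s_i = z \mid Z^s_{i-1} = z', \theta)\, \alpha^s_{i-1}(z').$$
Here $\mathcal{H}^s_t$ collects all observations with respect to practicing the $s$th skill up to time $t$, $c^s_i$ is the $i$th entry of $\mathcal{H}^s_t$, and, within the recursion, $Z^s_i$ denotes the mastery state at the $i$th practice opportunity of skill $s$. Then, with $m_s = |\mathcal{H}^s_t|$,
$$P(Z^{s,j}_t = 1 \mid \mathcal{H}_t, \theta) = \frac{\alpha^s_{m_s}(1)}{\sum_{z} \alpha^s_{m_s}(z)}, \qquad P(\theta \mid \mathcal{H}_t) \propto P(\theta) \prod_{s' \in \mathcal{S}} \sum_{z} \alpha^{s'}_{m_{s'}}(z).$$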
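The marginalization over candidate parameters described in this section can be sketched by enumeration as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: each candidate parameter set is assumed to be a tuple (initial mastery, learning, guess, slip probability), skills are assumed independent given the parameters, the mastery estimate is the filtering posterior at the most recent practice opportunity of the queried skill, and the names `forward` and `b2kt_mastery` as well as all numeric values are hypothetical.

```python
def forward(observations, theta):
    """HMM forward message (alpha(0), alpha(1)) after all answers for one skill."""
    p_init, p_learn, p_guess, p_slip = theta
    alpha = None
    for correct in observations:
        if alpha is None:
            state = (1.0 - p_init, p_init)
        else:
            # Unmastered -> mastered with p_learn; mastered stays mastered.
            state = (alpha[0] * (1.0 - p_learn), alpha[0] * p_learn + alpha[1])
        emit = (p_guess, 1.0 - p_slip) if correct else (1.0 - p_guess, p_slip)
        alpha = (state[0] * emit[0], state[1] * emit[1])
    return alpha


def b2kt_mastery(history, skill, thetas, prior):
    """P(skill mastered | history), marginalized over candidate parameters.

    history: list of (skill_id, correct) pairs; thetas: candidate parameter
    tuples; prior: prior probability of each candidate.
    """
    practiced = {s for s, _ in history}
    weighted_mastery = 0.0
    evidence = 0.0
    for theta, p_theta in zip(thetas, prior):
        weight = p_theta      # becomes P(theta) * P(history | theta)
        mastery = theta[0]    # initial mastery if the skill was never practiced
        for s in practiced:
            alpha = forward([c for s2, c in history if s2 == s], theta)
            likelihood = alpha[0] + alpha[1]
            weight *= likelihood
            if s == skill:
                mastery = alpha[1] / likelihood
        weighted_mastery += weight * mastery
        evidence += weight
    return weighted_mastery / evidence


# Two assumed candidate learner types and a uniform prior over them.
slow = (0.1, 0.05, 0.2, 0.1)
fast = (0.1, 0.30, 0.2, 0.1)
history = [(0, False), (0, True), (0, True), (0, True), (1, True)]
estimate = b2kt_mastery(history, 1, [slow, fast], [0.5, 0.5])
```

Observations on skill 0 shift the posterior weight between the two candidates, which in turn changes the mastery estimate for skill 1. This reflects the intuition that practicing more skills gives the model more opportunities to learn about the student. For long histories, a practical implementation would typically work in log space to avoid numerical underflow.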
6. EXPERIMENTS
We perform experiments on synthetic data and consider settings in which the learning rate is assumed to be unknown. This is motivated by previous work which has identified the learning rate as a key parameter for improving BKT-based models [18]. In all presented results, we report the average stopping time of a policy for a population of students and the average number of acquired skills, the latter as a percentage (% skills). We consider Threshold($\tau$) curricula based on knowledge tracing models. These curricula repeatedly exercise a skill until it is mastered with a probability of at least $\tau$ under the model. We consider the following models: (i) BKT: the classical BKT model with fixed parameters; (ii) B2KT: the proposed Bayesian-BKT model.
Table 1. Threshold(0.95) policies for slow and fast learners and different numbers of skills: percentage of skills acquired (% skills) and average stopping time (time).
Threshold(0.95) | 1 skill, slow: % skills | 1 skill, slow: time | 1 skill, fast: % skills | 1 skill, fast: time | 5 skills, slow: % skills | 5 skills, slow: time | 5 skills, fast: % skills | 5 skills, fast: time | 20 skills, slow: % skills | 20 skills, slow: time | 20 skills, fast: % skills | 20 skills, fast: time
---|---|---|---|---|---|---|---|---|---|---|---|---
BKT slow | 97.00 | 24.14 | 99.50 | 9.49 | 97.20 | 122.80 | 99.90 | 66.00 | 97.55 | 492.64 | 99.90 | 183.84
BKT fast | 61.00 | 13.85 | 97.50 | 5.96 | 62.60 | 71.98 | 96.10 | 29.81 | 64.20 | 288.76 | 97.23 | 120.59
BKT mixed | 95.00 | 23.51 | 100.00 | 8.33 | 95.40 | 113.67 | 99.90 | 40.93 | 94.53 | 466.55 | 99.68 | 169.86
B2KT | 94.50 | 24.04 | 100.00 | 7.88 | 97.70 | 120.87 | 98.40 | 32.61 | 96.68 | 493.00 | 96.66 | 120.05
Experimental Results
Students with different learning behaviors. We study the equity of tutoring policies when the students are sampled uniformly from two groups, each containing students with learning dynamics described by a ground truth BKT model. In particular, we build on the experimental setup from [4] where there is a group of slow learners (BKT slow) and fast learners (BKT fast). In [4], the authors also fitted a BKT model to interaction data from students from both groups; we refer to the corresponding BKT model as BKT mixed. The parameters of the considered models are as follows:
BKT slow | ||||
BKT fast | ||||
BKT mixed |
We considered the interaction with 400 students, 200 each from the slow and the fast group, and compared the performance of Threshold(0.95) tutoring policies based on these models for different numbers of skills to be taught; the results are shown in Table 1. We observe that when the student properties and the BKT model used for the threshold policy are mismatched, either only a small fraction of the skills (clearly below 95 %) is acquired or more time than necessary is spent exercising. The mismatch issue is alleviated in the case of the B2KT model (assuming a uniform prior over both types of students), in particular for a larger number of skills. Intuitively, this is because, in the case of multiple skills, the model has more opportunities to learn about the students’ characteristics and to leverage this knowledge in later tutoring. This fact is also illustrated in Figure 2, in which we reproduce and extend an experiment from [4] that compares the “equity gap” (the difference in the percentage of skills mastered by fast and slow students, respectively) to the number of excess learning opportunities. Importantly, B2KT becomes more equitable as more skills are taught.
Out-of-distribution generalization. We test whether B2KT can help with aspects relevant to inclusion and diversity. In particular, we consider a stylized mismatch setting in which a tutoring system interacts with students whose learning behavior was not considered when building the system. In addition to the previous two types of students, we assume a third type of learner (BKT med) with the following parameters: . We considered Threshold(0.95) policies based on BKT models of slow and fast learners and the B2KT model with a uniform prior over slow and fast learners. Our results are presented in Table 2. We observe that the policies derived from the B2KT model perform comparably to those derived from the true model (although the true model has zero posterior probability), whereas the other models yield policies that are worse in terms of stopping at the right time or teaching the right number of skills. This property of B2KT can be helpful for promoting inclusion, e.g., when interacting with students who were underrepresented in the data used for building an intelligent tutoring system.
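As a worked example of the “equity gap” defined above, the following sketch computes it from the 20-skill columns of Table 1 (percentage of skills acquired by slow and fast learners). The dictionary and variable names are illustrative; the numbers are copied from the table.

```python
# (% skills for slow learners, % skills for fast learners), 20 skills, Table 1.
table1_20_skills = {
    "BKT slow": (97.55, 99.90),
    "BKT fast": (64.20, 97.23),
    "BKT mixed": (94.53, 99.68),
    "B2KT": (96.68, 96.66),
}

# Equity gap: percentage of skills mastered by fast minus slow students.
equity_gap = {
    model: round(fast - slow, 2)
    for model, (slow, fast) in table1_20_skills.items()
}

for model, gap in equity_gap.items():
    print(f"{model}: {gap:+.2f} percentage points")
```

With these numbers, the policy based on the fast-learner model leaves a gap of about 33 percentage points, whereas the gap for B2KT is close to zero.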
Fair next step predictions do not necessarily imply equitable tutoring. We show empirically that models which might appear to be fair when looking at their AUCs for different groups of students do not necessarily yield equitable tutoring policies. In particular, we again focus on a student population consisting of two groups of students:
Group 1 | ||||
Group 2 |
We generated data of 400 students (200 from group 1 and group 2, respectively) in a setting with 20 skills and 1000 random exercises from a BKT model. The true model of group 1’s students achieved an AUC of 0.7393 for group 1’s students, while the true model of group 2’s students achieved an AUC of 0.6710 for group 2’s students (cf. Table 3).
Looking only at the AUC, the two models appear rather inequitable (there is no group parity). Thus, it might appear sensible to use a BKT model for tutoring which has comparable AUCs for both groups in order to promote equity. For instance, a BKT model using parameters , , , achieves an AUC of 0.6719 on group 1’s students and of 0.6733 on group 2’s students, respectively. That is, the AUCs on the two groups are approximately equal. However, when looking at the different models with respect to their tutoring performance using a Threshold()-policy, we observe a very different picture, cf. Table 3. In particular, the fraction of skills taught differs significantly between the two groups: in group 1, only 28.68 % of the skills are acquired by the students on average, while in group 2, 74.70 % of the skills are acquired. This finding is closely related to the observation that models with greatly different characteristics can have similar AUCs [13].
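One way to see why similar AUCs need not imply similar tutoring behavior is that the AUC depends only on the ranking of the predictions, not on their calibration. The following sketch uses made-up predictions and labels (not data from the paper) and a hypothetical helper `auc` that computes the AUC as the fraction of correctly ordered positive/negative pairs, counting ties as one half.

```python
def auc(scores, labels):
    """AUC as the probability that a positive example outranks a negative one."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up next-step predictions and outcomes for illustration only.
labels = [0, 0, 1, 0, 1, 1]
scores_a = [0.10, 0.30, 0.35, 0.40, 0.60, 0.90]
# A monotone transformation preserves the ranking, hence the AUC.
scores_b = [s ** 3 for s in scores_a]

same_auc = auc(scores_a, labels) == auc(scores_b, labels)
above_half_a = sum(s >= 0.5 for s in scores_a)
above_half_b = sum(s >= 0.5 for s in scores_b)
```

Both prediction vectors have the same AUC, yet they cross a fixed probability threshold at different points. A threshold-based tutoring policy depends on exactly such crossings, so it can behave differently under two models that a ranking-based metric cannot distinguish.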
Table 2. Threshold(0.95) policies for students of type BKT med and different numbers of skills: percentage of skills acquired (% skills) and average stopping time (time).
Threshold(0.95) | 1 skill: % skills | 1 skill: time | 5 skills: % skills | 5 skills: time | 20 skills: % skills | 20 skills: time
---|---|---|---|---|---|---
BKT slow | 99.50 | 11.28 | 99.55 | 56.45 | 99.59 | 225.25
BKT fast | 90.75 | 7.61 | 91.55 | 37.59 | 91.70 | 151.36
BKT mixed | 99.50 | 10.46 | 99.00 | 51.71 | 99.21 | 211.29
BKT med | 98.25 | 8.82 | 97.35 | 45.59 | 97.84 | 184.20
B2KT | 98.75 | 10.33 | 97.50 | 48.80 | 94.19 | 168.36
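Table 3. AUC, percentage of skills acquired (% skills), and average stopping time (time) per group, for a BKT model that is group-fair with respect to AUC and for the true model of the respective group.
group | group-fair w.r.t. AUC: AUC | group-fair w.r.t. AUC: % skills | group-fair w.r.t. AUC: time | true model of group: AUC | true model of group: % skills | true model of group: time
---|---|---|---|---|---|---
group 1 | 0.6719 | 28.68 | 61 | 0.7393 | 96.13 | 308
group 2 | 0.6733 | 74.70 | 64 | 0.6710 | 96.35 | 105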
7. CONCLUSION
We considered the equity and fairness of curricula derived from knowledge tracing models, and provided a unifying definition of equitable tutoring systems. Our definition is, in many practical settings, not realizable but suggests that the individualization of tutoring policies to students is key for realizing equity. We proposed the B2KT model, a Bayesian variant of the classical BKT model, and demonstrated in various experiments that it can be beneficial for realizing equitable tutoring systems and promoting IDEA more generally. Furthermore, we highlighted that improving and evaluating models with the main focus on next-step predictions can be insufficient to develop equitable tutoring systems.
8. ACKNOWLEDGMENTS
Adish Singla acknowledges support by the European Research Council (ERC) under the Horizon Europe programme (ERC StG, grant agreement No. 101039090).
Sebastian Tschiatschek acknowledges funding by the Vienna Science and Technology Fund (WWTF) and the City of Vienna through project ICT20-058.
9. REFERENCES
[1] T. L. Beauchamp, J. F. Childress, et al. Principles of biomedical ethics. Oxford University Press, USA, 2001.
[2] A. T. Corbett and J. R. Anderson. Knowledge tracing: Modeling the acquisition of procedural knowledge. User Modeling and User-Adapted Interaction, 4(4):253–278, 1994.
[3] C. Dimitrakakis, Y. Liu, D. C. Parkes, and G. Radanovic. Bayesian fairness. Proceedings of the AAAI Conference on Artificial Intelligence, 33(01):509–516, 2019.
[4] S. Doroudi and E. Brunskill. Fairer but not fair enough on the equitability of knowledge tracing. In Proceedings of the 9th International Conference on Learning Analytics & Knowledge, pages 335–339, 2019.
[5] J. He-Yueya and A. Singla. Quizzing policy using reinforcement learning for inferring the student knowledge state. International Educational Data Mining Society, 2021.
[6] T. Käser, S. Klingler, and M. Gross. When to stop? Towards universal instructional policies. In Proceedings of the Sixth International Conference on Learning Analytics & Knowledge, pages 289–298, 2016.
[7] R. F. Kizilcec and H. Lee. Algorithmic fairness in education. arXiv preprint arXiv:2007.05443, 2020.
[8] H. Lee and R. F. Kizilcec. Evaluation of fairness trade-offs in predicting student success. arXiv preprint arXiv:2007.00088, 2020.
[9] J. I. Lee and E. Brunskill. The impact on individualizing student models on necessary practice opportunities. International Educational Data Mining Society, 2012.
[10] Z. A. Pardos and N. T. Heffernan. Modeling individualization in a Bayesian networks implementation of knowledge tracing. In International Conference on User Modeling, Adaptation, and Personalization, pages 255–266. Springer, 2010.
[11] G. Paviotti, P. G. Rossi, and D. Zarka. Intelligent tutoring systems: an overview. Pensa Multimedia, pages 1–176, 2012.
[12] R. Pelánek. Conceptual issues in mastery criteria: Differentiating uncertainty and degrees of knowledge. In International Conference on Artificial Intelligence in Education, pages 450–461. Springer, 2018.
[13] R. Pelánek. The details matter: methodological nuances in the evaluation of student models. User Modeling and User-Adapted Interaction, 28(3):207–235, 2018.
[14] A. Singla, A. N. Rafferty, G. Radanovic, and N. T. Heffernan. Reinforcement learning for education: Opportunities and challenges. arXiv preprint arXiv:2107.08828, 2021.
[15] S. Tschiatschek, M. Knobelsdorf, and A. Singla. Equity and Fairness of Bayesian Knowledge Tracing. arXiv preprint arXiv:2205.02333, 2022.
[16] Z. Wang, S. Tschiatschek, S. Woodhead, J. M. Hernández-Lobato, S. Peyton Jones, R. G. Baraniuk, and C. Zhang. Educational question mining at scale: Prediction, analysis and personalization. AAAI Conference on Artificial Intelligence, 35(17):15669–15677, 2021.
[17] R. Yu, Q. Li, C. Fischer, S. Doroudi, and D. Xu. Towards accurate and fair prediction of college success: Evaluating different sources of student data. International Educational Data Mining Society, 2020.
[18] M. V. Yudelson, K. R. Koedinger, and G. J. Gordon. Individualized Bayesian knowledge tracing models. In International Conference on Artificial Intelligence in Education, pages 171–180. Springer, 2013.
[19] W. Zhang, J. C. Weiss, S. Zhou, and T. Walsh. Fairness amidst non-iid graph data: A literature review. arXiv preprint arXiv:2202.07170, 2022.
¹ Our definition can be easily generalized to account for an individual student’s maximal achievable knowledge.
© 2022 Copyright is held by the author(s). This work is distributed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) license.