Learning Expert Models for Educationally Relevant Tasks using Reinforcement Learning
Christopher Maclellan, Adit Gupta
Jun 30, 2021 16:20 UTC+2
—
Session B1
—
Keywords: Tutor Authoring, Expert Model Authoring, Reinforcement Learning, Computational Models of Learning
Abstract:
There has been great progress towards Reinforcement Learning (RL) approaches that can achieve expert performance across a wide range of domains. However, researchers have not yet applied these models to automatically learn expert models for educationally relevant tasks, such as those taught within tutoring systems and educational games. In this paper, we explore the use of Proximal Policy Optimization (PPO) for learning expert models for tutoring system tasks. We explore two alternative state and action space representations for this RL approach in the context of two intelligent tutoring systems (a fraction arithmetic tutor and a multicolumn addition tutor). We also compare the performance of these models to a computational model of learning built using the Apprentice Learner architecture. To evaluate these models, we look at whether they achieve mastery and how many training opportunities they take to do so. Our analysis shows that at least one PPO model is able to successfully achieve mastery within both tutors, suggesting that RL models might be successfully applied to learn expert models for educationally relevant tasks. Further, we find that the Apprentice Learner model also achieves mastery, but requires substantially less training (hundreds to thousands of times fewer examples) than the PPO approaches. Finally, we find that there is an interaction between the PPO representation and task (one representation is better for one tutor and the other representation is better for the other tutor), suggesting that the design of the state and action representations for RL approaches is important for success. Our work showcases the promise of RL approaches for expert model discovery in educationally relevant tasks, but highlights multiple challenges that will need further research to overcome.