Introduction to the Proceedings
Preface
The Georgia Institute of Technology is proud to host the seventeenth International Conference on Educational Data Mining (EDM) in Atlanta, Georgia, July 14-July 17, 2024. EDM is the annual flagship conference of the International Educational Data Mining Society. This year’s theme is “New tools, new prospects, new risks – educational data mining in the age of generative AI.” The theme focuses on the movement from descriptive and predictive models to generative artificial intelligence (AI) and what that means for learning environments and processes. While the new methods unlock exciting new potentials for educational data mining, they also foreground many ethical considerations and risks that are associated with all types of machine learning and artificial intelligence. This year, we additionally welcomed research in the following areas: mitigating biases and harms that may result from model use, accounting for the stereotypes that are inherent to the large models that drive generative AI, separating the hype surrounding these new technologies from their potential in educational settings, and finding ways to use these models to better understand learning processes and support learning.
The scientific program for EDM 2024 includes:
Two keynote talks by outstanding researchers in the field, Tanja Käser (EPFL, Switzerland) and Carolyn Rosé (CMU, USA)
A plenary Test of Time award talk by Gautam Biswas (Vanderbilt University, USA)
A plenary Data Set award talk by Jakub Kužílek (Humboldt-Universität zu Berlin & German Research Center for Artificial Intelligence, Germany) and Martin Hlosta (Institute for Distance Learning and eLearning Research & Swiss Distance University of Applied Sciences, Switzerland)
Five tutorials (foundational as well as advanced)
Five workshops
Twelve paper presentation sessions on the topics of large language models in education (two sessions), affective computing, learning analytics and recommender systems, reinforcement learning and pedagogical agents, knowledge tracing and curricula, collaborative learning, prediction and supervised learning, research practices (two sessions), and computer science education (two sessions)
Two poster presentation sessions
An industry track and industry panel
A doctoral student consortium
The tutorials are: (1) Logistic Knowledge Tracing Tutorial, (2) Promoting Open Science in Educational Data Mining, (3) Thinking Causally in EDM, (4) Beyond Tutor Logs: Utilizing Sensor Data for Measuring Student Behavior, and (5) Tools for Planning and Analyzing Randomized Controlled Trials and A/B Tests. The workshops are: (1) 8th Educational Data Mining in Computer Science Education (CSEDM) Workshop, (2) Human-Centric eXplainable AI in Education (HEXED) Workshop, (3) Causal Inference in Educational Data Mining, (4) Leveraging Large Language Models for Next-Generation Educational Technologies, and (5) Educational Data Mining in Writing and Literacy Instruction.
The venue is the Global Learning Center of Georgia Tech, home of Georgia Tech’s Division of Lifetime Learning. The Global Learning Center is nestled in Tech Square, a burgeoning tech hub in Midtown Atlanta, attached to the Georgia Tech Hotel & Conference Center and across the street from the Tech Square Research Building and the Coda Building, home of many of the faculty of the institute’s College of Computing.
EDM 2024 received 81 submissions to the full papers track (10 pages), 68 to the short papers track (6 pages), and 29 to the poster and demo track (4 pages). The program committee accepted 21 full papers (26
The EDM 2024 industry track and the industry panel fostered exchange between industry application research and basic research. Four papers and two posters were included in the industry track. Panelists included Kristen DiCerbo (Khan Academy), Diego Zapata-Rivera (ETS Research Institute), Lewis Johnson (Alelo), and Bibi Groot (Eedi).
EDM 2024 also continued its tradition of providing opportunities for young researchers to present their work and receive feedback from their peers and senior researchers. The doctoral consortium this year features 12 such participants. EDM 2024 is especially proud to offer travel sponsorships to 15 students who will attend EDM thanks to this support.
We thank the sponsors of EDM 2024 for their generous support. We thank all the authors who submitted their work and the program committee members for their expert input. We thank the members of the organizing committee for their leadership, which made this conference possible. And a big thank you to the local organizing committee, who made this event memorable.
Carrie Demmans Epp | University of Alberta, Canada | Program Chair |
Benjamin Paaßen | Bielefeld University, Germany | Program Chair |
David Joyner | Georgia Institute of Technology, USA | General Chair |
July 14th, 2024
Atlanta, GA, USA
Organizing Committee
General Chairs
David Joyner – Georgia Institute of Technology, USA
Program Chairs
Benjamin Paaßen (they/them) – Bielefeld University, Germany
Carrie Demmans Epp (she/her) – University of Alberta, Canada
Equity, Diversity, and Inclusion Chairs
Anna Rafferty (she/her) – Carleton College, USA
Jie Tang (he/him) – Tsinghua University, China
Accessibility Chairs
Nigel Bosch (he/him) – University of Illinois Urbana-Champaign, USA
Industry Track Chairs
Carol M. Forsyth (she/her) – Educational Testing Service, USA
Avi Segal (he/him) – Ben-Gurion University of the Negev, Israel
Poster & Demo Track Chairs
Heeryung Choi (she/her) – University of Michigan, USA
Irena Koprinska (she/her) – The University of Sydney, Australia
Doctoral Consortium Chairs
Neil Heffernan – Worcester Polytechnic Institute, USA
Luc Paquette (he/him) – University of Illinois Urbana-Champaign, USA
JEDM Track Chairs
Maria Mercedes T. Rodrigo (she/her) – Ateneo de Manila University, Philippines
Agathe Merceron (she/her) – Berlin Hochschule für Technik, Germany
Jaclyn Ocumpaugh – University of Pennsylvania, USA
Workshop & Tutorials Chairs
Bita Akram (she/her) – North Carolina State University, USA
Sergey Sosnovsky – Utrecht University, Netherlands
Awards Chairs
Danielle McNamara – Arizona State University, USA
Cristóbal Romero – University of Córdoba, Spain
Marianne Winslett – University of Illinois Urbana-Champaign, USA
Scholarship Chairs
Luc Paquette (he/him) – University of Illinois Urbana-Champaign, USA
Ryan S. Baker – University of Pennsylvania, USA
Student Volunteer Chair
Sherry Sahebi (she/her) – University at Albany – SUNY, USA
Social Media & Publicity Chairs
Oleksandra Poquet – Technical University of Munich, Germany
Jill-Jênn Vie (he/him) – Inria Saclay, France
Web Chairs
Paul Salvador Inventado (he/him) – California State University Fullerton, USA
Ramkumar Rajendran (he/him) – Indian Institute of Technology Bombay, India
Rafael D. Araújo (he/him) – Federal University of Uberlândia, Brazil
Proceedings Chairs
Mirko Marras (he/him) – University of Cagliari, Italy
Maomi Ueno (he/him) – The University of Electro-Communications, Japan
Sponsorship Chairs
Hannah Moon (she/her) – Georgia Institute of Technology, USA
Online Experience Chair
Nick Lytle (he/him) – Georgia Institute of Technology, USA
Local Organizing Team
David Joyner – Georgia Institute of Technology, USA
Hannah Moon (she/her) – Georgia Institute of Technology, USA
Nick Lytle (he/him) – Georgia Institute of Technology, USA
Alex Duncan (he/him) – Georgia Institute of Technology, USA
IEDMS Officers
Tiffany Barnes | President | North Carolina State University, USA |
Anna Rafferty | Treasurer | Carleton College, USA |
IEDMS Board of Directors
Ryan Baker | University of Pennsylvania, USA |
Neil Heffernan | Worcester Polytechnic Institute, USA |
Sharon Hsiao | Santa Clara University, USA |
Tanja Käser | EPFL, CH |
Kenneth Koedinger | Carnegie Mellon University, USA |
Kalina Yacef | University of Sydney, Australia |
Senior Program Committee Members
Bita Akram | North Carolina State University |
Giora Alexandron | Weizmann Institute of Science |
Roger Azevedo | University of Central Florida |
Ryan Baker | University of Pennsylvania |
Tiffany Barnes | North Carolina State University |
Gautam Biswas | Vanderbilt University |
Ig Ibert Bittencourt | Federal University of Alagoas |
Nigel Bosch | University of Illinois Urbana-Champaign |
Anthony F. Botelho | University of Florida |
François Bouchet | Sorbonne Université - LIP6 |
Alex Bowers | Columbia University |
Min Chi | BeiKaZhouLi |
Anat Cohen | Tel-Aviv University |
Cristina Conati | The University of British Columbia |
Linda Corrin | Deakin University |
Alexandra Cristea | Durham University |
Michel Desmarais | Ecole Polytechnique de Montreal |
Fabiano Dorça | Universidade Federal de Uberlandia |
Michael Eagle | George Mason University |
Vanessa Echeverria | Monash University |
Yo Ehara | Tokyo Gakugei University |
Mingyu Feng | WestEd |
Carol Forsyth | Educational Testing Service |
Kobi Gal | The University of Edinburgh |
Praveen Garimella | International Institute of Information Technology |
Dragan Gasevic | Monash University |
Neil Heffernan | Worcester Polytechnic Institute |
Sharon Hsiao | Santa Clara University |
Xiao Hu | The University of Hong Kong |
Paul Salvador Inventado | California State University Fullerton |
Johan Jeuring | Utrecht University |
Srecko Joksimovic | Education Future, University of South Australia |
Jelena Jovanovic | University of Belgrade |
Tanja Käser | EPFL |
Enkelejda Kasneci | Technical University of Munich |
Hiroaki Kawashima | University of Hyogo |
Kirsty Kitto | University of Technology, Sydney |
Rene Kizilcec | Cornell University |
Simon Knight | UTS |
Kenneth Koedinger | Carnegie Mellon University |
Irena Koprinska | The University of Sydney |
Andrew Lan | University of Massachusetts Amherst |
Mirko Marras | University of Cagliari |
Roberto Angel Melendez | Instituto Tecnologico Superior de Misantla |
Agathe Merceron | Berliner Hochschule für Technik - Berlin State University of Applied Sciences |
Tanja Mitrovic | Intelligent Computer Tutoring Group, University of Canterbury, Christchurch |
Roger Nkambou | Université du Québec à Montréal |
Andrew Olney | University of Memphis |
Ranilson Paiva | Universidade Federal de Alagoas |
Luc Paquette | University of Illinois at Urbana-Champaign |
Zach Pardos | University of California, Berkeley |
Radek Pelánek | Masaryk University Brno |
Thomas Price | North Carolina State University |
Anna Rafferty | Carleton College |
R Rajalakshmi | VIT University, Chennai Campus |
Ramkumar Rajendran | IIT Bombay |
Steven Ritter | Carnegie Learning, Inc. |
Maria Mercedes T. Rodrigo | Department of Information Systems and Computer Science, Ateneo de Manila University |
Cristobal Romero | Department of Computer Sciences and Numerical Analysis |
Sherry Sahebi | University at Albany - SUNY |
Demetrios Sampson | Curtin University |
Olga Santos | UNED |
Avi Segal | Ben Gurion University |
Niels Seidel | FernUniversität in Hagen |
Atsushi Shimada | Kyushu University |
Sergey Sosnovsky | Utrecht University |
Tiffany Tang | Wenzhou-Kean University |
Jill-Jênn Vie | Inria Lille |
Lanqin Zheng | Beijing Normal University |
Program Committee Members
Mark Abdelshiheed | University of Colorado Boulder |
Faruk Ahmed | The University of Memphis |
Nazia Alam | North Carolina State University |
Laia Albó | Universitat Pompeu Fabra |
Laura Allen | University of Minnesota |
Isaac Alpizar Chacon | Utrecht University |
Mohammad Alshehri | Durham University |
Alvaro Alvares | Federal University of the Agreste of Pernambuco |
Gisele Arevalo | University of Alberta |
Simón Pedro Arguijo | Tecnológico Nacional de México campus Misantla |
T.S. Ashwin | Vanderbilt University |
Ayan Banerjee | Arizona State University |
Denilson Barbosa | University of Alberta |
Abhinava Barthakur | University of South Australia |
Prateek Basavaraj | American Association of State Colleges and Universities |
Marie Bexte | FernUniversität in Hagen |
Anis Bey | La Rochelle University |
Plaban Kumar Bhowmick | Indian Institute of Technology Kharagpur |
Nathaniel Blanchard | Pontificia Universidad Católica de Esmeraldas |
Maria Bolsinova | Tilburg University |
Conrad Borchers | Carnegie Mellon University |
Jesus G. Boticario | UNED |
Marie-Luce Bourguet | Queen Mary London |
Matthieu Brinkhuis | Utrecht University |
Ted Briscoe | University of Cambridge |
Julien Broisin | IRIT, Université Toulouse III - Paul Sabatier, Toulouse, France |
Minghao Cai | University of Alberta |
Renza Campagni | Università degli Studi di Firenze |
Jie Cao | University of Colorado Boulder |
Meng Cao | University of Memphis |
Nicolás Cardozo | Universidad de los Andes |
Paulo Carvalho | Carnegie Mellon University |
Guanliang Chen | Monash University |
Heeryung Choi | Massachusetts Institute of Technology |
Wei Chu | The University of Memphis |
Cheng-Yu Chung | National Yang Ming Chiao Tung University |
Ruth Cobos | Universidad Autónoma de Madrid |
Aubrey Condor | University of California Berkeley |
Maria de Los Angeles Constantino González | Tecnológico de Monterrey Campus Laguna |
Maria Cutumisu | McGill University |
Jesper Dannath | Universität Bielefeld |
Syaamantak Das | Indian Institute of Technology Bombay |
Alina Deriyeva | University of Bielefeld |
M Ali Akber Dewan | Athabasca University |
Nicholas Diana | Colgate University |
Fahima Djelil | IMT Atlantique |
Mohsen Dorodchi | University of North Carolina Charlotte |
Cristina Dumdumaya | University of Southeastern Philippines |
Nghia Duong-Trung | German Research Centre for Artificial Intelligence |
Luke Eglington | Amplify |
Yo Ehara | Tokyo Gakugei University |
Samira Elatia | University of Alberta |
Fahmid Morshed Fahid | North Carolina State University |
Yizhou Fan | Peking University |
Stephen Fancsali | Carnegie Learning, Inc. |
Effat Farhana | Vanderbilt University |
Márcia Fernandes | Federal University of Uberlândia |
Nigel Fernandez | University of Massachusetts Amherst |
Jeremiah Folsom-Kovarik | Soar Technology, Inc. |
Kazuma Fuchimoto | The University of Electro-Communications |
Hagit Gabbay | School of Education, Tel Aviv University |
Kobi Gal | The University of Edinburgh |
Wenbin Gan | National Institute of Information and Communications Technology |
Mark Gierl | University of Alberta |
Aldo Gordillo | Universidad Politécnica de Madrid (UPM) |
Guher Gorgun | University of Alberta |
Art Graesser | University of Memphis |
Sabine Graf | Athabasca University |
Monique Grandbastien | LORIA, Universite de Lorraine |
Julio Guerra | Universidad Austral de Chile |
Ella Haig | School of Computing, University of Portsmouth |
Ching Nam Hang | City University of Hong Kong |
Jiangang Hao | Educational Testing Service |
Ellie Hajarian | Athabasca University |
Jason Harley | McGill University |
Erik Harpstead | Carnegie Mellon University |
Fatima Harrak | Sorbonne Université - LIP6 |
Carl Haynes-Magyar | Carnegie Mellon University |
Surina He | University of Alberta |
Sami Heikkinen | LAB University of Applied Sciences |
Arto Hellas | Aalto University |
Erik Hemberg | ALFA |
Nicolas Hernandez | Nantes Université - LS2N CNRS UMR 6004 |
Martin Hlosta | The Swiss Distance University of Applied Sciences |
Anett Hoppe | TIB Leibniz Information Centre for Science and Technology; L3S Research Centre, Leibniz Universität Hannover |
Lingyun Huang | The Education University of Hong Kong |
Yun Huang | Austral University of Chile |
Paul Hur | University of Illinois at Urbana-Champaign |
Sébastien Iksal | LIUM - Le Mans Université, France |
Vladimir Ivančević | University of Novi Sad, Faculty of Technical Sciences |
Hyeji Jang | Ewha Womans University |
Emily Jensen | University of Colorado Boulder |
Lan Jiang | University of Illinois Urbana-Champaign |
David Joyner | Georgia Institute of Technology |
Jina Kang | University of Illinois Urbana-Champaign |
Tanja Käser | EPFL |
Mohammad Khalil | University of Bergen |
Ekaterina Kochmar | MBZUAI |
Elizabeth Koh | National Institute of Education, Nanyang Technological University, Singapore |
Sotiris Kotsiantis | University of Patras |
Vitomir Kovanovic | The University of South Australia |
Milos Kravcik | DFKI GmbH |
Swathi Krishnaraja | University of Potsdam |
Roland Kuhn | National Research Council of Canada |
Amruth Kumar | Ramapo College of New Jersey |
Vive Kumar | Athabasca University |
Vishal Kuvar | University of Minnesota |
Hollis Lai | University of Alberta |
Sébastien Lallé | Sorbonne University |
Andrew Lan | University of Massachusetts Amherst |
Juan Alfonso Lara Torralbo | University of Córdoba |
Mikel Larrañaga | University of the Basque Country |
Elise Lavoué | iaelyon, Université Jean Moulin Lyon 3, LIRIS |
Tai Le Quy | IU International University of Applied Sciences |
Vwen Yen Alwyn Lee | Nanyang Technological University |
Marie Lefevre | LIRIS - Université Lyon 1 |
Juho Leinonen | Aalto University |
Arun Balajiee Lekshmi Narayanan | University of Pittsburgh |
James Lester | North Carolina State University |
Chenglu Li | University of Florida |
Jiawei Li | Nanyang Technological University |
Warren Li | University of Michigan |
Jionghao Lin | Carnegie Mellon University |
Qi Liu | University of Science and Technology of China |
Zhexiong Liu | University of Pittsburgh |
Sonsoles López-Pernas | University of Eastern Finland |
Yu Lu | Beijing Normal University |
Vanda Luengo | Sorbonne Université - LIP6 |
Ivan Luković | University of Belgrade, Faculty of Organizational Sciences |
Collin Lynch | North Carolina State University |
Nick Lytle | Georgia Institute of Technology |
Boxuan Ma | Kyushu University |
Qiang Ma | Kyoto Institute of Technology |
Qianou Ma | Carnegie Mellon University Human-Computer Interaction Institute |
Jeffrey Matayoshi | McGraw Hill ALEKS |
Madeth May | University of Maine |
Gord McCalla | University of Saskatchewan |
Emma McDonald | University of Alberta |
Guilherme Medeiros Machado | ECE Paris |
Victor Menendez-Dominguez | Universidad Autónoma de Yucatán |
Donatella Merlini | Università di Firenze |
Caitlin Mills | University of Minnesota |
Tsunenori Mine | Kyushu University |
Tsubasa Minematsu | Kyushu University |
Sein Minn | INRIA |
Phaedra Mohammed | The University of the West Indies |
Luis Alberto Morales Rosales | Conacyt-Universidad Michoacana de San Nicolás de Hidalgo |
Matthew Moreno | McGill University |
Pedro Manuel Moreno-Marcos | Universidad Carlos III de Madrid |
Bradford Mott | North Carolina State University |
Kousuke Mouri | Tokyo University of Agriculture and Technology |
Calarina Muslimani | University of Alberta |
Tanya Nazaretsky | EPFL |
Huy Nguyen | University of Pittsburgh |
Narges Norouzi | Berkeley |
Ange Adrienne Nyamen Tato | École de Technologie Supérieure |
Teresa Ober | Educational Testing Service |
Püren Öncel | University of Minnesota Twin Cities |
Tounwendyam Frédéric Ouedraogo | Université Norbert ZONGO |
Maciej Pankiewicz | Warsaw University of Life Sciences |
Yeonjeong Park | Honam University |
Philip Pavlik | University of Memphis |
Jorge Poco | Fundação Getulio Vargas |
Paul Stefan Popescu | University of Craiova |
Oleksandra Poquet | Technical University of Munich |
Ethan Prihar | École polytechnique fédérale de Lausanne |
David Pritchard | Massachusetts Institute of Technology |
Miroslava Raspopović | Faculty of Information Technology |
Narjes Rohani | University of Edinburgh |
José Raúl Romero | University of Cordoba |
Daniela Rotelli | University of Pisa |
Mirka Saarela | University of Jyväskylä |
Maria Ofelia San Pedro | Roblox |
Sreecharan Sankaranarayanan | Carnegie Mellon University |
Mohammed Saqr | University of Eastern Finland |
Petra Sauer | Beuth University of Applied Sciences |
Robin Matthias Schmucker | Carnegie Mellon University |
Filippo Sciarrone | Universitas Mercatorum |
Kazuhisa Seta | Osaka Metropolitan University |
Lele Sha | Monash University |
Lei Shi | Newcastle University |
Yang Shi | Utah State University |
Jinnie Shin | University of Florida |
Aditi Singh | Cleveland State University |
Daevesh Singh | Indian Institute of Technology |
Stefan Slater | Teachers College |
Juyeong Song | Ewha Womans University |
Frank Stinar | University of Illinois - Urbana Champaign |
Vinitra Swamy | EPFL |
Anaïs Tack | KU Leuven |
Ling Tan | Australian Council for Educational Research |
Michelle Taub | University of Central Florida |
Daniela Teodorescu | LMU Munich |
Craig Thompson | The University of British Columbia |
Emiko Tsutsumi | The University of Electro-Communications |
Maomi Ueno | The University of Electro-Communications |
Maya Usher | Technion |
Masaki Uto | The University of Electro-Communications |
Sowmya Vajjala | National Research Council, Canada |
José Antonio Hiram Vázquez-López | Tecnológico Nacional de México, campus Instituto Tecnológico Superior de Misantla |
Oswaldo Velez-Langs | Universidad de Cordoba |
Rémi Venant | Le Mans Université - LIUM |
Olga Viberg | KTH Royal Institute of Technology |
Markel Vigo | The University of Manchester |
Alessandro Vivas | UFVJM |
Tuyet-Trinh Vu | SOICT-HUST |
Deliang Wang | The University of Hong Kong |
Zichao Wang | Rice University |
Christabel Wayllace | New Mexico State University |
Daniel Weitekamp | Carnegie Mellon University |
Jacob Whitehill | Worcester Polytechnic Institute |
Alistair Willis | The Open University |
Aaron Wong | University of Minnesota |
Chris Wong | University of Technology Sydney |
Jacqueline Wong | Utrecht University |
Beverly Park Woolf | University of Massachusetts |
Yi-jung Wu | University of Wisconsin-Madison |
Peter Wulff | Heidelberg University of Education |
Jia Xu | Guangxi University |
Elad Yacobson | Weizmann Institute of Science |
Seyma Yildirim-Erbasli | Concordia University of Edmonton |
Chengjiu Yin | Kyushu University |
Andrew Zamecnik | University of South Australia |
Diego Zapata-Rivera | Educational Testing Service |
Jiayi Zhang | University of Pennsylvania |
Lisa Zhang | University of Toronto |
Wenbin Zhang | Florida International University |
Yingbin Zhang | South China Normal University |
Lanqin Zheng | Beijing Normal University |
Stefano Zingaro | Università di Bologna |
Sponsors
Bronze Tier
Keynotes
Generalizable and Interpretable Models of Learning
Tanja Käser, EPFL School of Computer and Communication Sciences
Modeling learners’ knowledge, behavior, and strategies is at the heart of educational technology. Learner models serve as a basis for adapting the learning experience to students’ needs and supporting teachers in classroom orchestration. Consequently, a large body of research has focused on creating accurate models of student knowledge and behaviors. However, current modeling approaches are still limited: they are either defined for specific and well-structured domains (e.g., algebra, vocabulary learning), requiring substantial work from experts and limiting generalizability, or they lack interpretability. Recent advances in generative AI, in particular large language models (LLMs), have the potential to address these constraints. However, LLMs lack alignment with educational goals and grounded knowledge.
In this talk, I will discuss the key challenges in developing generalizable and explainable models, and our solutions to address them, including models tracking learning in open-ended environments and generalizing between different environments and populations. I will present our work on explainable AI, including a rigorous evaluation of existing approaches, the development of inherently interpretable models, as well as studies on effectively communicating model explanations. Finally, I will show some of our recent results combining “traditional” modeling approaches and LLMs to provide interpretable feedback and explanations while not compromising on model trustworthiness.
Opportunities and Challenges for LLM Agent-Based Support for Collaborative Design
Carolyn Rosé, Professor of Language Technologies and Human-Computer Interaction, School of Computer Science, Carnegie Mellon, USA
Supporting collaborative design is an ideal context for exploring the capabilities and limitations of LLM-based conversational agents. The ability to extract information in context and produce coherent-sounding text can be used to generate reflection triggers. In two recent studies, we have employed LLM-based conversational agents with the goal of triggering human reflection and learning during collaborative software design. As humans engage in collaborative design, they employ their own abilities to reason abstractly, to decompose problems, and to apply principles productively. Reflection is a valuable activity for promoting human learning in these settings. However, what humans are able to do in terms of abstraction and reasoning as part of their creative problem solving is precisely what is most difficult for LLM agents to do. In contrast to claims of “super-human performance” in the media, in this talk we will explore the complementarity of human intelligence and artificial intelligence. We will begin with results of a classroom study where LLM-based conversational agent support for collaborative software development was successful in increasing student learning. From there we will move on to argue in favor of a research agenda for exploiting this complementarity, both by applying AI capabilities to the betterment of human learning and by inspiring further extension of technical capabilities from insights derived from observation of human reflection and learning in collaborative design.
Test of Time Award Talk
Assessing Student Learning in Open-Ended Learning Environments: From Sequential to Multimodal Data Analysis
Gautam Biswas, Professor of Computer Science, Vanderbilt University, Nashville, TN, USA
From my early days as an AIED and EDM researcher, I have focused on understanding how students learn, especially in scenarios where they have to construct and apply their knowledge to problem-solving tasks. Collaborating with peers, we developed open-ended learning environments (OELEs) where K-12 students build scientific models and apply them to solve real-world problems. Challenges arise, as students have to navigate multiple tools in these computer-based environments. Some students overcome these challenges to become effective learners, while others struggle to progress, often applying suboptimal learning strategies. John Kinnebrew and I began analyzing learners’ activity logs to study these differences, resulting in the Differential Sequence Mining algorithm, which earned us the best paper award at EDM 2012. Expanding on this work, we developed the Contextualized Difference Mining method for understanding students’ learning behaviors, for which we are receiving this Test of Time award.
In my talk, I will review our work on Differential Sequence Mining and explore its applications in understanding students’ cognitive and metacognitive learning behaviors. We have leveraged these insights to provide adaptive support, helping students progress in our Open-Ended Learning Environments (OELEs). Beyond this, we have employed other sequential representations, such as Markov Chains and Hidden Markov Models, to analyze students’ activities and behaviors in the context of their learning and problem-solving tasks. Our OELEs have also advanced to facilitate students’ integrated learning of science, computing, and engineering problem-solving, including collaborative efforts in computer-based and embodied learning scenarios. From these richer learning environments, I will share insights into our latest efforts involving multimodal data analysis, incorporating video, speech, and activity logs. Using vision-based deep learning models and large language models (LLMs), we integrate analyses across modalities, offering a comprehensive understanding of students’ collaborative learning and problem-solving activities. In conclusion, I will discuss the potential implications of our work for shaping the future of learning in classrooms.
EDM Data Set Award
Jakub Kužílek
Senior Researcher, Computer Science Education / Computer Science and Society research group, Humboldt-Universität zu Berlin & German Research Center for Artificial Intelligence, Berlin, Germany.
Bio. Jakub Kužílek is affiliated with the Computer Science Education / Computer Science and Society research group at Humboldt-Universität zu Berlin and the Educational Technology Lab at the German Research Center for Artificial Intelligence (DFKI) as a senior researcher. His research investigates student self-regulated learning within online learning environments, collaborative group work, adaptive assessments, and feedback within the context of digital education. In the past, he developed (together with Martin Hlosta and Zdenek Zdrahal) the OU Analyse system, used to support 200,000 students of the Open University (United Kingdom) during their studies, and founded learning analytics research at Czech Technical University. He has led (and is currently doing research within) a project on AI use in assessment feedback at Humboldt-Universität. In parallel, he is leading a project on AI-driven recommendation systems in vocational education (KIPerWeb at DFKI).
Martin Hlosta
Senior Researcher, Institute for Distance Learning and eLearning Research (IFeL), Swiss Distance University of Applied Sciences (FFHS), Switzerland.
Bio. Martin Hlosta is a Senior Researcher at the Institute for Distance Learning and eLearning Research (IFeL) at the Swiss Distance University of Applied Sciences (FFHS). Before joining FFHS, he led research and development of OUAnalyse at The Open University (OU), a Predictive Learning Analytics project deployed in all undergraduate courses, improving student retention and teachers’ practice. It is one of the world’s largest deployments of analytics systems in education, and in 2020 it was selected by UNESCO as one of the four best projects using AI in education. His subsequent work focused on identifying factors contributing to large gaps for disadvantaged students in the UK, and another study showed how teachers’ use of predictive analytics in an online course can lower these gaps for students coming from low socio-economic areas. Currently, he is leading research and teaching in Learning Analytics at FFHS and works on various strands of how learning analytics can improve feedback. His most recent project, funded by Unity and Meta to target inequalities in education, explores how immersive Virtual Reality and enhanced analytics for reflection can help future teachers in South Africa.
JEDM Presentations
The Knowledge Component Attribution Problem for Programming: Methods and Tradeoffs with Limited Labeled Data
Yang Shi | NC State University | yshi26@ncsu.edu | |
Robin Schmucker | Carnegie Mellon University | rschmuck@cs.cmu.edu | |
Keith Tran | NC State University | ktran24@ncsu.edu | |
John Bacher | NC State University | jtbacher@ncsu.edu | |
Kenneth Koedinger | Carnegie Mellon University | koedinger@cmu.edu | |
Thomas Price | NC State University | twprice@ncsu.edu | |
Min Chi | NC State University | mchi@ncsu.edu | |
Tiffany Barnes | NC State University | tmbarnes@ncsu.edu |
Understanding students’ learning of knowledge components (KCs) is an important educational data mining task and enables many educational applications. However, in the domain of computing education, where program exercises require students to practice many KCs simultaneously, it is a challenge to attribute their errors to specific KCs and, therefore, to model student knowledge of these KCs. In this paper, we define this task as the KC attribution problem. We first demonstrate a novel approach to addressing this task using deep neural networks and explore its performance in identifying expert-defined KCs (RQ1). Because the labeling process takes costly expert resources, we further evaluate the effectiveness of transfer learning for KC attribution, using more easily acquired labels, such as problem correctness (RQ2). Finally, because prior research indicates the incorporation of educational theory in deep learning models could potentially enhance model performance, we investigated how to incorporate learning curves in the model design and evaluated their performance (RQ3). Our results show that in a supervised learning scenario, we can use a deep learning model, code2vec, to attribute KCs with a relatively high performance (AUC > 75% in two of the three examined KCs). Further using transfer learning, we achieve reasonable performance on the task without any costly expert labeling. However, the incorporation of learning curves shows limited effectiveness in this task. Our research lays important groundwork for personalized feedback for students based on which KCs they applied correctly, as well as more interpretable and accurate student models.
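The per-KC evaluation mentioned in the abstract can be made concrete with a small sketch of how an AUC might be computed separately for each KC. This is only an illustration under stated assumptions: it does not reproduce the authors' code2vec model, and the KC names, labels, and scores below are invented for the example.

```python
def auc(labels: list[int], scores: list[float]) -> float:
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical data: for each KC, expert labels (1 = error attributed to this KC)
# and model scores for four student submissions.
kc_data = {
    "loop_bounds": ([1, 0, 1, 0], [0.9, 0.2, 0.6, 0.4]),
    "conditionals": ([1, 1, 0, 0], [0.7, 0.3, 0.5, 0.1]),
}

# One AUC per KC, mirroring the idea of reporting performance KC by KC.
per_kc_auc = {kc: auc(y, s) for kc, (y, s) in kc_data.items()}
```

The pairwise formulation is quadratic in the number of examples, which is fine for a toy illustration; a rank-based computation would be preferable for realistic dataset sizes.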
Automated Evaluation of Classroom Instructional Support with LLMs and BoWs: Connecting Global Predictions to Specific Feedback
Jacob Whitehill | Worcester Polytechnic Institute | jrwhitehill@wpi.edu | |
Jennifer LoCasale-Crouch | Virginia Commonwealth University | locasalecrj@vcu.edu |
With the aim to provide teachers with more specific, frequent, and actionable feedback about their teaching, we explore how Large Language Models (LLMs) can be used to estimate “Instructional Support” domain scores of the CLassroom Assessment Scoring System (CLASS), a widely used observation protocol. We design a machine learning architecture that uses either zero-shot prompting of Meta’s Llama2, and/or a classic Bag of Words (BoW) model, to classify individual utterances of teachers’ speech (transcribed automatically using OpenAI’s Whisper) for the presence of Instructional Support. Then, these utterance-level judgments are aggregated over a 15-min observation session to estimate a global CLASS score. Experiments on two CLASS-coded datasets of toddler and pre-kindergarten classrooms indicate that (1) automatic CLASS Instructional Support estimation accuracy using the proposed method (Pearson R up to 0.48) approaches human inter-rater reliability (up to R = 0.55); (2) LLMs generally yield slightly greater accuracy than BoW for this task, though the best models often combined features extracted from both LLM and BoW; and (3) for classifying individual utterances, there is still room for improvement of automated methods compared to human-level judgments. Finally, (4) we illustrate how the model’s outputs can be visualized at the utterance level to provide teachers with explainable feedback on which utterances were most positively or negatively correlated with specific CLASS dimensions.
An Approach to Improve k-Anonymization Practices in Educational Data Mining
Frank Stinar | University of Illinois Urbana–Champaign | fstinar2@illinois.edu | |
Zihan Xiong | University of Pennsylvania | zihanx3@seas.upenn.edu | |
Nigel Bosch | University of Illinois Urbana–Champaign | pnb@illinois.edu |
Educational data mining has allowed for large improvements in educational outcomes and understanding of educational processes. However, there remains a constant tension between educational data mining advances and protecting student privacy while using educational datasets. Publicly available datasets have facilitated numerous research projects while striving to preserve student privacy via strict anonymization protocols (e.g., k-anonymity); however, little is known about the relationship between anonymization and utility of educational datasets for downstream educational data mining tasks, nor how anonymization processes might be improved for such tasks. We provide a framework for strictly anonymizing educational datasets with a focus on improving downstream performance in common tasks such as student outcome prediction. We evaluate our anonymization framework on five diverse educational datasets with machine learning-based downstream task examples to demonstrate both the effect of anonymization and our means to improve it. Our method improves downstream machine learning accuracy versus baseline data anonymization by 30.59%, on average, by guiding the anonymization process toward strategies that anonymize the least important information while leaving the most valuable information intact.
Exploring the Impact of Symbol Spacing and Problem Sequencing on Arithmetic Performance: An Educational Data Mining Approach
Avery Harrison Closser | Purdue University | aclosser@purdue.edu | |
Anthony F. Botelho | University of Florida | abotelho@coe.ufl.edu | |
Jenny Yun-Chen Chan | The Education University of Hong Kong | chanjyc@eduhk.hk |
Experimental research on perception and cognition has shown that inherent and manipulated visual features of mathematics problems impact individuals’ problem-solving behavior and performance. In a recent study, we manipulated the spacing between symbols in arithmetic expressions to examine its effect on 174 undergraduate students’ arithmetic performance but found results that were contradictory to most of the literature (Closser et al., 2023). Here, we applied educational data mining (EDM) methods to that dataset at the problem level to investigate whether inherent features of the 32 experimental problems (i.e., problem composition, problem order) may have caused unintended effects on students’ performance. We found that students were consistently faster to correctly simplify expressions with the higher-order operator on the left, rather than right, side of the expression. Furthermore, average response times varied based on the symbol spacing of the current and preceding problem, suggesting that problem sequencing matters. However, including or excluding problem identifiers in analyses changed the interpretation of results, suggesting that the effect of sequencing may be impacted by other, undefined problem-level factors. These results advance cognitive theories on perceptual learning and provide implications for educational researchers: online experiments designed to investigate students’ performance on mathematics problems should include a variety of problems, systematically examine the effects of problem order, and consider applying different data analysis approaches to detect effects of inherent problem features. Moreover, EDM methods can be a tool to identify nuanced effects on behavior and performance in the context of data from online platforms.
Effect of Gamification on Gamers: Evaluating Interventions for Students Who Game the System
Kirk Vanacore | Worcester Polytechnic Institute | kpvanacore@wpi.edu | |
Ashish Gurung | Carnegie Mellon University | agurung@andrew.cmu.edu | |
Adam Sales | Worcester Polytechnic Institute | asales@wpi.edu | |
Neil Heffernan | Worcester Polytechnic Institute | nth@wpi.edu |
Gaming the system is a persistent problem in Computer-Based Learning Platforms. While substantial progress has been made in identifying and understanding such behaviors, effective interventions remain scarce. This study explores the impact of two types of interventions – gamification and manipulation of assistance access – on the learning outcomes of students who tend to game the system using a method of causal moderation known as Fully Latent Principal Stratification. The results indicate that gamification does not consistently mitigate these negative behaviors. One gamified condition had a consistently positive effect on learning regardless of students’ propensity to game the system, whereas the other had a negative effect on such students. However, delaying access to hints and feedback may have a positive effect on the learning outcomes of those gaming the system. This paper also illustrates the potential of integrating detection and causal methodologies within education data mining for understanding how to respond to behaviors effectively after they are detected.
LearnSphere: A Learning Data and Analytics Cyberinfrastructure
John Stamper | Carnegie Mellon University | jstamper@cmu.edu | |
Philip I. Pavlik Jr. | University of Memphis | ppavlik@memphis.edu | |
Steven Moore | Carnegie Mellon University | stevenmo@andrew.cmu.edu | |
Kenneth Koedinger | Carnegie Mellon University | koedinger@cmu.edu |
LearnSphere is a web-based data infrastructure designed to transform scientific discovery and innovation in education. It supports learning researchers in addressing a broad range of issues including cognitive, social, and motivational factors in learning, educational content analysis, and educational technology innovation. LearnSphere integrates previously separate educational data and analytic resources developed by participating institutions. The web-based workflow authoring tool, Tigris, allows technical users to contribute sophisticated analytic methods, and learning researchers can adapt and apply those methods using graphical user interfaces, importantly, without additional programming. As part of our use-driven design of LearnSphere, we built a community through workshops and summer schools on educational data mining. Researchers interested in particular student levels or content domains can find student data from elementary through higher-education and across a wide variety of course content such as math, science, computing, and language learning. LearnSphere has facilitated many discoveries about learning, including the importance of active over passive learning activities and the positive association of quality discussion board posts with learning outcomes. LearnSphere also supports research reproducibility, replicability, traceability, and transparency as researchers can share their data and analytic methods along with links to research papers. We demonstrate the capabilities of LearnSphere through a series of case studies that illustrate how analytic components can be combined into research workflow combinations that can be developed and shared. We also show how open web-accessible analytics drive the creation of common formats to streamline repeated analytics and facilitate wider and more flexible dissemination of analytic tool kits.
Session-based Methods for Course Recommendation
Md Akib Zabed Khan | Florida International University | mkhan149@fiu.edu | |
Agoritsa Polyzou | Florida International University | apolyzou@fiu.edu |
In higher education, academic advising is crucial to students’ decision-making. Data-driven models can benefit students in making informed decisions by providing insightful recommendations for completing their degrees. To suggest courses for the upcoming semester, various course recommendation models have been proposed in the literature using different data mining techniques and machine learning algorithms utilizing different data types. One important aspect of the data is that usually, courses taken together in a semester fit well with each other. If there is no correlation between the co-taken courses, students may find it more difficult to handle the workload. Based on this insight, we propose using session-based approaches to recommend a set of well-suited courses for the upcoming semester. We test three session-based course recommendation models, two based on neural networks (CourseBEACON and CourseDREAM) and one on tensor factorization (TF-CoC). Additionally, we propose a postprocessing approach to adjust the recommendation scores of any base course recommender to promote related courses. Using metrics capturing different aspects of the recommendation quality, our experimental evaluation shows that session-based methods outperform existing popularity-based, association-based, similarity-based, factorization-based, neural networks-based, and Markov chain-based recommendation approaches. Effective course recommendations can result in improved student advising, which, in turn, can improve student performance, decrease dropout rates, and lead to a more positive overall student experience and satisfaction.
Best Paper AIED 2023 Presentation
Confusion, Conflict, Consensus: Modeling Dialogue Processes During Collaborative Learning with Hidden Markov Models
Toni V. Earle-Randell | University of Florida |
Joseph B. Wiggins | University of Florida |
Julianna Martinez Ruiz | University of Florida |
Mehmet Celepkolu | University of Florida |
Kristy Elizabeth Boyer | University of Florida |
Collin F. Lynch | North Carolina State University |
Maya Israel | University of Florida |
Eric Wiebe | North Carolina State University |
There is growing recognition that AI technologies can, and should, support collaborative learning. To provide this support, we need models of collaborative talk that reflect the ways in which learners interact. Great progress has been made in modeling dialogue for high school and college-age learners, but the dialogue processes that characterize collaborative talk between elementary learner dyads are not currently well understood. This paper reports on a study with elementary school learners (4th and 5th grade, ages 9–11 years old) who coded collaboratively in dyads. We recorded dialogue from 22 elementary school learner dyads, covering 7594 total utterances. We labeled this corpus manually with dialogue acts and then induced a hidden Markov model to identify the underlying dialogue states and the transitions between these states. The model identified six distinct hidden states which we interpret as Social Dialogue, Confusion, Frustrated Coordination, Exploratory Talk, Directive & Disagreement, and Disagreement & Self-Explanation. The HMM revealed that when students entered into a productive exploratory talk state, the primary way they transitioned out of this state was when they became confused or reached an impasse. When this occurred, the learners then moved into states of disputation and conflict before re-entering the Exploratory Talk state. These findings can inform the design of AI agents who support young learners’ collaborative talk and help agents determine when students are conflicting rather than collaborating.