9th Educational Data Mining in Computer Science Education (CSEDM) Workshop
Bita Akram
North Carolina State University
bakram@ncsu.edu
Yang Shi
Utah State University
yang.shi@usu.edu
Peter Brusilovsky
University of Pittsburgh
peterb@pitt.edu
Thomas W. Price
North Carolina State University
twprice@ncsu.edu
Kenneth R. Koedinger
Carnegie Mellon University
koedinger@cmu.edu
Paulo Carvalho
Carnegie Mellon University
pcarvalh@cs.cmu.edu
Shan Zhang
University of Florida
zhangshan@ufl.edu
Andrew Lan
University of Massachusetts
andrewlan@cs.umass.edu
Juho Leinonen
Aalto University
juho.2.leinonen@aalto.fi

ABSTRACT

There is a growing community of researchers at the intersection of data mining, AI, and computing education research. The objective of the CSEDM workshop is to facilitate a discussion among this research community, with a focus on how data mining can be uniquely applied in computing education research. For example, what new techniques are needed to analyze program code and CS log data? How do results from CS education inform our analysis of this data? The workshop is meant to be an interdisciplinary event at the intersection of EDM and Computing Education Research. Researchers, faculty, and students are encouraged to share their AI- and data-driven approaches, methodologies, and experiences where data transforms how students learn Computer Science (CS) skills. This full-day workshop will feature paper presentations and discussions to promote collaboration.

Keywords

Computer Science Education, Educational Data Mining, AI in Education, Learning Analytics

1. WORKSHOP GOALS

Computing is an increasingly fundamental skill for students across disciplines. It enables them to solve complex, real, and challenging problems and make a positive impact on the world. Yet, the field of computing education is still facing a range of problems, from high failure and attrition rates to challenges in training and recruiting teachers to the under-representation of women and students of color.

Advanced learning technologies, which use data and AI to improve student learning outcomes, have the potential to address these problems. However, CS education presents domain-specific challenges for applying these techniques, such as helping students effectively use tools like compilers and debuggers and supporting complex, open-ended problems with many possible solutions. CS also offers unique opportunities for developing learning technologies, such as abundant and rich log data, including code traces that capture each detail of how students’ solutions evolved.

These domain-specific challenges and opportunities suggest the need for a specialized community of researchers working at the intersection of AI, data mining, and computing education research. The goal of this Educational Data Mining for Computer Science Education (CSEDM) workshop is to bring this community together to share insights for supporting and understanding learning in the domain of CS using data. This field is nascent but growing, with research in computing education increasingly using data analysis approaches and researchers in the EDM community increasingly studying CS datasets. This workshop will help these researchers learn from each other and develop the growing sub-field of CSEDM.

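The workshop will build on eight successful prior CSEDM workshops at: EDM 2018 [1], LAK 2019 [2], AIED 2019 [3], EDM 2020 [4], EDM 2021 [5], EDM 2022 [6], LAK 2023 [7], and EDM 2024 [8].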

Each of these workshops was productive and well attended, and our virtual events have had over 100 people registered and over 70 simultaneous attendees. The proceedings were published in CEUR [9] and Zenodo [10].

The CSEDM workshop is funded by the CS-SPLICE project [11]. The workshop will serve as a hub for researchers in the EDM community to discuss potential collaborations and identify EDM challenges in computing education. We plan to provide need-based funding support for participants, covering the cost of lodging on the workshop day and the registration fees.

2. RELEVANT TOPICS

The workshop encourages contributions on the following topics of interest:

We will invite researchers interested in exploring, contributing to, collaborating on, and developing data- and AI-driven techniques for building educational tools for Computer Science to submit papers on any of these topics.

3. WORKSHOP ORGANIZATION

The workshop will be organized by a team with a history of CSEDM research:

Bita Akram is an Assistant Professor with the Department of Computer Science at North Carolina State University. Her research lies at the intersection of artificial intelligence and advanced learning technologies, with applications to improving access to and the quality of CS education. She has been actively developing data-driven approaches for assessing students’ CS competencies as demonstrated through their interactions with educational programming activities. She has served as an organizer and program committee member for venues focused on educational data mining, including EDM and CSEDM.

Yang Shi is an Assistant Professor at Utah State University. He has been working towards building data-driven methods for representing program code to enhance the ability of Intelligent Tutoring Systems and benefit student modeling processes for computing education. With a focus on DM/ML approaches applied to CS education, his research interests also include Programming Language Processing, Software Analysis, and Deep Learning. He has served as a program committee (PC) member for conferences across multiple disciplines, including EDM, LAK, KDD, AAAI, EAAI, SIGCSE, NeurIPS, and ITiCSE.

Peter Brusilovsky is a Professor of Information Science and Intelligent Systems at the University of Pittsburgh, where he also directs the Personalized Adaptive Web Systems (PAWS) lab. He has been working in the field of adaptive educational systems, user modeling, and intelligent user interfaces for more than 30 years. He has published numerous papers and edited several books on adaptive hypermedia and the adaptive Web. He is a founder of CS-SPLICE and has advanced research and infrastructure for CSEDM.

Thomas Price is an Associate Professor of Computer Science at North Carolina State University. His primary research goal is to develop learning environments that automatically support students through AI and data-driven help features. His work has focused on the domain of computing education, where he has developed techniques for automatically generating programming hints and feedback for students in real time by leveraging student data. He has helped organize a number of efforts at the intersection of AIED, Data Mining, and CS Education, including the CS-SPLICE working group on programming snapshot representation and prior CSEDM and CS-SPLICE workshops.

Juho Leinonen is an Academy Research Fellow at Aalto University. His research focuses on creating better insight into students’ learning with fine-grained learning analytics; using educational technology and artificial intelligence for personalizing course content; and using learnersourcing to create ample learning opportunities for distinct student needs. He has served on the program committees of conferences focused on computing education and on educational data mining.

Ken Koedinger is the Hillman Professor of Computer Science with appointments in Human-Computer Interaction and Psychology at Carnegie Mellon University. He focuses on understanding human learning processes and designing educational technologies to enhance student achievement. Dr. Koedinger has authored over 350 peer-reviewed publications and led over 45 funded research projects. He co-founded Carnegie Learning, and his Cognitive Tutor technology, used in thousands of schools, has significantly increased student learning outcomes. He directs multiple infrastructure and educational projects, including LearnLab.org, DataShop.org, LearnSphere.org, and tutors.plus.

Paulo Carvalho is an Assistant Professor in the Human-Computer Interaction Institute at Carnegie Mellon University. His research explores how AI can revolutionize learning by creating engaging, practice-first environments. He uses data analytics and computational modeling to understand student learning, motivation, and meta-cognition and to develop precise models for better learning experiences. He is currently investigating how generative AI can power these practice-focused approaches, boosting engagement and freeing teachers to provide personalized support.

Shan Zhang is a PhD student in the educational technology program at the University of Florida. Before that, she earned her Ed.M. degree from Harvard University. Her research focuses on multimodal learning analytics, educational data mining, AI in education, and AI education. Shan’s recent work explores integrating AI into K-12 education, applying multimodal learning analytics and natural language processing (NLP) techniques to analyze collaborative learning features and affect in computer science and math learning environments, and developing learner models.

Andrew (Shiting) Lan is an Assistant Professor in the Manning College of Information and Computer Sciences, University of Massachusetts Amherst. Before that, he was a postdoctoral research associate in the EDGE Lab at the Department of Electrical Engineering, Princeton University, and received his M.S. and Ph.D. degrees in Electrical and Computer Engineering in May 2014 and May 2016, respectively, from the Digital Signal Processing (DSP) group at Rice University. His research focuses on the development of artificial intelligence (AI) and especially natural language processing (NLP) methods to enable scalable and effective personalized learning in education, covering areas such as learner modeling, personalization, content generation, and human-in-the-loop AI.

3.1 Program Committee

The 9th CSEDM Workshop’s program committee will draw from members of prior program committees, including:

4. CALL FOR PARTICIPATION

We will solicit three types of research contributions:

8-page Research Papers: Original, unpublished work, addressing any of the topics of interest above.

6-page Position Papers or Work-in-progress Papers:

2-page Descriptions of CS Tools/Datasets/Infrastructure: Researchers will present their work at CSEDM in a conversational format. Presentations might include:

4.1 Timeline

The CFP will be released as soon as the workshop is accepted. An approximate timeline is as follows:

5. WORKSHOP ACTIVITIES

The workshop will run for a full day. It will primarily consist of paper presentations and discussions to facilitate collaboration. Interactive sessions will include multiple short, parallel presentations, where participants can move freely among the presentations that interest them, similar to a poster session.

A tentative schedule is as follows:

6. SOLICITATION PLAN

Building on our growing network of contributors to prior workshops, we intend to solicit participation in the workshop through the following mailing lists and research networks:

We will also reach out to prior contributors to CSEDM Workshops to solicit additional submissions.

[1] http://sites.google.com/asu.edu/csedm-ws-edm-2018/

[2] http://sites.google.com/asu.edu/csedm-ws-lak-2019/

[3] http://sites.google.com/asu.edu/csedm-ws-aied-2019/

[4] http://sites.google.com/ncsu.edu/csedm-ws-edm-2020/

[5] http://sites.google.com/ncsu.edu/csedm-workshop-edm21/

[6] http://sites.google.com/ncsu.edu/csedm-workshop-edm22/

[7] http://sites.google.com/ncsu.edu/csedm-workshop-lak23/

[8] https://sites.google.com/view/csedm-workshop-edm24/

[9] Proceedings: 2021, 2024

[10] Proceedings: 2022, 2023

[11] http://cssplice.github.io/


© 2025 Copyright is held by the author(s). This work is distributed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) license.