Title: Test of Time Award: Compassionate, Data-Driven Tutors for Problem Solving and Persistence

Professor Tiffany Barnes

Abstract: Determining how, when, and whether to provide personalized support is a well-known challenge called the assistance dilemma. A core problem in solving the assistance dilemma is the need to discover when students are unproductive so that the tutor can intervene. This is particularly challenging for open-ended domains, even those that are well-structured with defined principles and goals. In this talk, I will present a set of data-driven methods to classify, predict, and prevent unproductive problem-solving steps in the well-structured, open-ended domains of logic and programming. Our approaches leverage and extend my work on the Hint Factory, a set of methods for building data-driven intelligent tutor supports using prior student solution attempts. In logic, we devised a HelpNeed classification model that uses prior student data to determine when students are likely to be unproductive and need help learning optimal problem-solving strategies. In a controlled study, we found that students who received proactive assistance on logic problems when HelpNeed was predicted were less likely to avoid hints during training, and produced significantly shorter, more optimal posttest solutions in less time. In a similar vein, we have devised a new data-driven method that uses student trace logs to identify struggling moments during a programming assignment and determine the appropriate time for an intervention. We validated our algorithm’s classification of struggling and progressing moments against expert ratings of whether an intervention was needed for a sample of 20% of the dataset. The results show that our automatic struggle detection method can detect struggling students with 77% accuracy using less than 2 minutes of work. We further evaluated a sample of 86 struggling moments, finding 6 reasons that human tutors gave for intervention, ranging from missing key components to needing confirmation and next steps. This research provides insight into when and why to intervene during programming. Finally, we explore the range of supports data-driven tutors can provide, from progress tracking to worked examples and encouraging messages, and their importance for compassionately promoting persistence in problem solving.
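
To give a concrete sense of trace-log-based struggle detection, here is a minimal, purely hypothetical sketch: a classifier trained on a handful of features summarizing a student's first two minutes of work. The feature names, labels, and model choice are assumptions for illustration, not the method presented in the talk.

```python
# Hypothetical sketch only: feature names, labels, and the model choice are
# illustrative assumptions, not the struggle-detection method from the talk.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Each row summarizes roughly the first 2 minutes of one student's trace log.
# Hypothetical features: edits/minute, compile errors, passed tests,
# seconds idle, fraction of code deleted.
X = np.array([
    [12, 0, 3, 10, 0.05],   # steady progress
    [ 2, 4, 0, 70, 0.40],   # likely struggling
    [ 9, 1, 2, 20, 0.10],   # progressing
    [ 1, 5, 0, 95, 0.55],   # struggling
])
y = np.array([0, 1, 0, 1])  # 1 = expert-labeled "struggling moment"

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print(clf.predict([[3, 3, 0, 60, 0.35]]))  # flag a new moment for possible intervention
```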

Bio: Dr. Tiffany Barnes is a Distinguished Professor of Computer Science at North Carolina State University and a Distinguished Member of the Association for Computing Machinery (ACM). Prof. Barnes is Founding Co-Director of the STARS Computing Corps, a Broadening Participation in Computing Alliance funded by the U.S. National Science Foundation. Her internationally recognized research program focuses on transforming education with AI-driven learning games and technologies, and on equity and broadening participation. Her current research ranges from investigations of intelligent tutoring systems and teacher professional development to foundational work on educational data mining, computational models of interactive problem-solving, and design of computational thinking curricula. Her personalized learning technologies and broadening participation programs have impacted thousands of K-20 students throughout the United States.


Title: Deep down, everyone wants to be causal

Professor Jennifer Hill

Abstract: Most researchers in the social, behavioral, and health sciences are taught to be extremely cautious in making causal claims. However, causal inference is a necessary goal in research for addressing many of the most pressing questions around policy and practice. In the past decade, causal methodologists have increasingly been using and touting the benefits of more complicated machine learning algorithms to estimate causal effects. These methods can take some of the guesswork out of analyses, decrease the opportunity for “p-hacking,” and may be better suited for more fine-tuned tasks such as identifying varying treatment effects and generalizing results from one population to another. However, should these more advanced methods change our fundamental views about how difficult it is to infer causality? In this talk, I will discuss some potential advantages and disadvantages of using machine learning for causal inference and emphasize ways that we can all be more transparent in our inferences and honest about their limitations.
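
As a rough illustration of how machine learning can be used to estimate (possibly varying) treatment effects, the sketch below fits separate outcome models to treated and control units on simulated data. This generic "T-learner" setup is an assumption for illustration, not the Bayesian nonparametric approach discussed in the talk, and it still rests on the usual causal assumptions (all confounders measured, overlap).

```python
# Illustrative sketch on simulated data: a generic T-learner, not the
# BART-style Bayesian nonparametric methods discussed in the talk.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=(n, 3))                       # observed confounders
t = rng.binomial(1, 1 / (1 + np.exp(-x[:, 0])))   # treatment depends only on observed x
y = x[:, 0] + 2 * t * (1 + 0.5 * x[:, 1]) + rng.normal(size=n)  # heterogeneous effect

# Fit flexible outcome models separately for treated and control units.
m1 = GradientBoostingRegressor().fit(x[t == 1], y[t == 1])
m0 = GradientBoostingRegressor().fit(x[t == 0], y[t == 0])

cate = m1.predict(x) - m0.predict(x)   # unit-level (heterogeneous) effect estimates
print("Estimated average treatment effect:", round(cate.mean(), 2))  # true value is 2
```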

Bio: Jennifer Hill develops and evaluates methods to help answer the types of causal questions that are vital to policy research and scientific development. In particular, she focuses on situations where it is difficult or impossible to perform traditional randomized experiments, or when even seemingly pristine study designs are complicated by missing data or hierarchically structured data. In the past 10 years, Hill has been pursuing two intersecting strands of research. The first focuses on Bayesian nonparametric methods that allow for flexible estimation of causal models and are less time-consuming and more precise than competing methods (e.g., propensity score approaches). The second line of work pursues strategies for exploring the impact of violations of typical causal inference assumptions such as ignorability (all confounders measured) and common support (overlap). Hill has published in a variety of leading journals including Journal of the American Statistical Association, Statistical Science, American Political Science Review, American Journal of Public Health, and Developmental Psychology. Hill earned her PhD in Statistics at Harvard University in 2000 and completed a postdoctoral fellowship in Child and Family Policy at Columbia University’s School of Social Work in 2002.

Hill currently serves as the Director of the Center for Practice and Research at the Intersection of Information, Society, and Methodology (PRIISM) and Co-Director of the Master of Science Program in Applied Statistics for Social Science Research (A3SR) at New York University. Her Wordle average is competitive, but not all she’d like it to be.


Title: Beyond Algorithmic Fairness in Education: Equitable and Inclusive Decision-Support Systems

Assistant Professor René Kizilcec

Abstract: Advancing equity and inclusion in schools and universities has long been a priority in education research. While data-driven predictive models could help address social injustices in education, many studies from other domains suggest that, without added precautions, these models tend to exacerbate existing inequities. A growing body of research from the educational data mining and neighboring communities is beginning to map out where biases are likely to occur, what contributes to them, and how to mitigate them. These efforts to advance algorithmic fairness are an important research direction, but it is critical to also consider how AI systems are used in educational contexts to support decisions and judgments. In this talk, I will survey research on algorithmic fairness and explore the role of human factors in AI systems and their implications for advancing equity and inclusion in education.
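
To give a concrete sense of one common kind of fairness audit, the sketch below compares a model's false negative rates (students who needed support but were not flagged) across two groups. The data, labels, and group names are made up for illustration and are not drawn from the studies surveyed in the talk.

```python
# Illustrative fairness audit with made-up data: compare false negative rates
# of a hypothetical support-prediction model across two student groups.
import numpy as np

y_true = np.array([1, 1, 0, 1, 0, 1, 1, 0, 1, 0])   # 1 = student needed support
y_pred = np.array([1, 1, 0, 1, 0, 0, 0, 0, 1, 0])   # model's predictions
group  = np.array(["A"] * 5 + ["B"] * 5)             # hypothetical group labels

def false_negative_rate(truth, pred):
    """Share of students who needed support but were not flagged."""
    pos = truth == 1
    return float(np.mean(pred[pos] == 0)) if pos.any() else float("nan")

for g in ("A", "B"):
    mask = group == g
    print(g, round(false_negative_rate(y_true[mask], y_pred[mask]), 2))
# A large gap between groups signals that added precautions are needed.
```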

Bio: René Kizilcec is an Assistant Professor of Information Science at Cornell University, where he directs the Future of Learning Lab. He studies the use and impact of technology in formal and informal learning environments (including college classes, online degree programs, mobile learning, professional development, MOOCs, and middle/high school classrooms) with behavioral, psychological, and computational science methods. His work on algorithmic fairness in education studies the implications of predictive models for equity, and his research on scalable interventions to broaden participation and reduce achievement gaps has appeared in Science and PNAS. He served as general conference chair (2022) and program chair (2020) for Learning at Scale. Kizilcec received a BA in Philosophy and Economics from University College London, and an MSc in Statistics and PhD in Communication from Stanford University.


Title: ‘No data about me without me’?: Including Learners and Teachers in Educational Data Mining

Professor Judy Robertson

Abstract: The conference theme this year emphasises the broadening of participation and inclusion in educational data mining; in this talk, I will discuss methodologies for including learners and teachers throughout the research process. This involves not only preventing harm to young learners that might result from insufficient care when processing their data, but also embracing their participation in the design and evaluation of educational data mining technologies. I will argue that even young learners can and should be included in the analysis and interpretation of data which affects them. I will give examples from a project in which children take the role of data activists, using classroom sensor data to explore their readiness to learn.

Bio: Judy Robertson is Professor of Digital Learning at the University of Edinburgh. She holds a BSc in Computer Science and Artificial Intelligence and a PhD in educational technology from the University of Edinburgh. An important theme throughout her work has been including children and young people in the design of technology and consulting them about socio-technical issues which affect them. Her very first research project at undergraduate level involved the development of an ITS called BetterBlether to help children develop discussion skills. A few years later she worked with a design team of six ten-year-olds to develop an ITS for story writing which featured a spatula-wielding animated squid. Since those halcyon days, she has continued using child-centred design methodologies to develop technology with and for children in education and healthcare domains. She is currently working on approaches to teacher education in computational thinking and data literacy and is the academic director of the Data Education in Schools project (https://dataschools.education/).