Learning student program embeddings using abstract execution traces

Guillaume Cleuziou; Frédéric Flouvat

Jun 30, 2021 19:50 UTC+2, Session C1

Keywords: Representation Learning, Program Embeddings, Feedback propagation, Neural Networks, Educational Data Mining, Computer Science Education, doc2vec

Abstract: Improving the pedagogical effectiveness of programming training platforms is an active research topic that requires fine-grained, exploitable representations of learners' programs. This article presents a new approach for learning program embeddings. Starting from the hypothesis that both the functionality of a program and its "style" can be captured by analyzing its execution traces, the code2aes2vec method proceeds in two steps. The first step generates abstract execution sequences (AES) from predefined test cases and the abstract syntax trees (AST) of the submitted programs. The doc2vec method is then used to learn condensed vector representations (embeddings) of the programs from these AESs. Experiments performed on real data sets show that the embeddings generated by code2aes2vec efficiently capture both the semantics and the style of the programs. Finally, as a proof of concept, we show the relevance of the resulting program embeddings on the task of automatic feedback propagation.
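To make the first step more concrete, here is a minimal sketch of how an abstract execution sequence might be produced for a Python submission. It is an illustration under stated assumptions, not the authors' implementation: the token vocabulary (AST statement type names), the function name `abstract_execution_sequence`, and the sample `STUDENT` program are all hypothetical, and the abstract does not specify the actual AES format.

```python
import ast
import sys

# Hypothetical student submission used only for illustration.
STUDENT = """
def absolute(x):
    if x < 0:
        x = -x
    return x
"""


def abstract_execution_sequence(source, func_name, args):
    """Run func_name(*args) from `source` on one test case and return
    (tokens, result), where tokens lists the AST statement type of each
    executed line. The token format is an assumption of this sketch."""
    tree = ast.parse(source)

    # Map each line number to the type name of a statement starting there.
    # Function definitions are skipped so that only body statements count.
    line_to_token = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.stmt) and not isinstance(node, ast.FunctionDef):
            line_to_token.setdefault(node.lineno, type(node).__name__)

    # Define the student's functions in an isolated namespace.
    namespace = {}
    exec(compile(tree, "<student>", "exec"), namespace)

    tokens = []

    def tracer(frame, event, arg):
        # Ignore frames that do not belong to the student's code.
        if frame.f_code.co_filename != "<student>":
            return None
        if event == "line":
            token = line_to_token.get(frame.f_lineno)
            if token is not None:
                tokens.append(token)
        return tracer

    sys.settrace(tracer)
    try:
        result = namespace[func_name](*args)
    finally:
        sys.settrace(None)  # always restore normal execution
    return tokens, result


if __name__ == "__main__":
    for test_input in [(-3,), (3,)]:
        print(test_input, abstract_execution_sequence(STUDENT, "absolute", test_input))
```

In this sketch the two test cases yield different token sequences because only the negative input executes the assignment. For the second step, the token lists obtained over all test cases could be concatenated per program and passed as "documents" to a doc2vec implementation; how the authors configure that step is not stated in the abstract.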
