Learning student program embeddings using abstract execution traces
Abstract: Improving the pedagogical effectiveness of programming training platforms is an active research topic, and it requires fine-grained, exploitable representations of learners' programs. This article presents a new approach for learning program embeddings. Starting from the hypothesis that both the functionality of a program and its "style" can be captured by analyzing its execution traces, the code2aes2vec method proceeds in two steps. The first step generates abstract execution sequences (AESs) from predefined test cases and the abstract syntax trees (ASTs) of the submitted programs. The second step uses the doc2vec method to learn condensed vector representations (embeddings) of the programs from these AESs. Experiments performed on real data sets show that the embeddings generated by code2aes2vec effectively capture both the semantics and the style of the programs. Finally, as a proof of concept, we show the relevance of these program embeddings on the task of automatic feedback propagation.
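
To make the first step more concrete, the sketch below illustrates one possible way to turn a program and a test case into a sequence of abstract tokens. It is a deliberately simplified illustration and not the AES format defined by code2aes2vec: it assumes that a Python submission is run on a single test input and that each executed line is abstracted to the name of the AST statement type that starts on it. The function name `abstract_execution_sequence`, the sample program `STUDENT_SOURCE`, and the placeholder file name `<student>` are hypothetical and introduced here only for illustration.

```python
import ast
import sys

# Hypothetical student submission, used only for illustration.
STUDENT_SOURCE = """
def total(n):
    s = 0
    for i in range(n):
        s += i
    return s
"""


def abstract_execution_sequence(source, func_name, args):
    """Run func_name(*args) and return the AST statement types of the
    executed lines, in execution order (a simplified abstract trace)."""
    tree = ast.parse(source)

    # Map each line number to the type name of the statement starting there.
    # Function definitions are skipped: only executed body statements matter.
    line_to_token = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.stmt) and not isinstance(node, ast.FunctionDef):
            line_to_token.setdefault(node.lineno, type(node).__name__)

    # Define the student's functions before tracing starts.
    namespace = {}
    exec(compile(tree, "<student>", "exec"), namespace)

    tokens = []

    def tracer(frame, event, arg):
        # Ignore frames that do not belong to the student's code.
        if frame.f_code.co_filename != "<student>":
            return None
        if event == "line":
            token = line_to_token.get(frame.f_lineno)
            if token is not None:
                tokens.append(token)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        namespace[func_name](*args)
    finally:
        # Restore whatever trace function was active before.
        sys.settrace(previous)
    return tokens


# One test case yields one token sequence for the submission.
aes = abstract_execution_sequence(STUDENT_SOURCE, "total", (2,))
print(aes)
```

Under these assumptions, the token sequences obtained over all test cases could be concatenated into one "document" per program and passed to an off-the-shelf doc2vec implementation for the second step. Two programs that compute the same result with different control flow would yield different sequences, which is the intuition behind capturing style as well as functionality.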