Deep learning for sentence clustering in essay grading support
Abstract: Essays test student knowledge on a deeper level than short-answer and multiple-choice questions but are more laborious to evaluate. Automatic clustering of essays, or their fragments, prior to evaluation may reduce the manual effort required. Such clustering presents numerous challenges due to the variability and ambiguity of natural language. In this paper, we introduce two datasets of undergraduate student essays in Finnish, manually annotated for salient arguments on the sentence level. Using these datasets, we evaluate several deep-learning embedding methods for their suitability for sentence clustering in support of essay grading. We find that the choice of a suitable method depends on the nature of the exam question and the answers: deep-learning methods are capable of, but do not guarantee, better performance than simpler methods based on lexical overlap.