ALL-IN-ONE: Multi-Task Learning BERT models for Evaluating Peer Assessments
Abstract: Peer assessment has been widely applied across diverse academic fields over the last few decades and has demonstrated its effectiveness. However, the advantages of peer assessment can only be achieved with high-quality peer reviews. Previous studies have found that high-quality reviews usually comprise several features (e.g., they contain suggestions, mention problems, and use a positive tone). Researchers have therefore attempted to evaluate peer reviews by detecting different features using various machine learning and deep learning models. However, no study has investigated using a multi-task learning (MTL) model to detect multiple features simultaneously. This paper presents two MTL models for evaluating peer reviews that leverage the state-of-the-art pre-trained language representation models BERT and DistilBERT. Our results demonstrate that BERT-based models significantly outperform previous GloVe-based methods, by around 6% in F1-score, on tasks of detecting a single feature, and that MTL further improves performance while reducing model size.