Which Hammer Should I Use? A Systematic Evaluation of Approaches for Classifying Educational Forum Posts
Lele Sha, Mladen Rakovic, Alexander Whitelock-Wainwright, David Carroll, Dragan Gasevic, Guanliang Chen
Jul 01, 2021 14:30 UTC+2
—
Session D2
—
Keywords: Educational Forum Posts, Text Classification, Deep Neural Network, Pre-trained Language Models
Abstract:
Classifying educational forum posts is a longstanding task in Learning Analytics and Educational Data Mining research. Though this task has been tackled with both traditional Machine Learning (ML) approaches (e.g., Logistic Regression and Random Forest) and up-to-date Deep Learning (DL) approaches, a systematic examination of the performance difference between these two types of approaches is lacking. To better guide researchers and practitioners in selecting the model that best suits their needs, this study systematically compared the effectiveness of these two types of approaches for this specific task. Specifically, we selected a total of six representative models and explored their capabilities by equipping them with either extensive input features that were widely used in previous studies (traditional ML models) or the state-of-the-art pre-trained language model BERT (DL models). Through extensive experiments on two real-world datasets (one of which is open-sourced), we demonstrated that: (i) DL models uniformly achieved better classification results than traditional ML models, with the performance difference ranging from 1.85% to 5.32% depending on the evaluation metric; (ii) when applying traditional ML models, different features should be explored and engineered to tackle different classification tasks; (iii) when applying DL models, adapting BERT to the specific classification task by fine-tuning its model parameters tends to be a promising approach. We have publicly released our code at https://github.com/...Anonymous
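
The abstract reports the gap between DL and traditional ML models as a range across evaluation metrics. A minimal sketch of how such per-metric gaps could be computed is given below. It is not the authors' evaluation code: the abstract does not name the metrics, so accuracy and macro-averaged F1 are assumptions, and the labels ("urgent", "other") and both prediction lists are invented toy values used only to exercise the functions.

```python
# Illustrative sketch only. The metrics (accuracy, macro-F1), the label names,
# and all predictions below are assumptions; none of them come from the paper.

def accuracy(y_true, y_pred):
    """Fraction of posts whose predicted label matches the gold label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_for_label(y_true, y_pred, label):
    """F1 score when `label` is treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-label F1 scores over the gold labels."""
    labels = sorted(set(y_true))
    return sum(f1_for_label(y_true, y_pred, lab) for lab in labels) / len(labels)


def metric_gaps(y_true, pred_a, pred_b):
    """Return {metric: score_b - score_a}, expressed in percentage points."""
    metrics = {"accuracy": accuracy, "macro_f1": macro_f1}
    return {
        name: 100 * (fn(y_true, pred_b) - fn(y_true, pred_a))
        for name, fn in metrics.items()
    }


# Toy gold labels and predictions (hypothetical, for illustration only).
gold = ["urgent", "urgent", "other", "other", "urgent", "other", "other", "urgent"]
ml_pred = ["urgent", "other", "other", "other", "urgent", "urgent", "other", "other"]
dl_pred = ["urgent", "urgent", "other", "other", "urgent", "urgent", "other", "urgent"]

gaps = metric_gaps(gold, ml_pred, dl_pred)
for name, gap in gaps.items():
    print(f"{name}: {gap:+.2f} percentage points")
```

Reporting the smallest and largest values in `gaps` would give a range of the same form as the one quoted in the abstract, though the numbers produced by this toy data bear no relation to the paper's results.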