Automatic classification of Chinese programming MOOC reviews using fine-tuned BERTs and GPT-augmented data

Xieling Chen, Haoran Xie, Di Zou, Lingling Xu, Fu Lee Wang

Research output: Contribution to journal › Article › peer-review

Abstract

In massive open online course (MOOC) environments, computer-based analysis of course reviews enables instructors and course designers to develop intervention strategies and improve instruction in support of learning. This study aimed to automatically and effectively identify the topics that concern learners in their written reviews. First, we examined the distribution of topics in 13,660 reviews related to a Chinese programming MOOC and identified “instructional skills,” “perceived course value,” “instructor characteristics,” and “perceived course difficulty” as learners’ primary concerns. Second, we proposed a GPTaug-BERT model that integrates fine-tuned bidirectional encoder representations from Transformers (BERT) models with augmented data generated using generative pre-trained Transformers (GPT), and applied it to classify these topics automatically. Results showed that, compared with machine learning and other deep learning architectures, the GPTaug-BERT model improved the F1 scores of the MOOC review topic recognition task by 7%. Third, we compared the GPTaug-BERT model with the BERT-Chinese model in distinguishing between topics, showing that the GPTaug-BERT model performed better, with an accuracy above 67% across all categories, even for “online programming tools,” “feedback and problem-solving,” and “course structure,” which were largely misclassified by the BERT-Chinese model. The findings offer insights into the effectiveness of combining fine-tuned BERT models with GPT-augmented data for accurate topic identification from MOOC reviews.
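
The abstract reports F1 scores and per-category accuracy for a multilabel topic classifier but does not say how they were averaged. As a minimal sketch, assuming micro-averaged F1 and a per-topic accuracy computed over all reviews, the two measures could be calculated as follows. The topic names come from the abstract; the function names (`micro_f1`, `per_topic_accuracy`) and the three toy reviews are hypothetical and do not come from the study's data or code.

```python
from typing import Dict, List, Set

# Topic names are taken from the abstract; the labels below are invented toy data.
TOPICS = ["instructional skills", "perceived course value", "course structure"]


def micro_f1(gold: List[Set[str]], pred: List[Set[str]]) -> float:
    """Micro-averaged F1 over all (review, topic) decisions (assumed averaging scheme)."""
    tp = sum(len(g & p) for g, p in zip(gold, pred))  # topics correctly predicted
    fp = sum(len(p - g) for g, p in zip(gold, pred))  # predicted but not annotated
    fn = sum(len(g - p) for g, p in zip(gold, pred))  # annotated but missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def per_topic_accuracy(
    gold: List[Set[str]], pred: List[Set[str]], topics: List[str]
) -> Dict[str, float]:
    """Share of reviews where presence/absence of each topic is predicted correctly."""
    n = len(gold)
    if n == 0:
        return {t: 0.0 for t in topics}
    return {
        t: sum((t in g) == (t in p) for g, p in zip(gold, pred)) / n
        for t in topics
    }


# Hypothetical annotations and predictions for three reviews.
gold = [
    {"instructional skills"},
    {"perceived course value", "course structure"},
    {"course structure"},
]
pred = [
    {"instructional skills"},
    {"perceived course value"},
    {"course structure", "instructional skills"},
]

score = micro_f1(gold, pred)
accuracy = per_topic_accuracy(gold, pred, TOPICS)
print(f"micro-F1: {score:.2f}")
for topic, acc in accuracy.items():
    print(f"{topic}: {acc:.2f}")
```

Because a review may carry several topics at once, each review is represented as a set of labels rather than a single class. If the study used a different averaging scheme (for example, macro-averaged F1), the formulas would change accordingly.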

Original language: English
Pages (from-to): 230-249
Number of pages: 20
Journal: Educational Technology and Society
Volume: 28
Issue number: 1
Publication status: Published - 2025

Keywords

  • BERT
  • Data augmentation
  • GPT
  • Massive open online courses
  • Multilabel classification
