TY - JOUR
T1 - Improving text classification via a soft dynamical label strategy
AU - Wang, Jingjing
AU - Xie, Haoran
AU - Wang, Fu Lee
AU - Lee, Lap Kei
N1 - Publisher Copyright:
© 2023, The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature.
PY - 2023/7
Y1 - 2023/7
N2 - Labels play a central role in text classification tasks. However, most studies suffer from a lossy label encoding problem, in which each label is represented by a meaningless and independent one-hot vector. This paper proposes a novel strategy to dynamically generate a soft pseudo label based on the prediction for each training sample. This history-based soft pseudo label is taken as the target, and parameters are optimized by minimizing the distance between the target and the prediction. In addition, we augment the training data with Mix-up, a widely used method, to prevent overfitting on small datasets. Extensive experimental results demonstrate that the proposed dynamical soft label strategy significantly improves the performance of several widely used deep learning classification models on binary and multi-class text classification tasks. Not only is our simple and efficient strategy much easier to implement and train, it also exhibits substantial improvements (up to a 2.54% relative improvement on the FDCNews dataset with an LSTM encoder) over Label Confusion Learning (LCM), a state-of-the-art label smoothing model, under the same experimental setting. The experimental results also demonstrate that Mix-up improves our method's performance on smaller datasets but introduces excess noise on larger datasets, which diminishes the model's performance.
AB - Labels play a central role in text classification tasks. However, most studies suffer from a lossy label encoding problem, in which each label is represented by a meaningless and independent one-hot vector. This paper proposes a novel strategy to dynamically generate a soft pseudo label based on the prediction for each training sample. This history-based soft pseudo label is taken as the target, and parameters are optimized by minimizing the distance between the target and the prediction. In addition, we augment the training data with Mix-up, a widely used method, to prevent overfitting on small datasets. Extensive experimental results demonstrate that the proposed dynamical soft label strategy significantly improves the performance of several widely used deep learning classification models on binary and multi-class text classification tasks. Not only is our simple and efficient strategy much easier to implement and train, it also exhibits substantial improvements (up to a 2.54% relative improvement on the FDCNews dataset with an LSTM encoder) over Label Confusion Learning (LCM), a state-of-the-art label smoothing model, under the same experimental setting. The experimental results also demonstrate that Mix-up improves our method's performance on smaller datasets but introduces excess noise on larger datasets, which diminishes the model's performance.
KW - Label distribution learning
KW - Mix-up
KW - Text classification
UR - http://www.scopus.com/inward/record.url?scp=85146622970&partnerID=8YFLogxK
U2 - 10.1007/s13042-022-01770-w
DO - 10.1007/s13042-022-01770-w
M3 - Article
AN - SCOPUS:85146622970
SN - 1868-8071
VL - 14
SP - 2395
EP - 2405
JO - International Journal of Machine Learning and Cybernetics
JF - International Journal of Machine Learning and Cybernetics
IS - 7
ER -