TY - JOUR
T1 - Context reinforced neural topic modeling over short texts
AU - Feng, Jiachun
AU - Zhang, Zusheng
AU - Ding, Cheng
AU - Rao, Yanghui
AU - Xie, Haoran
AU - Wang, Fu Lee
N1 - Publisher Copyright: © 2022 Elsevier Inc.
PY - 2022/8
Y1 - 2022/8
N2 - As one of the prevalent topic mining methods, neural topic modeling has attracted considerable interest owing to its low training cost and strong generalisation ability. However, existing neural topic models may suffer from feature sparsity when applied to short texts, because each message provides little context. To alleviate this issue, we propose a Context Reinforced Neural Topic Model (CRNTM), whose characteristics can be summarized as follows. First, by assuming that each short text covers only a few salient topics, the proposed CRNTM infers the topic for each word within a narrow range. Second, our model exploits pre-trained word embeddings by treating topics as multivariate Gaussian distributions or Gaussian mixture distributions in the embedding space. Extensive experiments on two benchmark short-text corpora validate the effectiveness of the proposed model on both topic discovery and text classification.
AB - As one of the prevalent topic mining methods, neural topic modeling has attracted considerable interest owing to its low training cost and strong generalisation ability. However, existing neural topic models may suffer from feature sparsity when applied to short texts, because each message provides little context. To alleviate this issue, we propose a Context Reinforced Neural Topic Model (CRNTM), whose characteristics can be summarized as follows. First, by assuming that each short text covers only a few salient topics, the proposed CRNTM infers the topic for each word within a narrow range. Second, our model exploits pre-trained word embeddings by treating topics as multivariate Gaussian distributions or Gaussian mixture distributions in the embedding space. Extensive experiments on two benchmark short-text corpora validate the effectiveness of the proposed model on both topic discovery and text classification.
KW - Context reinforcement
KW - Neural topic model
KW - Short texts
UR - http://www.scopus.com/inward/record.url?scp=85131463860&partnerID=8YFLogxK
U2 - 10.1016/j.ins.2022.05.098
DO - 10.1016/j.ins.2022.05.098
M3 - Article
AN - SCOPUS:85131463860
SN - 0020-0255
VL - 607
SP - 79
EP - 91
JO - Information Sciences
JF - Information Sciences
ER -