TY - GEN
T1 - Neural mixed counting models for dispersed topic discovery
AU - Wu, Jiemin
AU - Rao, Yanghui
AU - Zhang, Zusheng
AU - Xie, Haoran
AU - Li, Qing
AU - Wang, Fu Lee
AU - Chen, Ziye
N1 - Publisher Copyright:
© 2020 Association for Computational Linguistics
PY - 2020
Y1 - 2020
AB - Mixed counting models that use the negative binomial distribution as the prior can effectively model over-dispersed and hierarchically dependent random variables; thus, they have attracted much attention in mining dispersed document topics. However, existing parameter inference methods, such as Monte Carlo sampling, are quite time-consuming. In this paper, we propose two efficient neural mixed counting models, i.e., the Negative Binomial-Neural Topic Model (NB-NTM) and the Gamma Negative Binomial-Neural Topic Model (GNB-NTM), for dispersed topic discovery. Neural variational inference algorithms are developed to infer model parameters by using the reparameterization of the Gamma distribution and the Gaussian approximation of the Poisson distribution. Experiments on real-world datasets indicate that our models outperform state-of-the-art baselines in terms of perplexity and topic coherence. The results also validate that both NB-NTM and GNB-NTM can produce explainable intermediate variables by generating dispersed proportions of document topics.
UR - http://www.scopus.com/inward/record.url?scp=85099540680&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85099540680
T3 - Proceedings of the Annual Meeting of the Association for Computational Linguistics
SP - 6159
EP - 6169
BT - ACL 2020 - 58th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference
T2 - 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020
Y2 - 5 July 2020 through 10 July 2020
ER -