DiVATM: Unsupervised Neural Topic Modeling using Disentangled Variational Autoencoders

  • Sudhakar Kumar
  • Sunil K. Singh
  • Saket Sarin
  • Arun Dubey
  • Mukesh Kumar
  • Kwok Tai Chui
  • Brij B. Gupta

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Topic modeling has been pivotal in NLP for extracting semantic structures from text corpora. Traditional methods like LDA often struggle with coherence and diversity. We propose DiVATM (Disentangled Variational Autoencoder for Topic Modeling), a neural architecture using VAEs with disentangled latent representations. The DiVATM encoder-decoder framework captures the semantic structure and reconstructs documents from disentangled variables using β-TCVAE, improving interpretability and coherence. Extensive experiments on 20 Newsgroups and Reuters-21578 show that DiVATM outperforms state-of-the-art models in perplexity and topic coherence. DiVATM achieves a perplexity of 150.2 on 20 Newsgroups and 85.7 on Reuters-21578, with coherence scores of 0.75 and 0.82, respectively. Qualitative evaluations reveal that DiVATM generates more distinct and interpretable topics. Ablation studies confirm that β-TCVAE contributes to a 20% increase in topic diversity. DiVATM advances unsupervised topic modeling, offering a robust framework for future neural representation learning research.
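
For context, the β-TCVAE objective named in the abstract is usually written as a decomposition of the VAE objective into three regularization terms. The block below is a sketch of that standard formulation from the disentanglement literature, not necessarily the exact loss used in DiVATM. The notation is assumed here rather than taken from the abstract: x_n is the n-th document (for example, a bag-of-words vector), z the latent topic variables with components z_j, q(z) the aggregate posterior, and α, β, γ the term weights.

```latex
\mathcal{L}_{\beta\text{-TC}}
  = \mathbb{E}_{q(z \mid n)\, p(n)}\big[\log p(x_n \mid z)\big]
  - \alpha\, I_q(z; n)
  - \beta\, \mathrm{KL}\Big(q(z) \,\Big\|\, \prod_j q(z_j)\Big)
  - \gamma \sum_j \mathrm{KL}\big(q(z_j) \,\|\, p(z_j)\big)
```

The second term is the mutual information between documents and latents, the third is the total correlation that penalizes statistical dependence among latent dimensions, and the fourth is the dimension-wise KL divergence to the prior. Setting α = γ = 1 and β > 1 puts extra weight on the total-correlation term only, which is the mechanism commonly credited with encouraging disentangled latent dimensions.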

Original language: English
Title of host publication: 2025 International Conference on Smart Computing, IoT and Machine Learning, SIML 2025
ISBN (Electronic): 9798331522780
Publication status: Published - 2025
Event: 2025 International Conference on Smart Computing, IoT and Machine Learning, SIML 2025 - Hybrid, Surakarta, Indonesia
Duration: 3 Jun 2025 to 4 Jun 2025

Publication series

Name: 2025 International Conference on Smart Computing, IoT and Machine Learning, SIML 2025

Conference

Conference: 2025 International Conference on Smart Computing, IoT and Machine Learning, SIML 2025
Country/Territory: Indonesia
City: Hybrid, Surakarta
Period: 3/06/25 to 4/06/25

Keywords

  • disentangled variational autoencoders (VAEs)
  • latent variable disentanglement
  • natural language processing
  • unsupervised topic modeling
  • β-TCVAE
