Copula Guided Parallel Gibbs Sampling for Nonparametric and Coherent Topic Discovery (Extended Abstract)

Lihui Lin, Yanghui Rao, Haoran Xie, Raymond Y.K. Lau, Jian Yin, Fu Lee Wang, Qing Li

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

Abstract

In terms of the generative process, the Gamma-Gamma-Poisson Process (G2PP) is equivalent to the nonparametric topic model of Hierarchical Dirichlet Process (HDP). Considering the high computational cost of estimating parameters in HDP, a parallel G2PP was developed to generate topics efficiently via multi-threading. Unfortunately, the above model needs to predefine the number of topics. To address this issue, we first propose a Topic Self-Adaptive Model (TSAM) for nonparametric and parallel topic discovery. In TSAM, a monitor-executor mechanism is developed to manage the global topic information using a hierarchical structure of threads. Based on the apparatus of copulas, we further extend our TSAM to TSAMcop for coherent topic modeling by exploiting a copula guided parallel Gibbs sampling algorithm. Extensive experiments validate the effectiveness of both TSAM and TSAMcop.

Original languageEnglish
Title of host publicationProceedings - 2023 IEEE 39th International Conference on Data Engineering, ICDE 2023
Pages3823-3824
Number of pages2
ISBN (Electronic)9798350322279
DOIs
Publication statusPublished - 2023
Event39th IEEE International Conference on Data Engineering, ICDE 2023 - Anaheim, United States
Duration: 3 Apr 20237 Apr 2023

Publication series

NameProceedings - International Conference on Data Engineering
Volume2023-April
ISSN (Print)1084-4627

Conference

Conference39th IEEE International Conference on Data Engineering, ICDE 2023
Country/TerritoryUnited States
CityAnaheim
Period3/04/237/04/23

Keywords

  • copulas
  • parallel gibbs sampling
  • topic modelling

Fingerprint

Dive into the research topics of 'Copula Guided Parallel Gibbs Sampling for Nonparametric and Coherent Topic Discovery (Extended Abstract)'. Together they form a unique fingerprint.

Cite this