Multimodal Deep Learning for Rumor Detection on Social Media

Project: Research

Project Details


Social media has become a major channel for people to access information and share their daily lives, opinions, and emotions with the public. It simplifies how information spreads, free of temporal and spatial restrictions. As a result, massive amounts of information circulate online, not all of it true: rumors, fake news, and the like. With the help of social media, false information can propagate faster and more widely than true information. The spread of rumors has a negative impact and may even come to dominate public opinion.

Rumor detection aims to identify false or inaccurate information online, i.e., rumors, defined in the social psychology literature as a story or statement whose truth value is unverified or deliberately false. In this project, we aim to detect rumors on online social media by exploiting multimodal data such as text and images. The necessity and significance of exploiting multimodal data are four-fold. Firstly, multimodal data is usually used comprehensively to attract readers and to tell better stories. Secondly, data in multiple modalities are naturally complementary, which benefits a comprehensive understanding of the story. Thirdly, the difference in discriminative power between text and images has been ignored, although text is usually easier for machines to extract semantic information from. Finally, approaches trained on data from known events may not be applicable to data from new events.

In dealing with multimodal data, there are two critical challenges. The first is bridging the “semantic gap” among heterogeneous data modalities, which makes it difficult to apply machine learning models directly. The second is modeling the underlying relations among the modalities. In the context of rumor detection, quite a few works have been proposed that fuse heterogeneous modalities, but they suffer from some deficiencies. Firstly, many works extract features manually, which requires considerable manpower, and the resulting features may have limited expressive ability. Secondly, recent deep learning models simply concatenate textual and visual features in a neural network framework, neglecting the complementary relationship between the modalities.

To this end, we aim to design a self-attentive fusion mechanism that fuses the multimodal data at the feature level, assigning corresponding weights to the complementary modalities in order to bridge the gap among them. To model their underlying relations, we will train modality-specific networks jointly with a distance constraint that captures the semantic distance among the heterogeneous modalities. In particular, considering the event characteristics of rumors, we plan to introduce a latent topic memory network that stores the topics shared among rumor and non-rumor events, so that each incoming post can be matched to a similar topic, benefiting the identification of rumors.
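The three ingredients above can be illustrated with a minimal NumPy sketch. This is not the project's implementation; the attention parameters `W` and `v`, the contrastive form of the distance constraint, and the cosine-similarity memory read are all illustrative assumptions about how such components are commonly realized.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()

def self_attentive_fusion(text_feat, img_feat, W, v):
    """Fuse two modality feature vectors with learned attention weights.

    W (d, d) and v (d,) are hypothetical attention parameters that would
    be learned jointly with the rest of the network.
    """
    feats = np.stack([text_feat, img_feat])   # (2, d): one row per modality
    scores = np.tanh(feats @ W) @ v           # (2,): one relevance score per modality
    weights = softmax(scores)                 # modality weights summing to 1
    fused = weights @ feats                   # (d,): weighted sum of modality features
    return fused, weights

def distance_constraint(text_feat, img_feat, label, margin=1.0):
    """Contrastive-style distance loss between modality embeddings:
    pull semantically matching text/image pairs together (label == 1),
    push mismatched pairs at least `margin` apart (label == 0)."""
    d = np.linalg.norm(text_feat - img_feat)
    return d ** 2 if label == 1 else max(0.0, margin - d) ** 2

def topic_memory_read(query, memory):
    """Soft read from a latent topic memory: attend over stored topic
    vectors by cosine similarity and return the blended topic."""
    sims = memory @ query / (np.linalg.norm(memory, axis=1)
                             * np.linalg.norm(query) + 1e-8)
    return softmax(sims) @ memory
```

The attention step assigns each modality a weight before summation, so a post whose image carries little evidence contributes less to the fused representation than one whose image is decisive, unlike plain concatenation.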
Effective start/end date: 1/01/22 – 31/12/24


  • Faculty Development Scheme: HK$1,248,950.00


  • Social Media Mining
  • Deep Learning
  • Multimodal Fusion
  • Memory Network
  • Rumor Detection

