A novel hybrid model integrating MFCC and acoustic parameters for voice disorder detection

  • Vyom Verma
  • Anish Benjwal
  • Amit Chhabra
  • Sunil K. Singh
  • Sudhakar Kumar
  • Brij B. Gupta
  • Varsha Arya
  • Kwok Tai Chui

Research output: Contribution to journal (article, peer-reviewed)

43 Citations (Scopus)

Abstract

Voice is an essential component of human communication, serving as a fundamental medium for expressing thoughts, emotions, and ideas. Disruptions in vocal fold vibratory patterns can lead to voice disorders, which can have a profound impact on interpersonal interactions. Early detection of voice disorders is crucial for improving voice health and quality of life. This research proposes a novel methodology called VDDMFS [voice disorder detection using MFCC (Mel-frequency cepstral coefficients), fundamental frequency, and spectral centroid], which combines an artificial neural network (ANN) trained on acoustic attributes with a long short-term memory (LSTM) model trained on MFCC attributes. The probabilities generated by the ANN and LSTM models are then stacked and used as input to XGBoost, which classifies a voice as disordered or not. This approach achieved an accuracy of 95.67%, a sensitivity of 95.36%, a specificity of 96.49%, and an F1 score of 96.9%, outperforming existing techniques.
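
The stacking step described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the base-model probabilities below are made-up toy values standing in for ANN and LSTM outputs, and a single decision stump (one threshold on one feature) stands in for XGBoost's boosted trees. All function names (`stack_probabilities`, `fit_stump`, `predict_stump`, `binary_metrics`) are hypothetical. The sketch also shows how the four reported metrics follow from a confusion matrix.

```python
def stack_probabilities(p_ann, p_lstm):
    """Pair per-sample probabilities from two base models into meta-features."""
    if len(p_ann) != len(p_lstm):
        raise ValueError("both base models must score the same samples")
    return [[a, b] for a, b in zip(p_ann, p_lstm)]


def fit_stump(X, y):
    """Fit a one-split tree: predict 1 when X[feature] >= threshold.

    Assumption: a stump is a deliberately tiny stand-in for XGBoost.
    """
    best = None  # (errors, feature index, threshold)
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            errors = sum((row[j] >= thr) != bool(label) for row, label in zip(X, y))
            if best is None or errors < best[0]:
                best = (errors, j, thr)
    if best is None:
        raise ValueError("need at least two distinct values in some feature")
    return best[1], best[2]


def predict_stump(X, feature, threshold):
    """Apply the fitted stump to stacked meta-features."""
    return [int(row[feature] >= threshold) for row in X]


def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity and F1 from binary labels (1 = disordered)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)

    def ratio(num, den):
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),   # true positive rate
        "specificity": ratio(tn, tn + fp),   # true negative rate
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


# Toy values only: hypothetical P(disordered) from each base model.
p_ann = [0.9, 0.8, 0.3, 0.2, 0.7, 0.1]
p_lstm = [0.8, 0.6, 0.4, 0.1, 0.9, 0.3]
labels = [1, 1, 0, 0, 1, 0]

meta_X = stack_probabilities(p_ann, p_lstm)
feature, threshold = fit_stump(meta_X, labels)
preds = predict_stump(meta_X, feature, threshold)
metrics = binary_metrics(labels, preds)
```

In a real stacked pipeline the meta-learner would normally be trained on out-of-fold base-model probabilities rather than on scores for the same samples the base models were fitted on, so that it does not learn from overconfident training-set outputs; the abstract does not say how the authors handled this.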

Original language: English
Article number: 22719
Journal: Scientific Reports
Volume: 13
Issue number: 1
Publication status: Published - Dec 2023
