TY - JOUR
T1 - Combined generative adversarial network and fuzzy C-means clustering for multi-class voice disorder detection with an imbalanced dataset
AU - Chui, Kwok Tai
AU - Lytras, Miltiadis D.
AU - Vasant, Pandian
N1 - Publisher Copyright: © 2020 by the authors.
PY - 2020/7/1
Y1 - 2020/7/1
AB - The world has witnessed the success of artificial intelligence deployment for smart healthcare applications. Various studies have suggested that the prevalence of voice disorders in the general population is greater than 10%. Automatic diagnosis of voice disorders via machine learning algorithms is desirable to reduce the cost and time needed for examination by doctors and speech-language pathologists. In this paper, an algorithm called CGAN-IFCM, which combines a conditional generative adversarial network (CGAN) with improved fuzzy c-means clustering (IFCM), is proposed for the multi-class detection of three common types of voice disorders. Existing benchmark datasets for voice disorders, the Saarbruecken Voice Database (SVD) and the Voice ICar fEDerico II Database (VOICED), have imbalanced classes. The generative adversarial network generates synthetic data to reduce bias in the detection model. Improved fuzzy c-means clustering considers the relationship between adjacent data points in the fuzzy membership function. To demonstrate the necessity of CGAN and IFCM, the algorithm is compared with and without CGAN, and IFCM is compared with traditional fuzzy c-means clustering. Lastly, the proposed CGAN-IFCM outperforms existing models in true negative rate and true positive rate by 9.9-12.9% and 9.1-44.8%, respectively.
KW - Artificial intelligence
KW - Fuzzy c-means clustering
KW - Generative adversarial network
KW - Imbalanced dataset
KW - Machine learning
KW - Multi-class detection
KW - Smart healthcare
KW - Synthetic data
KW - Voice disorders
UR - http://www.scopus.com/inward/record.url?scp=85087832634&partnerID=8YFLogxK
U2 - 10.3390/app10134571
DO - 10.3390/app10134571
M3 - Article
AN - SCOPUS:85087832634
VL - 10
JO - Applied Sciences (Switzerland)
JF - Applied Sciences (Switzerland)
IS - 13
M1 - 4571
ER -