TY - JOUR
T1 - Trends and features of the applications of natural language processing techniques for clinical trials text analysis
AU - Chen, Xieling
AU - Xie, Haoran
AU - Cheng, Gary
AU - Poon, Leonard K.M.
AU - Leng, Mingming
AU - Wang, Fu Lee
N1 - Publisher Copyright: © 2020 by the authors.
PY - 2020/3/1
Y1 - 2020/3/1
N2 - Natural language processing (NLP) is an effective tool for generating structured information from unstructured data, which is commonly found in clinical trial texts. Such interdisciplinary research has gradually grown into a flourishing research field with accumulated scientific outputs available. In this study, bibliographical data collected from the Web of Science, PubMed, and Scopus databases from 2001 to 2018 were investigated using three prominent methods: performance analysis, science mapping, and, in particular, an automatic text analysis approach named structural topic modeling. Topical trend visualization and test analysis were further employed to quantify the effects of the year of publication on topic proportions. The distributions of topics across prolific countries/regions and institutions were also visualized and compared. In addition, scientific collaborations between countries/regions, institutions, and authors were explored using social network analysis. The findings are essential for facilitating the development of NLP-enhanced clinical trial text processing, boosting scientific and technological NLP-enhanced clinical trial research, and facilitating inter-country/region and inter-institution collaborations.
AB - Natural language processing (NLP) is an effective tool for generating structured information from unstructured data, which is commonly found in clinical trial texts. Such interdisciplinary research has gradually grown into a flourishing research field with accumulated scientific outputs available. In this study, bibliographical data collected from the Web of Science, PubMed, and Scopus databases from 2001 to 2018 were investigated using three prominent methods: performance analysis, science mapping, and, in particular, an automatic text analysis approach named structural topic modeling. Topical trend visualization and test analysis were further employed to quantify the effects of the year of publication on topic proportions. The distributions of topics across prolific countries/regions and institutions were also visualized and compared. In addition, scientific collaborations between countries/regions, institutions, and authors were explored using social network analysis. The findings are essential for facilitating the development of NLP-enhanced clinical trial text processing, boosting scientific and technological NLP-enhanced clinical trial research, and facilitating inter-country/region and inter-institution collaborations.
KW - Bibliometrics
KW - Clinical trials text
KW - Collaboration
KW - Natural language processing
KW - Structural topic modeling
UR - http://www.scopus.com/inward/record.url?scp=85082696456&partnerID=8YFLogxK
U2 - 10.3390/app10062157
DO - 10.3390/app10062157
M3 - Article
AN - SCOPUS:85082696456
VL - 10
JO - Applied Sciences (Switzerland)
JF - Applied Sciences (Switzerland)
IS - 6
M1 - 2157
ER -