TY - JOUR
T1 - Application of a deep learning algorithm for the diagnosis of HCC
AU - Yu, Philip Leung Ho
AU - Chiu, Keith Wan Hang
AU - Lu, Jianliang
AU - Lui, Gilbert C.S.
AU - Zhou, Jian
AU - Cheng, Ho Ming
AU - Mao, Xianhua
AU - Wu, Juan
AU - Shen, Xin Ping
AU - Kwok, King Ming
AU - Kan, Wai Kuen
AU - Ho, Y. C.
AU - Chan, Hung Tat
AU - Xiao, Peng
AU - Mak, Lung Yi
AU - Tsui, Vivien W.M.
AU - Hui, Cynthia
AU - Lam, Pui Mei
AU - Deng, Zijie
AU - Guo, Jiaqi
AU - Ni, Li
AU - Huang, Jinhua
AU - Yu, Sarah
AU - Peng, Chengzhi
AU - Li, Wai Keung
AU - Yuen, Man Fung
AU - Seto, Wai Kay
N1 - Publisher Copyright: © 2024 The Author(s)
PY - 2025/1
Y1 - 2025/1
AB - Background & Aims: Hepatocellular carcinoma (HCC) is characterized by a high mortality rate. The Liver Imaging Reporting and Data System (LI-RADS) results in a considerable number of indeterminate observations, rendering an accurate diagnosis difficult. Methods: We developed four deep learning models for diagnosing HCC on computed tomography (CT) via a training–validation–testing approach. Thin-slice triphasic CT liver images and relevant clinical information were collected and processed for deep learning. HCC was diagnosed and verified via a 12-month clinical composite reference standard. CT observations among at-risk patients were annotated using LI-RADS. Diagnostic performance was assessed by internal validation and independent external testing. We conducted sensitivity analyses of different subgroups, deep learning explainability evaluation, and misclassification analysis. Results: From 2,832 patients and 4,305 CT observations, the best-performing model was Spatio-Temporal 3D Convolution Network (ST3DCN), achieving area under receiver-operating-characteristic curves (AUCs) of 0.919 (95% CI, 0.903–0.935) and 0.901 (95% CI, 0.879–0.924) at the observation (n = 1,077) and patient (n = 685) levels, respectively, during internal validation, compared with 0.839 (95% CI, 0.814–0.864) and 0.822 (95% CI, 0.790–0.853), respectively, for standard-of-care radiological interpretation. The negative predictive values of ST3DCN were 0.966 (95% CI, 0.954–0.979) and 0.951 (95% CI, 0.931–0.971), respectively. The observation-level AUCs among at-risk patients, 2–5-cm observations, and singular portovenous phase analysis of ST3DCN were 0.899 (95% CI, 0.874–0.924), 0.872 (95% CI, 0.838–0.909) and 0.912 (95% CI, 0.895–0.929), respectively. In external testing (551/717 patients/observations), the AUC of ST3DCN was 0.901 (95% CI, 0.877–0.924), which was non-inferior to radiological interpretation (AUC 0.900; 95% CI, 0.877–0.923). Conclusions: ST3DCN achieved strong, robust performance for accurate HCC diagnosis on CT. Thus, deep learning can expedite and improve the process of diagnosing HCC. Impact and implications: The clinical applicability of deep learning in HCC diagnosis is potentially huge, especially considering the expected increase in the incidence and mortality of HCC worldwide. Early diagnosis through deep learning can lead to earlier definitive management, particularly for at-risk patients. The model can be broadly deployed for patients undergoing a triphasic contrast CT scan of the liver to reduce the currently high mortality rate of HCC.
KW - AI
KW - CT
KW - HCC
KW - Imaging
KW - LIRADS
KW - Liver cancer
UR - https://www.scopus.com/pages/publications/85210546214
U2 - 10.1016/j.jhepr.2024.101219
DO - 10.1016/j.jhepr.2024.101219
M3 - Article
AN - SCOPUS:85210546214
VL - 7
JO - JHEP Reports
JF - JHEP Reports
IS - 1
M1 - 101219
ER -