TY - JOUR
T1 - A Swin Transformer based on multi-directional-shift window attention and inductive bias for diagnosis of pleural effusion
AU - Tian, Zekun
AU - Peng, Dunlu
AU - Wang, Debby D.
AU - Zhang, Linna
AU - Zou, Zheng
AU - Huang, Hejing
AU - Zhang, Shiqi
N1 - Publisher Copyright: © 2025
PY - 2025/6
Y1 - 2025/6
AB - In the field of healthcare, deep learning has shown promise in addressing diagnostic challenges. However, existing methods often struggle with generalization due to overfitting on non-discriminative features and limited datasets. To address these limitations, Ultra-Multi-SWIN is introduced as a novel deep learning model for pleural effusion diagnosis using ultrasound images. The model incorporates physician-inspired inductive biases into its architecture, enabling it to focus on discriminative features while avoiding overfitting to irrelevant information. Specifically, a multi-directional-shift window structure captures spatial features dependent on direction, and a MASK-based masking module suppresses redundant non-ultrasound features. A dataset comprising 50 subjects and four levels of pleural effusion severity (large, moderate, small, none) is established to evaluate the model's performance. Experimental results demonstrate that Ultra-Multi-SWIN achieves state-of-the-art performance, with average accuracies of 0.988 (subject-dependent) and 0.952 (subject-independent). Visualization and ablation studies further confirm the model's ability to generalize effectively by focusing on clinically relevant regions. The open-source code is released at Ultra-Multi-SWIN, promoting broader adoption and future research.
KW - Deep learning
KW - Pleural effusion
KW - Swin Transformer
KW - Ultrasonography
UR - https://www.scopus.com/pages/publications/105004652147
U2 - 10.1016/j.asoc.2025.113146
DO - 10.1016/j.asoc.2025.113146
M3 - Article
AN - SCOPUS:105004652147
SN - 1568-4946
VL - 177
JO - Applied Soft Computing Journal
JF - Applied Soft Computing Journal
M1 - 113146
ER -