TY - JOUR
T1 - Lost in UNet
T2 - Improving Infrared Small Target Detection by Underappreciated Local Features
AU - Quan, Wuzhou
AU - Zhao, Wei
AU - Wang, Weiming
AU - Xie, Haoran
AU - Lee Wang, Fu
AU - Wei, Mingqiang
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2025
Y1 - 2025
N2 - Infrared small target detection (ISTD) is a challenging task due to the low contrast and small size of the targets, which are often affected by complex backgrounds. UNet and its variants, known for their encoder-decoder structures, are widely used in such tasks since they can capture both local and global features. However, a significant drawback of UNet-based networks is the irreversible loss of crucial local features during downsampling, leading to missed detections and false positives, especially for small targets. Compared to other architectures like feature pyramid networks, UNet provides a more symmetric and efficient structure, allowing it to handle dense pixel-wise predictions effectively. However, standard UNet models still struggle to fully retain small target details, motivating the need for further improvements. To address this issue, we propose HintU, a novel network to recover the local features lost by various UNet-based methods for effective ISTD. HintU has two key contributions. First, it introduces the "Hint" mechanism for the first time, i.e., leveraging the prior knowledge of target locations to highlight critical local features. Second, it improves the mainstream UNet-based architecture to preserve target pixels even after downsampling. HintU can shift the focus of various networks (e.g., vanilla UNet, UNet++, UIUNet, MiM+, and HCFNet) from the irrelevant background pixels to a more restricted area from the beginning. Experimental results on three datasets (NUDT-SIRST, SIRSTv2, and IRSTD1K) demonstrate that HintU enhances the performance of existing methods with only an additional 1.88-ms cost (on RTX Titan). Additionally, the explicit constraints of HintU enhance the generalization ability of UNet-based methods. Code is available at https://github.com/Wuzhou-Quan/HintU.
AB - Infrared small target detection (ISTD) is a challenging task due to the low contrast and small size of the targets, which are often affected by complex backgrounds. UNet and its variants, known for their encoder-decoder structures, are widely used in such tasks since they can capture both local and global features. However, a significant drawback of UNet-based networks is the irreversible loss of crucial local features during downsampling, leading to missed detections and false positives, especially for small targets. Compared to other architectures like feature pyramid networks, UNet provides a more symmetric and efficient structure, allowing it to handle dense pixel-wise predictions effectively. However, standard UNet models still struggle to fully retain small target details, motivating the need for further improvements. To address this issue, we propose HintU, a novel network to recover the local features lost by various UNet-based methods for effective ISTD. HintU has two key contributions. First, it introduces the "Hint" mechanism for the first time, i.e., leveraging the prior knowledge of target locations to highlight critical local features. Second, it improves the mainstream UNet-based architecture to preserve target pixels even after downsampling. HintU can shift the focus of various networks (e.g., vanilla UNet, UNet++, UIUNet, MiM+, and HCFNet) from the irrelevant background pixels to a more restricted area from the beginning. Experimental results on three datasets (NUDT-SIRST, SIRSTv2, and IRSTD1K) demonstrate that HintU enhances the performance of existing methods with only an additional 1.88-ms cost (on RTX Titan). Additionally, the explicit constraints of HintU enhance the generalization ability of UNet-based methods. Code is available at https://github.com/Wuzhou-Quan/HintU.
KW - HintU
KW - UNet
KW - infrared small target detection (ISTD)
UR - http://www.scopus.com/inward/record.url?scp=105001068859&partnerID=8YFLogxK
U2 - 10.1109/TGRS.2024.3504594
DO - 10.1109/TGRS.2024.3504594
M3 - Article
AN - SCOPUS:105001068859
SN - 0196-2892
VL - 63
JO - IEEE Transactions on Geoscience and Remote Sensing
JF - IEEE Transactions on Geoscience and Remote Sensing
M1 - 5000115
ER -