TY - JOUR
T1 - Point Transformer-Based Salient Object Detection Network for 3-D Measurement Point Clouds
AU - Wei, Zeyong
AU - Chen, Baian
AU - Wang, Weiming
AU - Chen, Honghua
AU - Wei, Mingqiang
AU - Li, Jonathan
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - While salient object detection (SOD) on 2-D images has been extensively studied, there is very little SOD work on 3-D measurement surfaces. We propose an effective point transformer-based SOD network for 3-D measurement point clouds, termed PSOD-Net. PSOD-Net is an encoder-decoder network that takes full advantage of transformers to model the contextual information in both multiscale point- and scenewise manners. In the encoder, we develop a point context transformer (PCT) module to capture region contextual features at the point level; PCT contains two different transformers to excavate the relationship among points. In the decoder, we develop a scene context transformer (SCT) module to learn context representations at the scene level; SCT contains both upsampling-and-transformer (UT) blocks and multicontext aggregation (MCA) units to integrate the global semantic and multilevel features from the encoder into the global scene context. Experiments show clear improvements of PSOD-Net over its competitors and validate that PSOD-Net is more robust to challenging cases such as small objects, multiple objects, and objects with complex structures. Code is available at: https://github.com/ZeyongWei/PSOD-Net.
KW - 3-D measurement point cloud
KW - 3-D salient object detection (SOD)
KW - PSOD-Net
KW - point transformer
UR - http://www.scopus.com/inward/record.url?scp=85182917108&partnerID=8YFLogxK
U2 - 10.1109/TGRS.2024.3355968
DO - 10.1109/TGRS.2024.3355968
M3 - Article
AN - SCOPUS:85182917108
SN - 0196-2892
VL - 62
SP - 1
EP - 11
JO - IEEE Transactions on Geoscience and Remote Sensing
JF - IEEE Transactions on Geoscience and Remote Sensing
M1 - 5701511
ER -