Feature reassembly and self-attention for oriented object detection in remote sensing images
Vol. 27, Issue 12, Pages 2716-2725 (2023)
Published: 07 December 2023
DOI: 10.11834/jrs.20233456
Min L T, Fan Z M, Xie X X and Lyu Q Y. 2023. Feature reassembly and self-attention for oriented object detection in remote sensing images. National Remote Sensing Bulletin, 27(12): 2716-2725
Oriented object detection in remote sensing images is a highly challenging task that has attracted widespread attention. With the rapid advancement of deep learning, neural networks based on convolutional neural networks (CNNs) and self-attention networks (e.g., Transformers) have achieved remarkable progress in oriented object detection. However, existing methods still pay insufficient attention to the boundary information and salient feature information of oriented objects in remote sensing images. Specifically, boundary information for objects with varying orientations is limited and difficult to extract, and the global dependencies among salient features are relatively sparse. To address these issues, we propose a method of oriented object detection in remote sensing images on the basis of feature reassembly and self-attention. This method consists of a regression branch that incorporates spatial channel reassembly and a self-attention classification branch. The regression branch reassembles spatial information along the channel dimension and emphasizes boundary-sensitive information to achieve accurate localization of bounding boxes. The classification branch leverages self-attention with positional information to capture fundamentally discriminative object features and to strengthen global feature dependencies, which enables accurate classification. Extensive experiments demonstrate the effectiveness and robustness of the proposed model, which performs well on the publicly available DOTA, HRSC2016, and SODA-A datasets.
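The abstract does not specify the layer-level design of either branch, so the following is only a minimal NumPy sketch of the two general ideas it names, not the authors' implementation. It assumes that "reassembling spatial information along the channel dimension" can be illustrated by a space-to-depth rearrangement, and that "self-attention with positional information" can be illustrated by single-head scaled dot-product attention with identity projections and sinusoidal positional encodings. All function names, shapes, and the block size `r` are hypothetical.

```python
import numpy as np


def space_to_depth(x: np.ndarray, r: int = 2) -> np.ndarray:
    """Move each r x r spatial block of a (C, H, W) map into the channel axis.

    Returns an array of shape (C * r * r, H // r, W // r). No values are
    discarded; they are only rearranged.
    """
    c, h, w = x.shape
    assert h % r == 0 and w % r == 0, "H and W must be divisible by r"
    x = x.reshape(c, h // r, r, w // r, r)   # split H and W into blocks
    x = x.transpose(0, 2, 4, 1, 3)           # (C, r, r, H/r, W/r)
    return x.reshape(c * r * r, h // r, w // r)


def sinusoidal_positions(n: int, d: int) -> np.ndarray:
    """Sine/cosine positional encoding of shape (n, d); d must be even."""
    pos = np.arange(n)[:, None]
    freq = np.exp(-np.log(10000.0) * np.arange(0, d, 2) / d)
    enc = np.zeros((n, d))
    enc[:, 0::2] = np.sin(pos * freq)
    enc[:, 1::2] = np.cos(pos * freq)
    return enc


def self_attention(tokens: np.ndarray):
    """Single-head self-attention over (N, D) tokens (assumed form).

    Identity query/key/value projections keep the sketch short.
    Returns (output of shape (N, D), attention weights of shape (N, N)).
    """
    n, d = tokens.shape
    x = tokens + sinusoidal_positions(n, d)       # inject position information
    scores = x @ x.T / np.sqrt(d)
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ x, weights


rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 16, 16))       # hypothetical (C, H, W) feature map

reg_feat = space_to_depth(feat, r=2)          # regression-style input: (32, 8, 8)
cls_tokens = feat.reshape(8, -1).T            # one token per location: (256, 8)
cls_feat, attn = self_attention(cls_tokens)   # (256, 8) and (256, 256)
```

In this sketch, the rearrangement keeps every value of the input map while trading spatial resolution for channels, and each row of `attn` weights all spatial locations, which is one way to model the global dependencies mentioned in the abstract.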
remote sensing image; oriented object detection; detection head; feature reassembly; self-attention