View-Consistency Network for Weakly Supervised Oriented Object Detection in Remote Sensing Images

Fang Tingting; Liu Bin; Chen Chunhui; Li Xiangyun

doi:10.11834/jrs.20243138

Pages: 1-17 (2024)
Published online: 08 March 2024
Fang Tingting, Liu Bin, Chen Chunhui, Li Xiangyun. XXXX. View-Consistency Network for Weakly Supervised Oriented Object Detection in Remote Sensing Images. National Remote Sensing Bulletin, XX(XX): 1-17. DOI: 10.11834/jrs.20243138.

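Abstract (translated from the Chinese)

To reduce the annotation burden of oriented object detection and to improve the availability and richness of remote sensing oriented object detection datasets, this paper proposes an oriented object detection model that uses only image-level annotations. The core of the model is a detection network designed around three consistency constraints: consistency of image-level annotations, consistency of oriented bounding box positions, and consistency of cluster center distributions. First, the paper introduces a weakly supervised oriented object detection paradigm for remote sensing scenes, which refines predictions stage by stage and trains the model using the consistency of image-level annotations. Second, to improve the accuracy of rotation angle prediction, the model adjusts its angle predictions through a spatial position consistency constraint on the oriented bounding boxes obtained under different rotated views. Finally, to keep the distribution of cluster center nodes consistent across rotated views, a Hungarian loss is designed to measure the consistency of the cluster center distributions. With these constraints, the model can better exploit rotation equivariance across rotated views and improve detection performance. To validate the model, experiments were conducted on two mainstream remote sensing object detection datasets, DIOR and DOTA-v1.0. The model achieves 37.7% mAP on DIOR, an improvement of 9.4% mAP over the state-of-the-art horizontal weakly supervised model. On DOTA it achieves 28.1% mAP and, in several categories, outperforms the state-of-the-art weakly supervised oriented detection model that relies on the easier setting of horizontal box annotations. These results indicate that the model has good potential for easing annotation difficulty and improving remote sensing object detection performance. In future work, researchers can extend this image-level-annotation-based oriented object detection model, combine it with more advanced techniques and different data sources, and explore broader applications in remote sensing, geospatial analysis, and other related fields.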
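The position consistency constraint on oriented boxes can be illustrated with a small sketch. It assumes that boxes are parameterized as (cx, cy, w, h, theta) with theta in radians, that a second view is produced by rotating the image by a known angle phi about a chosen origin, and that the penalty is a plain sum of absolute differences. None of these choices is taken from the paper, and the names rotate_box, angle_gap, and position_consistency are hypothetical.

```python
import math


def rotate_box(box, phi, origin=(0.0, 0.0)):
    """Map an oriented box (cx, cy, w, h, theta) into a view rotated by phi.

    Assumption: counter-clockwise rotation in standard x-right, y-up axes.
    Image coordinates with y pointing down would flip the sign of phi.
    """
    cx, cy, w, h, theta = box
    ox, oy = origin
    dx, dy = cx - ox, cy - oy
    c, s = math.cos(phi), math.sin(phi)
    # The center rotates about the origin; width and height are unchanged;
    # the box angle is shifted by the view rotation.
    return (ox + c * dx - s * dy, oy + s * dx + c * dy, w, h, theta + phi)


def angle_gap(a, b):
    """Smallest absolute difference between two box angles.

    A rectangle rotated by pi is the same rectangle, so angles are
    compared with period pi.
    """
    d = (a - b) % math.pi
    return min(d, math.pi - d)


def position_consistency(box_view1, box_view2, phi, origin=(0.0, 0.0)):
    """Illustrative penalty between a box predicted in the original view
    and the box predicted for the same object in the rotated view."""
    expected = rotate_box(box_view1, phi, origin)
    center_term = abs(expected[0] - box_view2[0]) + abs(expected[1] - box_view2[1])
    size_term = abs(expected[2] - box_view2[2]) + abs(expected[3] - box_view2[3])
    return center_term + size_term + angle_gap(expected[4], box_view2[4])
```

If the prediction in the rotated view equals the rotated prediction from the original view, the penalty is zero; any disagreement in center, size, or angle raises it, which is the sense in which the two views are asked to agree.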

Abstract

Objective: This study proposes an oriented object detection model for remote sensing images that avoids labor-intensive, time-consuming annotation. The model requires only image-level annotations, which improves the availability and diversity of remote sensing oriented object detection datasets. Conventional approaches rely on bounding box annotations, which can be subjective, inconsistent, and slow to create. Training from image-level annotations greatly reduces annotation effort, accelerates the annotation process, and makes remote sensing object detection easier to scale and apply across scenarios and domains.

Method: The method introduces a weakly supervised oriented object detection paradigm for remote sensing scenes. The model is trained progressively, starting from coarse image-level annotations and gradually refining the detection results. This allows it to learn from limited annotations and to adapt to remote sensing data, which often exhibit large scale, diverse appearance, and significant rotation variation. Three consistency constraints improve the accuracy and robustness of detection: consistency of the image-level annotation, consistency of the oriented bounding box position, and consistency of the cluster center distribution across different rotation views. The position constraint keeps the predicted rotation angles of the bounding boxes consistent across views of the same object. The distribution constraint, measured with a Hungarian loss, keeps the predicted cluster centers consistent across views. Together these constraints target rotation variation, a common challenge in remote sensing object detection.

Result: The model is evaluated on two mainstream remote sensing object detection datasets, DIOR and DOTA-v1.0. On DIOR it reaches 37.7% mAP, 9.4% mAP above the state-of-the-art horizontal weakly supervised model. On DOTA-v1.0 it reaches 28.1% mAP and, in several categories, outperforms the state-of-the-art detector trained under the less strict weakly supervised setting of horizontal box annotations. These results highlight the effectiveness and potential of image-level-annotation-based oriented object detection for remote sensing, and the gains on both datasets suggest that the model is robust and versatile.

Conclusion: The proposed image-level-annotation-based oriented object detection model addresses the labor-intensive, time-consuming annotation process in remote sensing object detection. By combining image-level annotations with consistency constraints, it improves detection performance while reducing annotation effort. The experimental results on DIOR and DOTA-v1.0 show its advantage over state-of-the-art weakly supervised models and its potential for practical use in remote sensing object detection and related fields. Future research can extend such models by incorporating more advanced techniques, exploring different data sources, and investigating real-world applications in remote sensing, geospatial analysis, and other related fields.
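The idea behind measuring cluster center consistency with a Hungarian loss can be sketched as an optimal one-to-one matching between the centers found in two rotation views. The sketch below makes several assumptions that are not from the paper: centers are 2D points, both views yield the same number of centers, the view rotation phi is known, and the matching cost is Euclidean distance. To stay dependency-free it finds the optimal matching by exhaustive search over permutations, which is only practical for a handful of centers; a real implementation would use the Hungarian algorithm itself (for example, SciPy's `linear_sum_assignment`). The function names are hypothetical.

```python
import math
from itertools import permutations


def rotate_point(p, phi, origin=(0.0, 0.0)):
    """Rotate a 2D point by phi radians (counter-clockwise) about origin."""
    dx, dy = p[0] - origin[0], p[1] - origin[1]
    c, s = math.cos(phi), math.sin(phi)
    return (origin[0] + c * dx - s * dy, origin[1] + s * dx + c * dy)


def matching_loss(centers_a, centers_b):
    """Mean distance under the cheapest one-to-one matching of two sets.

    Brute-force stand-in for the Hungarian algorithm: every permutation
    is tried, so the cost grows factorially with the number of centers.
    """
    if len(centers_a) != len(centers_b):
        raise ValueError("this sketch assumes equally sized center sets")
    n = len(centers_a)
    if n == 0:
        return 0.0
    # cost[i][j] is the distance between center i of A and center j of B.
    cost = [[math.dist(a, b) for b in centers_b] for a in centers_a]
    best = min(
        sum(cost[i][perm[i]] for i in range(n))
        for perm in permutations(range(n))
    )
    return best / n


def center_consistency(centers_view1, centers_view2, phi, origin=(0.0, 0.0)):
    """Align view-1 centers to the rotated view, then match them to view 2."""
    aligned = [rotate_point(p, phi, origin) for p in centers_view1]
    return matching_loss(aligned, centers_view2)
```

Because the matching is order-independent, the loss is zero whenever the second view contains the rotated centers in any order, and it grows only with genuine disagreement between the two sets.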

Keywords

Oriented Object Detection; Object Detection; Weakly Supervised; Deep Learning; Rotation Consistency; Image-level Label; Hungarian Loss; DOTA Dataset

