View-Consistency Network for Weakly Supervised Oriented Object Detection in Remote Sensing Images
Pages: 1-17 (2024)
Published Online: 08 March 2024
DOI: 10.11834/jrs.20243138
Fang Tingting, Liu Bin, Chen Chunhui, Li Xiangyun. XXXX. View-Consistency Network for Weakly Supervised Oriented Object Detection in Remote Sensing Images. National Remote Sensing Bulletin, XX(XX): 1-17
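To reduce the annotation burden of oriented object detection and to improve the availability and richness of remote sensing oriented object detection datasets, this paper proposes an oriented object detection model that uses only image-level annotations. The core of the model is a detection network designed around three consistency constraints: consistency of image-level annotations, consistency of oriented bounding box positions, and consistency of cluster center distributions. First, the paper introduces a weakly supervised oriented object detection paradigm for remote sensing scenes, which refines the predictions stage by stage and trains the model using the consistency of image-level annotations. Second, to improve the accuracy of rotation angle prediction, the model adjusts its angle predictions through a spatial position consistency constraint on the oriented bounding boxes under different rotated views. Finally, to keep the distribution of cluster center nodes consistent across rotated views, a Hungarian loss is designed to measure the consistency of the cluster center distributions. With these constraints, the model can better exploit rotation equivariance across rotated views and improve detection performance. To validate the model, experiments were conducted on two mainstream remote sensing object detection datasets, DIOR and DOTA-v1.0. The model achieves 37.7% mAP on DIOR, 9.4% mAP higher than the state-of-the-art horizontal weakly supervised model, and 28.1% mAP on DOTA, where it outperforms, in several categories, the state-of-the-art weakly supervised oriented model that relies on the easier setting of horizontal box annotations. This indicates that the model has good potential for easing annotation difficulty and improving remote sensing object detection performance. In the future, researchers can further extend this image-level-annotation-based oriented object detection model, combining more advanced techniques and different data sources, to explore broader applications in remote sensing, geospatial analysis, and other related fields.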
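The positional consistency constraint on oriented boxes can be illustrated with a small sketch. It is not the authors' implementation: the box parameterization `(cx, cy, w, h, theta)`, the rotation center, the counterclockwise sign convention, and the unweighted sum of error terms are all assumptions made for illustration. The idea is that a box predicted in the original view, once mapped into a rotated view, should agree with the box predicted directly in that view.

```python
import math


def rotate_box(box, phi, center=(0.0, 0.0)):
    """Map an oriented box (cx, cy, w, h, theta) into a view rotated by phi
    radians about `center` (hypothetical parameterization, angles in radians)."""
    cx, cy, w, h, theta = box
    ox, oy = center
    dx, dy = cx - ox, cy - oy
    new_cx = ox + dx * math.cos(phi) - dy * math.sin(phi)
    new_cy = oy + dx * math.sin(phi) + dy * math.cos(phi)
    # Width and height are unchanged by rotation; the orientation shifts by phi.
    return (new_cx, new_cy, w, h, theta + phi)


def angle_diff(a, b):
    """Smallest gap between two box orientations, which repeat every pi."""
    d = (a - b) % math.pi
    return min(d, math.pi - d)


def box_consistency(pred_original, pred_rotated, phi, center=(0.0, 0.0)):
    """Illustrative penalty: disagreement between the box predicted in the
    rotated view and the original-view box mapped into that view."""
    expected = rotate_box(pred_original, phi, center)
    center_err = math.hypot(expected[0] - pred_rotated[0],
                            expected[1] - pred_rotated[1])
    size_err = (abs(expected[2] - pred_rotated[2])
                + abs(expected[3] - pred_rotated[3]))
    return center_err + size_err + angle_diff(expected[4], pred_rotated[4])


if __name__ == "__main__":
    original = (10.0, 0.0, 4.0, 2.0, 0.0)
    rotated_prediction = (0.0, 10.0, 4.0, 2.0, math.pi / 2)
    print(box_consistency(original, rotated_prediction, math.pi / 2))
```

The sketch ignores the ambiguity where swapping width and height while shifting the angle by 90 degrees describes the same box; a real loss would need to handle that case.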
Objective: This study proposes a novel oriented object detection model that detects objects in remote sensing images effectively while alleviating the labor-intensive and time-consuming annotation process. Specifically, the aim is to develop an efficient and effective approach that requires only image-level annotations, which can improve the availability and diversity of remote sensing oriented object detection datasets. The proposed model is designed to overcome the limitations of traditional annotation methods that rely on bounding box annotations, which can be subjective, inconsistent, and time-consuming to create. By leveraging image-level annotations, the proposed model can greatly reduce the annotation effort, accelerate the annotation process, and enhance the scalability and applicability of remote sensing object detection in various scenarios and domains.

Method: The proposed method introduces a novel weakly supervised oriented object detection paradigm for remote sensing scenes. The model is trained in a progressive manner, starting with coarse image-level annotations and gradually refining the detection results. This approach allows the model to learn from limited annotations and adapt to the complexities of remote sensing data, which often exhibit large scale, diverse appearance, and significant rotation variations. The model incorporates three consistency constraints across different rotation views, on the image-level annotation, the oriented bounding box position, and the cluster center distribution, to enhance the accuracy and robustness of detection. The oriented bounding box position consistency ensures that the predicted rotation angles of the bounding boxes are consistent across different views of the same object, while the distribution consistency of clustering centers is measured using the Hungarian loss, which ensures that the predicted object centroids are consistent across different views. These consistency constraints are designed to improve the accuracy of the model in handling rotation variations, which is a common challenge in remote sensing object detection.

Result: The proposed model is evaluated on two mainstream remote sensing object detection datasets, DIOR and DOTA-v1.0. The experimental results demonstrate that its performance is significantly improved compared to state-of-the-art weakly supervised remote sensing object detection models. Performance is also significantly improved compared to state-of-the-art object detectors trained under less strict weakly supervised settings in remote sensing images, which highlights the effectiveness and potential of the proposed image-level annotation-based oriented object detection model in addressing the challenges of remote sensing object detection. Furthermore, the model's ability to generalize well to different datasets showcases its robustness and versatility.

Conclusion: The proposed image-level annotation-based oriented object detection model is an innovative approach that addresses the challenges of labor-intensive and time-consuming annotation processes in remote sensing object detection. By leveraging image-level annotations and incorporating consistency constraints, the proposed model achieves improved detection performance while reducing the annotation effort and improving annotation efficiency. The experimental results on the DIOR and DOTA-v1.0 datasets demonstrate the superior performance of the proposed model compared to state-of-the-art models, highlighting its potential for practical applications in remote sensing object detection and related fields. Future research can further explore the potential of image-level annotation-based oriented object detection models by incorporating more advanced techniques, exploring different data sources, and investigating real-world applications in remote sensing, geospatial analysis, and other related fields.
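As a rough illustration of how a Hungarian-style loss can measure whether cluster centers agree across views, the sketch below finds the one-to-one matching between two center sets that minimizes total distance and reports the mean matched distance. Several things here are assumptions rather than details from the paper: the function name `matching_loss`, the Euclidean cost, the requirement that both sets have equal size, and the premise that centers from the rotated view have already been mapped back into the original frame. For brevity it searches all permutations instead of running the Hungarian algorithm; both yield the same optimal matching, but exhaustive search is only practical for a handful of centers.

```python
import math
from itertools import permutations


def matching_loss(centers_a, centers_b):
    """Mean distance under the best one-to-one matching of two center sets.

    Exhaustive search stands in for the Hungarian algorithm here; it returns
    the same optimum but scales factorially with the number of centers.
    """
    if len(centers_a) != len(centers_b):
        raise ValueError("this sketch assumes equally sized center sets")
    n = len(centers_a)
    if n == 0:
        return 0.0, ()
    best_cost, best_perm = math.inf, None
    for perm in permutations(range(n)):
        # perm[i] is the index in centers_b matched to centers_a[i].
        cost = sum(math.dist(centers_a[i], centers_b[j])
                   for i, j in enumerate(perm))
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    return best_cost / n, best_perm


if __name__ == "__main__":
    view_a = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
    # Same centers in a different order, each shifted by one unit in x.
    view_b = [(1.0, 10.0), (1.0, 0.0), (11.0, 0.0)]
    loss, assignment = matching_loss(view_a, view_b)
    print(loss, assignment)
```

Because the matching is chosen before the distance is measured, the loss does not depend on the order in which each view lists its centers, which is the property that makes an assignment-based loss suitable for comparing unordered sets.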
Keywords: Oriented Object Detection; Object Detection; Weakly Supervised; Deep Learning; Rotation Consistency; Image-level Label; Hungarian Loss; DOTA Dataset