边缘感知强化的高分辨遥感影像语义分割方法
Edge-perception Enhanced Segmentation Method for High-resolution Remote Sensing Image
2023, Pages: 1-13
Online publication date: 2023-11-16
DOI: 10.11834/jrs.20233098
于纯妍,李东霖,宋梅萍,于浩洋,Chein-I Chang.XXXX.边缘感知强化的高分辨遥感影像语义分割方法.遥感学报,XX(XX): 1-13
YU Chunyan,LI Donglin,SONG Meiping,YU Haoyang,Chang Chein-I. XXXX. Edge-perception Enhanced Segmentation Method for High-resolution Remote Sensing Image. National Remote Sensing Bulletin, XX(XX):1-13
Semantic segmentation methods for high-resolution remote sensing images based on deep convolutional neural networks (DCNN) have made remarkable progress, but the extraction and expression of edge features of segmented objects remain difficult, so the edge segmentation of occluded objects and small targets is poor, which in turn limits the overall accuracy of semantic segmentation. To address these problems, this paper proposes an edge-perception enhanced semantic segmentation method for high-resolution remote sensing images. First, a Transformer-DCNN collaborative mechanism is proposed to extract global self-attention features and spatial context information from remote sensing images and to refine a more accurate semantic feature representation of ground objects. Then, an edge-perception enhancement module jointly guided by edges and uncertain points is constructed to strengthen the model's edge information processing from the two perspectives of uncertain points and entity edges. Finally, the semantic segmentation decoder effectively uses the edge-aware feature encodings to improve the accuracy and continuity of edge prediction for segmented objects. Comparative experiments on the Vaihingen and Potsdam datasets show that the proposed model handles the edge information of complex ground objects effectively and improves segmentation accuracy, performing well on the commonly used mean intersection over union (mIoU) and other evaluation metrics. Compared with the classical UNet++ segmentation network, the proposed method improves mIoU by 4.57% on the Vaihingen dataset and by 5.01% on the Potsdam dataset, and the average F1 score and overall accuracy (OA) also improve to varying degrees.
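To make the pipeline described above concrete, the following is a minimal, self-contained PyTorch sketch of the three stages the abstract names: a hybrid CNN + Transformer encoder, an auxiliary edge branch, and a segmentation decoder that reuses the edge information. All module names, channel sizes, and layer counts are illustrative assumptions and do not reproduce the authors' actual architecture.

```python
# Minimal sketch of an edge-aware hybrid CNN + Transformer segmentation model.
# Sizes and module names are illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F


class HybridEncoder(nn.Module):
    """CNN stem for local spatial context + Transformer layers for global context."""

    def __init__(self, in_ch=3, dim=128, n_layers=2, n_heads=4):
        super().__init__()
        self.stem = nn.Sequential(                      # downsample 8x, local features
            nn.Conv2d(in_ch, 64, 3, stride=2, padding=1), nn.BatchNorm2d(64), nn.ReLU(),
            nn.Conv2d(64, 64, 3, stride=2, padding=1), nn.BatchNorm2d(64), nn.ReLU(),
            nn.Conv2d(64, dim, 3, stride=2, padding=1), nn.BatchNorm2d(dim), nn.ReLU(),
        )
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=n_heads,
                                           dim_feedforward=4 * dim, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=n_layers)

    def forward(self, x):
        feat = self.stem(x)                             # (B, C, H/8, W/8)
        b, c, h, w = feat.shape
        tokens = feat.flatten(2).transpose(1, 2)        # (B, H*W, C) token sequence
        tokens = self.transformer(tokens)               # global self-attention
        return tokens.transpose(1, 2).reshape(b, c, h, w)


class EdgeAwareSegNet(nn.Module):
    """Encoder -> edge branch + segmentation decoder that shares edge-aware features."""

    def __init__(self, n_classes=6, dim=128):
        super().__init__()
        self.encoder = HybridEncoder(dim=dim)
        self.edge_head = nn.Conv2d(dim, 1, 1)           # auxiliary binary edge map
        self.seg_head = nn.Sequential(
            nn.Conv2d(dim + 1, dim, 3, padding=1), nn.ReLU(),
            nn.Conv2d(dim, n_classes, 1),
        )

    def forward(self, x):
        feat = self.encoder(x)
        edge = self.edge_head(feat)                     # edge logits at 1/8 resolution
        seg = self.seg_head(torch.cat([feat, edge], 1)) # decoder reuses edge information
        seg = F.interpolate(seg, size=x.shape[-2:], mode="bilinear", align_corners=False)
        edge = F.interpolate(edge, size=x.shape[-2:], mode="bilinear", align_corners=False)
        return seg, edge


if __name__ == "__main__":
    model = EdgeAwareSegNet(n_classes=6)
    seg_logits, edge_logits = model(torch.randn(2, 3, 256, 256))
    print(seg_logits.shape, edge_logits.shape)          # (2, 6, 256, 256), (2, 1, 256, 256)
```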
Objective: Semantic segmentation methods for high-resolution remote sensing images based on deep convolutional neural networks (DCNN) have made remarkable progress, yet the extraction and expression of edge features of segmented objects remain difficult. As a result, the edge segmentation of occluded and small target objects is unsatisfactory, which lowers the overall accuracy of semantic segmentation. To solve these problems, an edge-perception enhanced semantic segmentation method for high-resolution remote sensing images is proposed in this paper. Method: First, we use a Transformer-DCNN collaborative feature extraction mechanism to extract the global self-attention features and spatial context information of remote sensing images; in this way, the proposed model makes full use of the Transformer's strength in extracting global context information and the DCNN's strength in extracting local spatial context information, and obtains a more accurate semantic feature representation of ground objects. A simple but effective feature fusion module is designed to fuse the features extracted by the DCNN and the Transformer.
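As an illustration of what such a feature fusion module could look like, here is a hedged sketch in which the DCNN-branch and Transformer-branch feature maps (assumed to share spatial size and channel count) are concatenated, projected, and re-weighted with a squeeze-and-excitation style gate. It is a generic design under these assumptions, not the paper's exact module.

```python
# Generic fusion of CNN and Transformer feature maps: concat -> 1x1 projection
# -> channel re-weighting. Channel counts are illustrative.
import torch
import torch.nn as nn


class FeatureFusion(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.project = nn.Sequential(                     # merge the two branches
            nn.Conv2d(2 * channels, channels, kernel_size=1),
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True),
        )
        self.gate = nn.Sequential(                        # SE-style channel re-weighting
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // 4, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels // 4, channels, 1), nn.Sigmoid(),
        )

    def forward(self, cnn_feat: torch.Tensor, trans_feat: torch.Tensor) -> torch.Tensor:
        fused = self.project(torch.cat([cnn_feat, trans_feat], dim=1))
        return fused * self.gate(fused)                   # emphasise informative channels


if __name__ == "__main__":
    fuse = FeatureFusion(channels=128)
    out = fuse(torch.randn(2, 128, 32, 32), torch.randn(2, 128, 32, 32))
    print(out.shape)                                      # torch.Size([2, 128, 32, 32])
```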
Next, we construct an edge-aware enhancement module composed of an edge-enhanced decoder and an uncertain-point-enhanced decoder, which strengthens the edge information processing ability of the remote sensing image semantic segmentation model from two perspectives: the uncertain-point view and the entity-edge view.
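One way to realize the uncertain-point perspective, in the spirit of PointRend (Kirillov et al., 2020) cited in the references, is to score each pixel by the margin between its two most probable classes and refine the least confident ones. The sketch below shows only this selection step; the number of points k and the scoring rule are illustrative assumptions.

```python
# Select the k most uncertain pixels from coarse segmentation logits,
# scoring uncertainty by the (negative) top-2 probability margin.
import torch


def select_uncertain_points(seg_logits: torch.Tensor, k: int):
    """seg_logits: (B, C, H, W). Returns flat indices of the k most uncertain pixels per image."""
    b, c, h, w = seg_logits.shape
    probs = seg_logits.softmax(dim=1)
    top2 = probs.topk(2, dim=1).values                  # (B, 2, H, W)
    uncertainty = -(top2[:, 0] - top2[:, 1])            # small margin -> high uncertainty
    flat = uncertainty.view(b, h * w)
    return flat.topk(k, dim=1).indices                  # (B, k) flat pixel indices


if __name__ == "__main__":
    logits = torch.randn(2, 6, 64, 64)
    idx = select_uncertain_points(logits, k=128)
    print(idx.shape)                                     # torch.Size([2, 128])
```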
Finally, the semantic segmentation decoder effectively employs the feature encodings that carry edge information to improve the accuracy and continuity of edge prediction for segmented objects, which ensures that the overall semantic segmentation quality of remote sensing images is improved.
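For the entity-edge perspective, a common recipe for training an edge-enhanced decoder is to derive a binary boundary target directly from the semantic label map and add an auxiliary edge loss to the segmentation loss. The sketch below illustrates this generic recipe; the loss weighting and boundary definition are assumptions, not necessarily the supervision used in the paper.

```python
# Derive a boundary target from a semantic label map and combine a segmentation
# loss with an auxiliary edge loss. Weighting and boundary definition are illustrative.
import torch
import torch.nn.functional as F


def edges_from_labels(labels: torch.Tensor) -> torch.Tensor:
    """labels: (B, H, W) integer class map -> (B, 1, H, W) binary boundary map."""
    lab = labels.unsqueeze(1).float()
    # A pixel lies on a boundary if any pixel in its 3x3 neighbourhood has a different class.
    local_max = F.max_pool2d(lab, kernel_size=3, stride=1, padding=1)
    local_min = -F.max_pool2d(-lab, kernel_size=3, stride=1, padding=1)
    return (local_max != local_min).float()


def joint_loss(seg_logits, edge_logits, labels, edge_weight: float = 1.0):
    """Cross-entropy on the segmentation head + BCE on the edge head."""
    seg_loss = F.cross_entropy(seg_logits, labels)
    edge_target = edges_from_labels(labels)
    edge_loss = F.binary_cross_entropy_with_logits(edge_logits, edge_target)
    return seg_loss + edge_weight * edge_loss


if __name__ == "__main__":
    labels = torch.randint(0, 6, (2, 64, 64))
    seg_logits = torch.randn(2, 6, 64, 64)
    edge_logits = torch.randn(2, 1, 64, 64)
    print(joint_loss(seg_logits, edge_logits, labels).item())
```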
Result: Comparative experiments were conducted on two public datasets, Vaihingen and Potsdam. Compared with the classical UNet++ network, the method proposed in this paper improves mIoU by 4.57% and 5.01% on the two datasets, respectively; the average F1 score and overall accuracy (OA) also show varying degrees of improvement. Furthermore, compared with the Transformer-based TransUNet model, our method also achieves superior results. Conclusion: Enhancing the feature extraction of edge information in remote sensing objects leads to significant improvements in both the edge and the overall semantic segmentation accuracy of high-resolution remote sensing images. The proposed edge-perception enhancement module improves the ability to process edge information from two perspectives, the uncertain-point view and the entity-edge view, which effectively enhances the edge segmentation accuracy of complex terrain objects. Results on commonly used evaluation metrics demonstrate the effectiveness and robustness of our model.
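For reference, the evaluation metrics reported above (mIoU, average F1, and overall accuracy OA) can be computed from a per-class confusion matrix as in the following sketch, which follows the standard definitions rather than the authors' evaluation code.

```python
# Standard-definition mIoU, mean F1, and OA from a confusion matrix.
import numpy as np


def confusion_matrix(pred: np.ndarray, gt: np.ndarray, n_classes: int) -> np.ndarray:
    """pred, gt: integer label arrays of the same shape."""
    idx = gt.astype(np.int64) * n_classes + pred.astype(np.int64)
    return np.bincount(idx.ravel(), minlength=n_classes ** 2).reshape(n_classes, n_classes)


def metrics(cm: np.ndarray):
    tp = np.diag(cm).astype(np.float64)
    fp = cm.sum(axis=0) - tp           # predicted as class c but labelled otherwise
    fn = cm.sum(axis=1) - tp           # labelled class c but predicted otherwise
    iou = tp / np.maximum(tp + fp + fn, 1)
    f1 = 2 * tp / np.maximum(2 * tp + fp + fn, 1)
    oa = tp.sum() / cm.sum()
    return iou.mean(), f1.mean(), oa   # mIoU, mean F1, OA


if __name__ == "__main__":
    gt = np.random.randint(0, 6, (256, 256))
    pred = np.random.randint(0, 6, (256, 256))
    miou, mf1, oa = metrics(confusion_matrix(pred, gt, 6))
    print(f"mIoU={miou:.4f}  mF1={mf1:.4f}  OA={oa:.4f}")
```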
Deep learning; remote sensing image; semantic segmentation; edge perception; Transformer; feature extraction; encoder; decoder
Bai H W, Cheng J, Huang X, Liu S Y and Deng C J. 2022. HCANet: A Hierarchical Context Aggregation Network for Semantic Segmentation of High-Resolution Remote Sensing Images. IEEE Geoscience and Remote Sensing Letters, 19: 1-5 [DOI: 10.1109/LGRS.2021.3063799]
Bokhovkin A and Burnaev E. 2019. Boundary Loss for Remote Sensing Imagery Semantic Segmentation. Advances in Neural Networks - ISNN 2019: 16th International Symposium on Neural Networks [DOI: 10.1007/978-3-030-22808-8_38]
Chen J N, Lu Y Y, Yu Q H, Luo X D, Adeli E, Wang Y, Lu L, Yuille A L and Zhou Y Y. 2021. TransUNet: transformers make strong encoders for medical image segmentation [DOI: 10.48550/arXiv.2102.04306]
Chen L C, Papandreou G, Kokkinos I, Murphy K and Yuille A L. 2016. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(4): 834-848 [DOI: 10.1109/TPAMI.2017.2699184]
Chen L C, Papandreou G, Schroff F and Adam H. 2017. Rethinking Atrous Convolution for Semantic Image Segmentation [DOI: 10.48550/arXiv.1706.05587]
Chen L C, Zhu Y K, Papandreou G, Schroff F and Adam H. 2018. Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation. European Conference on Computer Vision, ECCV (7) 2018: 833-851 [DOI: 10.1007/978-3-030-01234-2_49]
Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X H, Unterthiner T, Dehghani M, Minderer M, Heigold G, Gelly S, Uszkoreit J and Houlsby N. 2020. An image is worth 16x16 words: transformers for image recognition at scale. 9th International Conference on Learning Representations [DOI: 10.48550/arXiv.2010.11929]
Han K, Xiao A, Wu E H, Guo J Y, Xu C J and Wang Y H. 2021. Transformer in Transformer. Neural Information Processing Systems [DOI: 10.48550/arXiv.2103.00112]
He K M, Zhang X Y, Ren S Q and Sun J. 2016. Deep Residual Learning for Image Recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition, IEEE: 770-778 [DOI: 10.1109/CVPR.2016.90]
Holecz F, Barbieri M, Cantone A, Pasquali P and Monaco S. 2009. Synergetic use of multi-temporal ALOS PALSAR and ENVISAT ASAR data for topographic/land cover mapping and monitoring at national scale in Africa. IEEE International Geoscience & Remote Sensing Symposium, IEEE: II-5-II-8 [DOI: 10.1109/IGARSS.2009.5417985]
Kirillov A, Wu Y X, He K M and Girshick R. 2020. PointRend: image segmentation as rendering. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE: 9796-9805 [DOI: 10.1109/CVPR42600.2020.00982]
Li A J, Jiao L C, Zhu H, Li L L and Liu F. 2022. Multitask Semantic Boundary Awareness Network for Remote Sensing Image Segmentation. IEEE Transactions on Geoscience and Remote Sensing, 60: 1-14 [DOI: 10.1109/TGRS.2021.3050885]
Liu Z, Lin Y T, Cao Y, Hu H, Wei Y X, Zhang Z, Lin S and Guo B N. 2021. Swin Transformer: hierarchical vision transformer using shifted windows. 2021 IEEE/CVF International Conference on Computer Vision, IEEE: 9992-10002 [DOI: 10.1109/ICCV48922.2021.00986]
Luo X, Tong X H and Pan H Y. 2020. Integrating multiresolution and multitemporal Sentinel-2 imagery for land-cover mapping in the Xiong'an New Area, China. IEEE Transactions on Geoscience and Remote Sensing, 59(2): 1029-1040 [DOI: 10.1109/TGRS.2020.2999558]
Luo Z, Li M and Zhang D Z. 2022. Building detection based on a boundary-regulated network and watershed segmentation. National Remote Sensing Bulletin, 26(7): 1459-1468
罗壮, 李明, 张德朝. 2022. 结合边界约束网络和分水岭分割算法的建筑物提取. 遥感学报, 26(7): 1459-1468 [DOI: 10.11834/jrs.20219335]
Milletari F, Navab N and Ahmadi S A. 2016. V-Net: Fully Convolutional Neural Networks for Volumetric Medical Image Segmentation. 2016 Fourth International Conference on 3D Vision, IEEE: 565-571 [DOI: 10.1109/3DV.2016.79]
Min L, Gao K, Li W, Wang H, Li T, Wu Q and Jiao J C. 2020. Overview of optical remote sensing image segmentation technology. Spacecraft Recovery & Remote Sensing, 41(6): 13
闵蕾, 高昆, 李维, 王红, 李婷, 吴穹, 焦建超. 2020. 光学遥感图像分割技术综述. 航天返回与遥感, 41(6): 13 [DOI: 10.3969/j.issn.1009-8518.2020.06.001]
Ronneberger O, Fischer P and Brox T. 2015. U-Net: Convolutional Networks for Biomedical Image Segmentation. Medical Image Computing and Computer-Assisted Intervention - MICCAI 2015, vol 9351 [DOI: 10.1007/978-3-319-24574-4_28]
Rottensteiner F, Sohn G, Jung J, Gerke M, Baillard C, Benitez S and Breitkopf U. 2012. The ISPRS benchmark on urban object classification and 3D building reconstruction. ISPRS Annals of Photogrammetry, Remote Sensing and Spatial Information Sciences [DOI: 10.5194/isprsannals-I-3-293-2012]
Shelhamer E, Long J and Darrell T. 2017. Fully Convolutional Networks for Semantic Segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(4): 640-651 [DOI: 10.1109/TPAMI.2016.2572683]
Simonyan K and Zisserman A. 2015. Very deep convolutional networks for large-scale image recognition. Proceedings of the 3rd International Conference on Learning Representations [DOI: 10.48550/arXiv.1409.1556]
Srinivas A, Lin T Y, Parmar N, Shlens J, Abbeel P and Vaswani A. 2021. Bottleneck Transformers for Visual Recognition. 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE: 16514-16524 [DOI: 10.1109/CVPR46437.2021.01625]
Sun X, Shi A, Huang H and Mayer H. 2020. BAS4Net: Boundary-Aware Semi-Supervised Semantic Segmentation Network for Very High Resolution Remote Sensing Images. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 13: 5398-5413 [DOI: 10.1109/JSTARS.2020.3021098]
Sun X D, Xia M and Dai T F. 2022. Controllable Fused Semantic Segmentation with Adaptive Edge Loss for Remote Sensing Parsing. Remote Sensing, 14(1): 207 [DOI: 10.3390/rs14010207]
Takikawa T, Acuna D, Jampani V and Fidler S. 2019. Gated-SCNN: Gated Shape CNNs for Semantic Segmentation. 2019 IEEE/CVF International Conference on Computer Vision, IEEE: 5228-5237 [DOI: 10.1109/ICCV.2019.00533]
Tan M X and Le Q V. 2019. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. Proceedings of the 36th International Conference on Machine Learning, ICML 2019: 6105-6114 [DOI: 10.48550/arXiv.1905.11946]
Wang B B, Yang H S, Wang J F and Xu C H. 2015. Urban land use status information extraction based on high resolution remote sensing image. Mapping and Spatial Geographic Information, 38(4): 3
王冰冰, 杨鹤松, 王军锋, 徐成华. 2015. 基于高分遥感影像的城市用地现状信息提取. 测绘与空间地理信息, 38(4): 3 [DOI: 10.3969/j.issn.1672-5867.2015.04.021]
Wu Q Q, Wang S, Wang B and Wu Y L. 2022. Road extraction method of high-resolution remote sensing image on the basis of the spatial information perception semantic segmentation model. National Remote Sensing Bulletin, 26(9): 1872-1885
吴强强, 王帅, 王彪, 吴艳兰. 2022. 空间信息感知语义分割模型的高分辨率遥感影像道路提取. 遥感学报, 26(9): 1872-1885 [DOI: 10.11834/jrs.20210021]
Xie E Z, Wang W H, Yu Z D, Anandkumar A, Alvarez J M and Luo P. 2021. SegFormer: simple and efficient design for semantic segmentation with transformers. Neural Information Processing Systems [DOI: 10.48550/arXiv.2105.15203]
Zeng W X, Ma Y, Ding Y, Zhang S Q and Li W G. 2021. A survey of image semantic segmentation methods based on deep learning. Modern Computer, (21): 8
曾文献, 马月, 丁宇, 张淑青, 李伟光. 2021. 基于深度学习的图像语义分割方法研究综述. 现代计算机, (21): 8 [DOI: 10.3969/j.issn.1007-1423.2021.21.022]
Zhao H S, Shi J P, Qi X J, Wang X G and Jia J Y. 2017. Pyramid Scene Parsing Network. 2017 IEEE Conference on Computer Vision and Pattern Recognition, IEEE: 6230-6239 [DOI: 10.1109/CVPR.2017.660]
Zheng S X, Lu J C, Zhao H S, Zhu X T, Luo Z L, Wang Y B, Fu Y W, Feng J F, Xiang T, Torr P H S and Zhang L. 2021. Rethinking semantic segmentation from a sequence-to-sequence perspective with transformers. 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE: 6877-6886 [DOI: 10.1109/CVPR46437.2021.00681]
Zhou Z W, Siddiquee M M R, Tajbakhsh N and Liang J M. 2020. UNet++: redesigning skip connections to exploit multiscale features in image segmentation. IEEE Transactions on Medical Imaging, 39(6): 1856-1867 [DOI: 10.1109/TMI.2019.2959609]