Weakly supervised scale adaptation data augmentation for scene classification of high-resolution remote sensing images
Vol. 27, Issue 12, Pages: 2815-2830 (2023)
Published: 07 December 2023
DOI: 10.11834/jrs.20221481
Wang L M, Qi K L, Yang C and Wu H Y. 2023. Weakly supervised scale adaptation data augmentation for scene classification of high-resolution remote sensing images. National Remote Sensing Bulletin, 27(12): 2815-2830
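The same type of ground object can appear at different sizes in remote sensing images, and the fixed receptive field of convolution kernels seriously limits the performance of convolutional neural networks in remote sensing scene classification. To address this scale effect, this paper proposes a Weakly-supervised Scale Adaptation Data Augmentation Network (WSADAN) for scene classification of high-resolution remote sensing images, consisting mainly of a scale generation module and a scale fusion module. The scale generation module uses the high-level features that a convolutional neural network extracts from the original image to learn the optimal scale parameters for each sample instance. The scale fusion module fuses the high-level features of the original-scale image and the optimal-scale image, refines them to remove redundancy, and mines the correlations between features at different scales. Finally, the joint multi-scale feature representation is fed into a fully connected layer to predict the scene category. Three remote sensing scene classification datasets, RSSCN7, AID and NWPU, are used to verify the effectiveness of the method. The results show that the proposed network outperforms traditional convolutional neural networks, with the largest improvement for categories with large scale variations.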
Scene classification of remote sensing images aims to assign a meaningful label to a given image. In recent years, methods based on Convolutional Neural Networks (CNNs) have achieved a breakthrough and substantially outperform traditional methods in scene classification of remote sensing images. However, capturing features at different scales in remote sensing images is difficult because the receptive field of a CNN is fixed. This limitation seriously affects the performance of CNNs in scene classification of remote sensing images. This study proposes a method to learn the optimal scales for different scene image instances in a weakly supervised manner.
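The fixed receptive field mentioned above can be made concrete with the standard layer-by-layer recurrence. The sketch below is only an illustration: the layer list is the usual VGG16 convolutional configuration (thirteen 3×3 convolutions and five 2×2 max-pooling layers), which is an assumption here since the abstract does not spell it out, and `receptive_field` is a hypothetical helper name.

```python
def receptive_field(layers):
    """Return the receptive field, in input pixels, of a stack of layers.

    Each layer is a (kernel_size, stride) pair.
    """
    rf, jump = 1, 1
    for kernel, stride in layers:
        rf += (kernel - 1) * jump  # growth contributed by this layer
        jump *= stride             # spacing of adjacent outputs, in input pixels
    return rf


CONV = (3, 1)  # 3x3 convolution, stride 1
POOL = (2, 2)  # 2x2 max-pooling, stride 2

# Assumed VGG16 layout: groups of 2, 2, 3, 3, 3 convolutions, each followed by pooling.
VGG16_LAYERS = []
for n_convs in (2, 2, 3, 3, 3):
    VGG16_LAYERS += [CONV] * n_convs + [POOL]

print(receptive_field(VGG16_LAYERS))  # 212
```

The result depends only on the architecture, not on the image, so an object that is much larger or smaller than this window is seen at a mismatched scale unless the input itself is rescaled.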
A Weakly Supervised Scale Adaptive Data Augmentation Network (WSADAN) is proposed to capture feature information at different scales of remote sensing scenes, and a scale generation module and a scale fusion module are designed to improve the robustness. The scale generation module learns the optimal scale parameters from the CNN features of the original image. The scale fusion module filters the CNN features of the original-scale and optimal-scale images to remove noise and then deeply fuses them to exploit the correlation between features at different scales. The deeply fused multi-scale features are input into a fully connected layer to predict the categories of scene images.
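The data flow described above can be sketched in a few lines of plain Python. This is a toy under stated assumptions, not the authors' implementation: `toy_features` stands in for the CNN backbone, the sigmoid-bounded scale range (0.5 to 2.0), the element-wise gating used as the "filter", the concatenation used as the "fusion", and every weight value are hypothetical choices made only to show how the pieces connect. In the network itself these parameters would be learned from the scene labels alone, which is what makes the scale selection weakly supervised.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def toy_features(image):
    """Hypothetical stand-in for CNN high-level features: [mean, max, min]."""
    pixels = [p for row in image for p in row]
    return [sum(pixels) / len(pixels), max(pixels), min(pixels)]


def generate_scale(features, weights, bias, s_min=0.5, s_max=2.0):
    """Scale generation (assumed form): map features to a scale in (s_min, s_max)."""
    z = sum(w * f for w, f in zip(weights, features)) + bias
    return s_min + (s_max - s_min) * sigmoid(z)


def resize_nearest(image, scale):
    """Rescale a 2-D list image by `scale` with nearest-neighbour sampling."""
    h, w = len(image), len(image[0])
    new_h, new_w = max(1, round(h * scale)), max(1, round(w * scale))
    return [[image[min(h - 1, int(r / scale))][min(w - 1, int(c / scale))]
             for c in range(new_w)] for r in range(new_h)]


def fuse(f_orig, f_scaled, gate_weights):
    """Scale fusion (assumed form): gate each vector element-wise, then concatenate."""
    gated_orig = [f * sigmoid(g) for f, g in zip(f_orig, gate_weights)]
    gated_scaled = [f * sigmoid(g) for f, g in zip(f_scaled, gate_weights)]
    return gated_orig + gated_scaled


def classify(fused, fc_weights):
    """Fully connected layer followed by argmax over the class scores."""
    scores = [sum(w * x for w, x in zip(row, fused)) for row in fc_weights]
    return scores.index(max(scores))


image = [[0.0, 0.2, 0.4, 0.6],
         [0.1, 0.3, 0.5, 0.7],
         [0.2, 0.4, 0.6, 0.8],
         [0.3, 0.5, 0.7, 0.9]]

f_orig = toy_features(image)                                   # original-scale features
scale = generate_scale(f_orig, weights=[1.0, 0.5, -0.5], bias=0.0)
scaled_image = resize_nearest(image, scale)                    # instance-specific rescaling
f_scaled = toy_features(scaled_image)                          # optimal-scale features
fused = fuse(f_orig, f_scaled, gate_weights=[0.0, 1.0, -1.0])
label = classify(fused, fc_weights=[[1.0] * 6, [-1.0] * 6])
print(round(scale, 3), len(scaled_image), len(fused), label)
```

The point of the sketch is the ordering: the scale is predicted from the original image's features, the image is rescaled per instance, and both feature sets reach the classifier together.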
The effectiveness of the scale generation and scale fusion modules is verified by ablation experiments. Compared with the baseline, the accuracy of WSADAN_SGM improves by 0.94% and 0.89% at the 20% and 50% training ratios of the RSSCN7 dataset, 1.27% and 0.87% at the 20% and 50% training ratios of the AID dataset, and 1.09% and 0.71% at the 10% and 20% training ratios of the NWPU dataset, respectively. Compared with WSADAN_SGM, WSADAN_SGM+SFM improves by 1.65% and 1.32% on the RSSCN7 dataset at the 20% and 50% training ratios, 1.65% and 1.26% on the AID dataset at the 20% and 50% training ratios, and 1.75% and 1.42% on the NWPU dataset at the 10% and 20% training ratios, respectively. In the scene scale change analysis, the classification accuracy of our method is higher than that of the baseline at every image scale, which shows that our method learns image scale information and has strong scale adaptation ability. We use three remote sensing scene classification datasets, namely RSSCN7, AID, and NWPU, for the experiments. On the RSSCN7 dataset, the overall accuracies of WSADAN-VGG16 are 91.65% and 94.07% at the training ratios of 20% and 50%. For WSADAN-ResNet50, the corresponding accuracies are 92.69% and 94.82%. On the AID dataset, the overall accuracies of WSADAN-VGG16 are 92.78% and 95.18% at the training ratios of 20% and 50%. For WSADAN-ResNet50, the corresponding accuracies are 93.73% and 95.88%. On the NWPU dataset, the overall accuracies of WSADAN-VGG16 are 87.01% and 90.44% at the training ratios of 10% and 20%. For WSADAN-ResNet50, the corresponding accuracies are 90.71% and 92.63%.
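The two ablation steps can be combined to see the total gain over the baseline in each setting. The short sketch below does only that arithmetic on the figures reported above. It assumes the reported improvements are absolute percentage points measured on the same backbone, so that the two steps add; the abstract does not state this explicitly. The names `ABLATION_GAINS` and `total_gain` are hypothetical.

```python
# Accuracy gains copied from the ablation results above.
# Key: (dataset, training ratio).
# Value: (gain of the scale generation module over the baseline,
#         further gain from adding the scale fusion module).
ABLATION_GAINS = {
    ("RSSCN7", "20%"): (0.94, 1.65),
    ("RSSCN7", "50%"): (0.89, 1.32),
    ("AID", "20%"): (1.27, 1.65),
    ("AID", "50%"): (0.87, 1.26),
    ("NWPU", "10%"): (1.09, 1.75),
    ("NWPU", "20%"): (0.71, 1.42),
}


def total_gain(sgm_gain, sfm_gain):
    """Sum the two ablation steps (assumes additive percentage points)."""
    return round(sgm_gain + sfm_gain, 2)


TOTALS = {key: total_gain(*gains) for key, gains in ABLATION_GAINS.items()}

for (dataset, ratio), total in TOTALS.items():
    print(f"{dataset} @ {ratio}: +{total:.2f} points over the baseline")
```

Under that assumption the combined gain ranges from 2.13 to 2.92 points, and in every setting the fusion step contributes more than the scale generation step.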
The proposed method can learn CNN features over a wider range of scales without manual multi-scale selection for different datasets. It performs better than traditional CNNs, especially for scene categories that contain objects with large scale variations.
Keywords: remote sensing; scene classification; deep learning; convolutional neural networks; weak supervision; multi-scale; data augmentation