Abstract: With the introduction of artificial-intelligence technologies such as deep learning into the field of optical remote-sensing detection, various algorithms have emerged, and their use has gradually formed a new paradigm of data-driven optical remote-sensing image object detection. Consequently, high-quality remote-sensing data have become a prerequisite and a necessary resource for research on these paradigm algorithms, highlighting the increasing importance of remote-sensing data. To date, numerous optical remote-sensing image object detection datasets have been published by major research institutions at home and abroad. These datasets have laid the foundation for the development of deep learning-based remote-sensing image detection tasks. However, no comprehensive summary and analysis of the published optical remote-sensing image detection datasets has been conducted. Therefore, this paper provides a comprehensive review of the published datasets and an overview of algorithm applications, serving as a reference for subsequent research in related fields. This paper presents an overview and synthesis of the optical remote-sensing image object detection datasets published between 2008 and 2023, based on an extensive and comprehensive survey of the literature in the field. By reviewing and analyzing these datasets, we enable a comprehensive understanding of the progress and trends in optical remote-sensing image object detection dataset research. The datasets published from 2008 to 2023 are categorized by annotation method. A comprehensive description of 11 representative datasets is provided, and all dataset information is summarized in tabular form. The analysis considers the information in the datasets themselves as well as the spatial and spectral resolution of the images in the datasets.
Other basic information, including the number of categories, number of images, number of instances, and image width, is also considered. This analysis effectively demonstrates the trend toward high-quality, large-scale, and multi-category object-detection datasets for optical remote-sensing images. Additionally, we provide an overview of the development and application of algorithms related to the published datasets from different perspectives (e.g., horizontal bounding box object detection and rotated bounding box object detection), as well as a subdivision of detection directions (e.g., small object detection and fine-grained detection). Our findings confirm the influential role of remote-sensing data in driving algorithmic advances. In summary, we offer a comprehensive review of optical remote-sensing image object detection datasets from various perspectives. To the best of our knowledge, this is the first comprehensive review of such datasets in the field. This work serves as a valuable reference for subsequent research on deep learning-based optical remote-sensing image object detection, providing insights into data availability and research directions, and is expected to contribute to the advancement of this field by offering a solid foundation for further investigation and innovation.
Keywords: deep learning; optical remote sensing imagery; data source; object detection; development of datasets
Abstract: Military aircraft recognition in remote sensing images locates military aircraft and classifies them at a fine-grained level. It plays a vital role in reconnaissance and early warning, intelligence analysis, and other fields. However, the development of military aircraft recognition in remote sensing images has been relatively slow due to the lack of publicly available datasets. Therefore, constructing a high-quality, large-scale military aircraft recognition dataset is important. This study constructs a public remote sensing image military aircraft recognition dataset called MAR20 to promote research progress in this field. The dataset has the following characteristics: (1) MAR20 is currently the largest remote sensing image military aircraft recognition dataset, comprising 3842 images, 20 types, and 22341 instances; each instance is annotated with both a horizontal bounding box and an oriented bounding box. (2) Given that all fine-grained types belong to the aircraft category, different types of aircraft often share similar characteristics, resulting in high similarity between targets of different types. (3) Large intra-class differences exist between targets of the same type due to the influence of climate, season, illumination, occlusion, and even atmospheric scattering during remote sensing imaging. To establish a benchmark for military aircraft recognition in remote sensing images, this study evaluates seven commonly used horizontal object recognition methods, namely, Faster R-CNN, RetinaNet, ATSS, FCOS, Cascade R-CNN, TSD, and Double-Head, as well as eight oriented object recognition methods, namely, Faster R-CNN-O, RetinaNet-O, RoI Transformer, Gliding Vertex, Double-Head-O, Oriented R-CNN, FCOS-O, and S2A-Net, on the MAR20 dataset.
Through experimental comparisons on the tasks of horizontal object recognition and oriented object recognition, two-stage methods prove more effective for target recognition than one-stage methods. In this study, 3842 high-resolution remote sensing images were collected from 60 military airports around the world through Google Earth, and a large-scale, publicly available remote sensing image military aircraft recognition dataset, named MAR20, was established. For data annotation, MAR20 provides two annotation methods, horizontal bounding boxes and oriented bounding boxes, corresponding to the tasks of horizontal target recognition and oriented target recognition. We hope that the MAR20 dataset established in this study can promote research progress in this field. MAR20 can be downloaded at https://gcheng-nwpu.github.io/.
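MAR20 annotates every instance with both a horizontal and an oriented bounding box. As a rough illustration of how the two annotation styles relate, the sketch below converts an oriented box into its corner points, from which a horizontal box is the axis-aligned envelope. The (cx, cy, w, h, θ) parameterization and function name are our assumptions for illustration, not MAR20's actual annotation format.

```python
import numpy as np

def obb_to_corners(cx, cy, w, h, theta):
    """Convert an oriented bounding box (center, size, angle in radians)
    to its four corner points, counter-clockwise."""
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])          # 2-D rotation matrix
    half = np.array([[ w / 2,  h / 2],
                     [-w / 2,  h / 2],
                     [-w / 2, -h / 2],
                     [ w / 2, -h / 2]])      # corners around the origin
    return half @ R.T + np.array([cx, cy])   # rotate, then translate

def hbb_from_corners(corners):
    """Horizontal (axis-aligned) envelope of a corner set: (xmin, ymin, xmax, ymax)."""
    return (*corners.min(axis=0), *corners.max(axis=0))
```

For θ = 0 the two annotations coincide; for any other angle the horizontal box is strictly larger, which is why oriented annotations are preferred for densely packed, arbitrarily rotated aircraft.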
Abstract: With the ongoing development of artificial intelligence technology, deep learning methods have come to play an increasingly important role in the field of ship detection. However, the false alarms and missed detections produced by deep learning algorithms hinder the application of this technology to ship detection. Although classical deep learning methods can effectively handle a single-background sea surface, they easily yield false alarms on shore when faced with data under complex backgrounds. During training, a model often tends to overemphasize certain salient features, which leads to feature overfitting; detections can easily be missed when these salient features change. In the forward propagation of the model, different network layers generate corresponding mappings, or feature maps, from the input. Fully utilizing the semantic and spatial information of these feature maps is an effective way to reduce false alarms and missed detections. Compared with traditional models, our proposed Feature Map Reinforcement Network (FMRNet) can fully utilize feature maps to generate adaptive feature map masks and water-land segmentation masks. This method ultimately reduces false alarms and missed detections by avoiding feature overfitting and weakening the effects of complex backgrounds. In FMRNet, we design the Self Feature-map Mask Module (SFMM), which selectively utilizes the feature map through an attention mechanism to generate an adaptive mask. The mask prevents the model from focusing on a single feature point, thereby preventing feature overfitting. We also propose a Feature-map Sea-Land Segmentation Module (FSSM) that runs in parallel with SFMM. It reduces false alarms for ship targets appearing in land areas by fusing the water-land segmentation mask with the feature map.
Experimental comparisons with SOTA algorithms on publicly available datasets show that the proposed method performs excellently and outperforms the other SOTA algorithms. After FMRNet is added, the 10-fold average mAP of the detection algorithm RoI Transformer improves substantially, raising the baseline mean mAP from 86.1% to 90.8%, which surpasses the other SOTA algorithms. Benefiting from the adaptive mask, the model with the SFMM module reaches an mAP of 90.4%, a 4.2% improvement over the baseline. Owing to the prior knowledge learned from the water-land distribution, FSSM improves the precision and recall of the model, resulting in an mAP of 86.4%. For the task of ship detection, we propose a novel backbone network, FMRNet, based on ResNet. Our SFMM module enables the model to discern the target from multiple features, avoiding overfitting to salient features. We design the FSSM module to reduce the false alarms caused by complex backgrounds: suppressing non-water areas lowers the confidence of targets appearing there. FSSM thus removes unreasonable false alarms while improving the accuracy of the model.
Abstract: Nowadays, object detection methods based on deep learning are widely used in the interpretation of remote sensing images. Anchor-based methods usually need to design anchor boxes first, which requires more detection steps and time. This study proposes an object detection method for remote sensing images based on an improved CenterNet, which simplifies the object detection process and improves efficiency. CenterNet uses a fully convolutional network to directly predict a heat map of object center points, the widths and heights of the corresponding objects, and the position offsets of the center points. The heat maps generate the rough positions of the objects, and the offsets fine-tune these positions to make them more accurate. The widths and heights then define the shape of the object boxes, and the different heat maps determine the object categories. On the basis of CenterNet, the proposed method first adopts ResNet with transposed convolution as the backbone network: the transposed convolution expands the output feature maps, and ResNet reduces the number of backbone parameters compared with the Hourglass network. Second, the proposed method defines the length of the Gaussian kernel under three limit conditions between the predicted and real boxes in CenterNet. The Gaussian kernel is applied to generate the heat map label used for network training. Finally, a multi-head attention mechanism is introduced into the backbone network to learn the importance of each element in the feature maps. The weights assigned to the elements reflect their effectiveness, concentrating the effective features in the regions of the object key points as much as possible. The experiments use mean Average Precision (mAP) to evaluate the object detection results over multiple categories, and all experiments are conducted on the DIOR dataset.
The results show that CenterNet using ResNet with transposed convolution achieves an mAP 1.4% higher than that using Hourglass. The proposed calculation of the Gaussian kernel length increases the mAP by 1.1%, and the addition of the attention mechanism further improves the mAP by 1.5%. At the same time, the proposed method reduces the time cost by 31.9% compared with the conventional method. The experimental results show that the proposed method improves detection accuracy without sacrificing detection speed. The ablation experiments also show that ResNet with transposed convolution, the designed calculation of the Gaussian kernel length, and the attention mechanism each effectively improve the mAP. The comparison with other methods further proves that the proposed method is practical.
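The heat-map labels described above are produced by splatting a Gaussian kernel at each object center. The sketch below shows this standard CenterNet-style label generation; the paper's specific kernel-length rule under its three limit conditions is not given in the abstract, so the commonly used radius-to-sigma convention is an assumption here.

```python
import numpy as np

def gaussian_heatmap(shape, center, radius):
    """Splat a 2-D Gaussian peak onto a heat-map label, CenterNet style:
    the value is 1 at the object center and decays with distance."""
    H, W = shape
    cx, cy = center
    sigma = (2 * radius + 1) / 6.0           # common CenterNet convention (assumed)
    y, x = np.ogrid[:H, :W]                  # broadcastable row/column grids
    g = np.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))
    g[g < np.finfo(g.dtype).eps * g.max()] = 0   # zero out negligible tail
    return g
```

During training, a per-category channel collects one such peak per ground-truth object (taking the element-wise maximum where peaks overlap), and the network regresses toward these soft targets instead of hard one-hot center labels.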
Abstract: Oriented object detection in remote sensing images is an exceptionally challenging task that has elicited widespread attention. With the rapid advancement of deep learning, neural networks based on convolutional neural networks and self-attention networks (e.g., Transformers) have achieved remarkable progress in oriented object detection. However, attention to boundary and salient feature information in oriented objects in remote sensing images is still lacking. Specifically, extracting boundary information for objects with varying orientations is difficult, and the global dependency of salient features is sparse. To address these issues, we propose an oriented object detection method for remote sensing images based on feature reassembly and self-attention. The method consists of a regression branch that incorporates spatial channel reassembly and a self-attention classification branch. The regression branch reassembles spatial information along the channel dimension and emphasizes boundary-sensitive information to achieve accurate localization of bounding boxes. The classification branch leverages self-attention with positional information to capture fundamentally discriminative object features, thus enhancing global feature dependencies for precise classification. Extensive experiments demonstrate the effectiveness and robustness of the proposed model and showcase its excellent performance on publicly available datasets, such as DOTA, HRSC2016, and SODA-A.
Abstract: Oriented object detection is a basic task in the interpretation of high-resolution remote sensing images. Compared with general detectors, oriented detectors can locate instances with oriented bounding boxes, which are consistent with the arbitrarily oriented ground truths in remote sensing images. Oriented object detection has progressed greatly with the development of convolutional neural networks, but the task remains challenging because of extreme variation in object scales and arbitrary orientations. Most oriented detectors have evolved from horizontal detectors: they first generate horizontal proposals using the Region Proposal Network (RPN), then classify these proposals into different categories and transform them into oriented bounding boxes. Despite their success, these detectors exploit only the annotations at the end of the network and do not fully utilize the angle and semantic information. This work proposes an Angle-based Region Proposal Network (ARPN), which learns the angles of objects and generates oriented proposals. The structure of ARPN is the same as that of RPN; however, for each proposal, instead of outputting four regression parameters, ARPN generates five: the center (x, y), shape (w, h), and angle (t). In training, we first assign anchors to ground truths by Intersection over Union and then directly supervise ARPN with the shape and angle information of the ground truths. We also propose a semantic branch that outputs image semantic results to exploit the semantic information. The semantic branch consists of two convolutional layers and runs in parallel with the detection head. We first assign objects to different scale levels according to their areas, then create semantic labels at each scale and use them to supervise the semantic branch. With semantic supervision, the model learns translation-variant features and improves accuracy.
Moreover, the outputs of the semantic branch indicate the objectness at each location, which can filter out false positives from the final predictions. We conduct comprehensive experiments on the DOTA dataset to validate the effectiveness of the proposed methods. In data preparation, we first crop the original images into 1024×1024 patches with a stride of 824. Compared with the baseline, ARPN achieves a 2.2% increase in mAP, while the semantic branch contributes an additional 0.8% improvement. Combining both methods, we achieve 74.64% mAP, which is competitive with other oriented object detectors. We visualize some results on the DOTA dataset; they show that our method is highly effective for small objects and densely packed objects. We proposed ARPN and the semantic branch to utilize the rich information in remote sensing images. ARPN directly generates oriented proposals, leading to better recall of oriented objects, and the semantic branch increases the translation-variant property of the features. Experiments demonstrate the effectiveness of our method, which achieves 74.64% mAP on the DOTA dataset. In future work, we will focus on model efficiency and inference speed.
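The data-preparation step above crops each large DOTA image into overlapping 1024×1024 patches with a stride of 824. A minimal sketch of how such patch origins can be enumerated along one image axis; the edge-handling rule (shifting the last patch back so it stays inside the image) is our assumption, not necessarily the paper's exact scheme.

```python
def patch_origins(size, patch=1024, stride=824):
    """Top-left origins of overlapping patches covering an image side of
    length `size`; the last patch is shifted back to stay inside the image."""
    xs = list(range(0, max(size - patch, 0) + 1, stride))
    if xs[-1] + patch < size:                # still uncovered pixels at the edge
        xs.append(size - patch)              # add one final, partially overlapping patch
    return xs
```

With patch 1024 and stride 824, consecutive patches overlap by 200 pixels, so objects cut by one patch boundary usually appear intact in a neighboring patch; full-image results are then merged across patches (e.g., by non-maximum suppression).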
Abstract: Sea fog is a common weather phenomenon at sea. It reduces visibility and greatly threatens maritime traffic and other operations. Traditional satellite remote sensing algorithms for sea fog detection have low accuracy, poor portability, and low automation. Although some existing deep learning-based sea fog monitoring algorithms have improved on these, they do not consider the spectral characteristics of sea fog in different channels, and their accuracy remains low, especially in edge recognition. A daytime sea fog detection method based on multi-scale feature fusion in a generative adversarial network with an attention mechanism is proposed to improve the accuracy of sea fog detection. First, according to the spectral response of sea fog in the different imaging channels of the meteorological satellite, the satellite cloud images of the channels that reflect the characteristics of sea fog are selected as the network input. Meanwhile, a channel attention mechanism is introduced to calculate the weights of the different input channels, prioritizing significant imaging channels within the multichannel input. Then, a multi-scale feature fusion mechanism fuses the feature maps of different network levels to obtain the multi-scale features of sea fog. In this way, the loss of detailed features in cloud images caused by the pooling operations of traditional deep networks can be avoided. Finally, given the difficulty traditional methods have in accurately delineating the edge of sea fog, a generation network for sea fog detection supervised by an adversarial network is used to accurately define the fog edge and reduce the false alarm rate. This study takes the Yellow Sea and the Bohai Sea (116.5°—128.25°E, 30°—42.5°N) as the research area.
Given that March to June is the period of high incidence of sea fog in the Yellow Sea and the Bohai Sea, we produce a dataset based on the weather satellite monitoring reports of the National Meteorological Center from March to June of 2017—2020. After training the model, our method achieves a probability of detection of 90.5%, a critical success index of 81.28%, and a false positive rate of 10.86%, which are better than those of other methods. The experimental results show that the proposed method effectively improves the accuracy of sea fog identification, which is important for marine vessel navigation, fishery production, national defense, and military affairs.
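The reported probability of detection and critical success index are the standard categorical verification scores computed from pixel-wise hit, miss, and false-alarm counts. A minimal sketch of these scores follows; note that the exact definition of the paper's "false positive rate" is not given in the abstract, so the commonly used false alarm ratio is an assumption here.

```python
def verification_scores(hits, misses, false_alarms):
    """Categorical verification scores common in sea fog detection:
    POD  = hits / (hits + misses)                  -- probability of detection
    CSI  = hits / (hits + misses + false_alarms)   -- critical success index
    FAR  = false_alarms / (hits + false_alarms)    -- false alarm ratio (assumed)"""
    pod = hits / (hits + misses)
    csi = hits / (hits + misses + false_alarms)
    far = false_alarms / (hits + false_alarms)
    return pod, csi, far
```

Because CSI penalizes both misses and false alarms, it is the most demanding of the three and is commonly used as the headline score for fog masks.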
Keywords: sea fog monitoring; satellite remote sensing; attention mechanism; generative adversarial network; multi-scale feature fusion
Abstract: Hyperspectral anomaly detection identifies pixels with significant spectral contrast to their surroundings. It plays a valuable role in military and civilian fields because no prior spectral information is required. Existing local contrast-based methods usually adopt a dual rectangular window scheme for hyperspectral anomaly detection; however, they set the size of the dual window empirically, which limits their generalization capability. To address this issue, this study proposes a hyperspectral anomaly detection method that combines adaptive window saliency detection with improved superpixel segmentation. An adversarial autoencoder is first introduced to reduce the dimension of the hyperspectral image, decreasing the computational complexity of the proposed method. Second, the dimension-reduced hyperspectral image is segmented by improved superpixel segmentation. The spectral distance measurements used in existing superpixel segmentation are effective when the relationship between the spectral value and the intensity of each pixel is linear, but this condition cannot be guaranteed in practical applications. The improved superpixel segmentation therefore adopts the orthogonal projection divergence to measure spectral distance. Thereafter, an adaptive window-based saliency detection algorithm is proposed and used to obtain the initial detection results. Specifically, the size of the inner window is adaptively determined by the superpixels, which ensures that the pixels belonging to the same inner window are homogeneous; the outer window is obtained by enlarging the inner window by a fixed size.
Finally, the domain transform recursive filter and a thresholding operation are employed to optimize the initial detection results and reduce the false alarm rate. Comparisons in terms of AUC between the orthogonal projection divergence and three common spectral distance measurements (Euclidean distance, spectral angle mapping, and spectral information divergence) show that the orthogonal projection divergence-based method achieves the highest score on all five datasets. Comparisons between the adaptive window and the traditional manually set dual window likewise show that the adaptive window-based method achieves the highest AUC on all five datasets. Comprehensive comparisons between the proposed method and seven state-of-the-art methods on five public datasets validate its overall performance. Specifically, the subjective comparisons show that the anomalous pixels detected by the proposed method are more precise and have stronger contrast to background regions; the objective comparisons demonstrate that the proposed method obtains the highest overall detection accuracy and the best separability between anomalous and background pixels. Three conclusions can be drawn from this study. First, the improved superpixel segmentation algorithm enhances the segmentation results, and the proposed adaptive window scheme increases the performance of saliency detection. Second, the proposed method offers excellent detection accuracy, a low false alarm rate, and good separability between anomalous and background pixels. Finally, the overall performance of the proposed method is superior to that of state-of-the-art methods.
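The orthogonal projection divergence compared above measures, for two spectra, the part of each vector that cannot be explained by projection onto the other, so collinear spectra (same shape, different brightness) have zero divergence, unlike Euclidean distance. The sketch below follows the standard definition from the hyperspectral literature; since the abstract does not give the formula, this exact form is an assumption.

```python
import numpy as np

def opd(x, y):
    """Orthogonal projection divergence between two spectra:
    sqrt(||P_y_perp x||^2 + ||P_x_perp y||^2), where P_v_perp is the
    projector onto the orthogonal complement of v. Zero iff x and y
    are collinear."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    rx = x - y * (x @ y) / (y @ y)   # residual of x after projecting onto y
    ry = y - x * (x @ y) / (x @ x)   # residual of y after projecting onto x
    return np.sqrt(rx @ rx + ry @ ry)
```

This shape-sensitivity is what makes the measure robust to the nonlinear intensity effects the abstract mentions: two pixels of the same material under different illumination remain close under OPD even when their Euclidean distance is large.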
Abstract: Combining remote sensing imagery with deep learning is currently a growing trend in individual tree crown detection, and RGB images are the most commonly used data type. However, given that the color and texture of tree crowns are generally similar, distinguishing the crowns of different individuals using only the color and texture information of RGB images is difficult in areas with high crown density. In this study, elevation information is superimposed to improve the accuracy of individual tree crown detection with RGB images. In the experiment, RGB images (color images) and the DSM (digital surface model) were used as data sources, and band combination and a double-source detection network model were used to combine RGB and DSM for individual tree crown detection. In the former method, band combinations of RGB and DSM were generated as GBD, RGD, and RBD images, and the three kinds of images were used for network training and testing. In the latter method, RGB and DSM were input into the double-source detection network model to obtain the detection results. FPN-Faster-R-CNN and Yolov3 were used for the experiments. Compared with the RGB scheme as the control (which uses only the color and texture information of ground objects), the average accuracy of FPN-Faster-R-CNN in the GBD, RBD, and double-source detection network schemes increased by 3.36%, 2.45%, and 7.77%, respectively, and decreased by 0.17% in the RGD scheme. The average accuracy of Yolov3 in the GBD, RBD, and double-source detection network schemes increased by 0.72%, 0.14%, and 5.71%, respectively, and decreased by 0.98% in the RGD scheme. Under both networks, the double-source detection network scheme achieved the best detection result.
Compared with the RGB scheme, the improvement in average accuracy of the double-source detection network scheme rose with increasing forest density. Comparative analysis of the experimental results shows that the proper combination and utilization of the color, texture, and elevation information of ground objects helps improve performance in deep learning-based urban individual tree crown detection.
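The band-combination scheme above substitutes the single-band DSM for one of the three RGB bands so that a standard three-channel detector can ingest elevation. A minimal numpy sketch of forming the GBD, RGD, and RBD inputs; the channel ordering and the assumption that the DSM has already been resampled to the RGB grid are ours for illustration.

```python
import numpy as np

def band_combinations(rgb, dsm):
    """Form the three-band GBD, RGD, and RBD inputs by replacing one RGB
    band with the DSM. `rgb` is (H, W, 3); `dsm` is (H, W), co-registered."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    gbd = np.stack([g, b, dsm], axis=-1)   # drop red, append elevation
    rgd = np.stack([r, g, dsm], axis=-1)   # drop blue
    rbd = np.stack([r, b, dsm], axis=-1)   # drop green
    return gbd, rgd, rbd
```

Because each combination discards one color band, the experiments above effectively test which color band is most expendable once elevation is available; the double-source network avoids this trade-off by keeping all four inputs.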
Keywords: remote sensing; individual tree crown detection; deep learning; urban; elevation; color image; UAV
Abstract: Granular computing, with data granulation as its basis, is a frontier direction in the field of big data processing; it simulates human thinking to solve large-scale complex problems and helps improve the accuracy and efficiency of pattern mining and knowledge discovery by means of structure and association. Therefore, incorporating this data analysis method into the process of mining information and discovering knowledge from remote sensing big data deserves consideration. To better implement intelligent processing and interpretation of multi-source, multimodal remote sensing big data and obtain spatiotemporal information that can serve precise applications, this study draws on the data-processing thinking of granular computing and builds a research path that follows the evolution from visual understanding of the external scene to a relationship-based perspective on the internal generation mechanism (spectrum analysis). The paper analyzes the granular structure of remote sensing big data and its multi-level, multi-granularity characteristics along the three dimensions of space, time, and attribute, and determines the corresponding granulation strategy based on the characteristics of remote sensing data. In addition, we build a methodology of remote sensing granular computing based on geo-parcels, which integrates the basic models of zonal-stratified perception, spatiotemporal collaborative inversion, and multi-granularity decision making. These models integrate geographical analysis methods, remote sensing mechanism models, and artificial intelligence algorithms, and mine geographic information or knowledge including the morphology, type, index, state, development trend, and mechanism of land geo-parcels. This study focuses on practical research guided by the application needs of precision agriculture.
The case study shows that granular computing meets the requirements of intelligent computing on remote sensing big data from multiple perspectives. It verifies that the theory and method proposed in this study can systematically deconstruct and methodically address the multi-level complex problems of agricultural remote sensing, and it demonstrates their potential to support precise domain applications. This study develops a methodology of remote sensing intelligent computing under the guidance of granular computing and analyzes the corresponding problems and solutions in the aspects of space, time, and attribute. Based on this work, we are confident that the proposed methodology of granular computing-based intelligent interpretation of remote sensing can effectively resolve complex surface cognitive problems in Earth observation.
Abstract: Timely and accurate global crop mapping is important for global food security assessment. However, existing crop classification models are often targeted at specific regions, and their performance in other regions has not been fully evaluated. This study determined the critical periods of crop growth in different regions to realize the effective transfer of models across large-scale regions, and the remote sensing data during these critical periods were filtered such that the same crop in different regions showed similar characteristics in the imagery, helping the model achieve a better transfer effect. The MultiResUNet, SegNet, DeepLab V3+, and U-Net models were trained using data from Northeast China, and the optimal F1 value for summer corn recognition in the study areas of North China reached more than 0.97. This research also analyzed the factors that affect the generalization ability of the models. The issues addressed include the following: (1) using existing crop distribution data products as ground truth samples for model training to address the lack of training samples for deep learning models; we compared the applicability of models trained on the US Cropland Data Layer and on Northeast China crop distribution data products in North China; (2) comparing the generalization performance of deep models with different architectures; (3) comparing the influence of different data types on model generalization; and (4) comparing the impact of crop phenology changes on model generalization. Results show that MultiResUNet has better generalization performance than the other networks when the plot size in the training and test areas varies significantly. However, the generalization ability of MultiResUNet alone still cannot completely overcome the adverse effect of changes in plot spatial morphology on model migration.
The crop distribution data products of Northeast China, which are more similar to the agricultural landscape of North China, need to be used for deep learning model training to obtain more accurate maize distribution information in North China. Compared with TOA data, SR data were found to be more conducive to the spatial migration of the model at the transcontinental scale; therefore, SR data should be given priority in large-scale crop mapping. This research provides a useful reference for large-scale crop mapping using only local samples.
Abstract: Scene classification of remote sensing images aims to assign a meaningful label to a given image. In recent years, Convolutional Neural Network (CNN)-based methods have made breakthroughs and substantially outperformed traditional methods in scene classification of remote sensing images. However, obtaining features at different scales in remote sensing images is difficult due to the fixed receptive field of CNNs, which seriously affects their performance in scene classification. This study proposes a method that learns the optimal scales for different scene image instances in a weakly supervised manner. A Weakly Supervised Scale Adaptive Data Augmentation Network (WSADAN) is proposed to capture feature information at different scales of remote sensing scenes, with a scale generation module and a scale fusion module designed to improve robustness. The scale generation module learns the optimal scale parameters from the CNN features of the original image. The scale fusion module filters the CNN features of images at the original and optimal scales to remove noise and then deeply fuses them to exploit the correlation between features at different scales. The deeply fused multi-scale features are input into a fully connected layer to predict the categories of scene images. The effectiveness of the scale generation and scale fusion modules is verified by ablation experiments. Compared with the baseline, the accuracy of WSADAN-SGM improves by 0.94% and 0.89% for the 20% and 50% training ratios of the RSSCN7 dataset, 1.27% and 0.87% for the 20% and 50% training ratios of the AID dataset, and 1.09% and 0.71% for the 10% and 20% training ratios of the NWPU dataset, respectively.
Compared with WSADAN-SGM, WSADAN-SGM+SFM improves accuracy by 1.65% and 1.32% on the RSSCN7 dataset at 20% and 50% training ratios, by 1.65% and 1.26% on the AID dataset at 20% and 50% training ratios, and by 1.75% and 1.42% on the NWPU dataset at 10% and 20% training ratios. In the scene scale change analysis, the classification accuracy of our method exceeds the baseline at every image scale, which shows that the method learns image scale information and has strong scale adaptation ability. Three remote sensing scene classification datasets, RSSCN7, AID, and NWPU, are used in the experiments. On the RSSCN7 dataset, the overall accuracies of WSADAN-VGG16 are 91.65% and 94.07% at training ratios of 20% and 50%; for WSADAN-ResNet50, the corresponding accuracies are 92.69% and 94.82%. On the AID dataset, the overall accuracies of WSADAN-VGG16 are 92.78% and 95.18% at training ratios of 20% and 50%; for WSADAN-ResNet50, the corresponding accuracies are 93.73% and 95.88%. On the NWPU dataset, the overall accuracies of WSADAN-VGG16 are 87.01% and 90.44% at training ratios of 10% and 20%; for WSADAN-ResNet50, the corresponding accuracies are 90.71% and 92.63%. The proposed method learns CNN features over a wider range of scales without manual multi-scale selection for different datasets. Its performance surpasses that of conventional CNNs, especially for scene categories containing objects with large scale variations.
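The core idea of the abstract above, rescaling an image to a learned optimal scale and fusing features from both scales, can be illustrated with a minimal sketch. This is not the WSADAN implementation: the nearest-neighbor resize, global-mean "feature", and averaging fusion are simplified stand-ins for the learnable scale generation and scale fusion modules, and all function names are hypothetical.

```python
import numpy as np

def resize_nearest(img, scale):
    """Nearest-neighbor resize; stands in for rescaling to the learned optimal scale."""
    h, w = img.shape
    nh, nw = max(1, int(round(h * scale))), max(1, int(round(w * scale)))
    rows = (np.arange(nh) * h / nh).astype(int)
    cols = (np.arange(nw) * w / nw).astype(int)
    return img[rows][:, cols]

def global_mean_feature(img):
    """Toy 'CNN feature': global mean intensity as a 1-D descriptor."""
    return np.array([img.mean()])

def fuse_features(feat_a, feat_b):
    """Average two feature vectors from different scales; the paper's scale
    fusion module learns this combination, averaging is a simple stand-in."""
    return 0.5 * (feat_a + feat_b)

img = np.arange(16.0).reshape(4, 4)
optimal_scale = 2.0  # in WSADAN this would be predicted by the scale generation module
rescaled = resize_nearest(img, optimal_scale)
fused = fuse_features(global_mean_feature(img), global_mean_feature(rescaled))
```

The fused descriptor would then feed a classifier, as the fully connected layer does in the paper.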
Abstract: In recent years, deep learning-based dehazing methods have achieved remarkable results in image dehazing. However, most dehazing methods based on U-shaped networks directly transfer the features of the encoding layers to the corresponding decoding layers, lacking information interaction between low- and high-level features. Meanwhile, network models designed on the U-shaped structure may destroy detailed information that is important for the restored image during downsampling; as a result, the restored image lacks detailed texture and structure information. In addition, dehazing methods based on non-U-shaped networks have limited receptive fields, which hinders their ability to exploit contextual information effectively, so they cannot achieve ideal dehazing results on remote sensing images with large scene scale changes. Therefore, this study proposes a two-branch remote sensing image dehazing network based on hierarchical feature interaction and an enhanced receptive field. The network comprises a hierarchical feature interaction subnet and a multi-scale information extraction subnet. The hierarchical feature interaction subnet uses a hierarchical feature interaction fusion module to introduce semantic information into low-level features and spatial details into high-level features layer by layer, enhancing the information interaction between features at different levels in the encoding layers. The multi-scale information extraction subnet uses a multi-scale residual dilated convolution module to fuse features of different receptive fields and obtain the contextual information that is crucial for remote sensing image dehazing. Experiments on two public datasets show that the proposed dehazing method achieves the best evaluation compared with nine existing state-of-the-art dehazing algorithms.
On the three sub-test sets of the public remote sensing dataset Haze1k, the PSNR of the proposed method reaches 27.362, 28.171, and 25.137 dB; on the two sub-test sets of the public remote sensing dataset RICE, it reaches 37.79 and 35.367 dB. In addition, the proposed method is closest to the ground truth in subjective visual qualities such as color, saturation, and sharpness while still achieving the dehazing effect. The following conclusions can be drawn: (1) Through the proposed hierarchical feature interaction fusion module, the deep semantic information of the encoding stage is gradually and interactively fused with the shallow detailed texture information, which enhances the expressive ability of the network and restores clearer, higher-quality images. (2) Through the multi-scale residual dilated convolution module, the proposed dehazing network enlarges its receptive field without changing the size of the feature map and fuses contextual information at different scales. (3) On the two public remote sensing image dehazing datasets Haze1k and RICE, the proposed method outperforms nine recently proposed dehazing algorithms in both objective evaluation indexes and subjective visual effects.
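The PSNR values quoted above follow the standard definition, 10·log10(MAX² / MSE) in decibels. A minimal sketch of that metric on flattened pixel lists (the example values are illustrative, not from the paper):

```python
import math

def psnr(reference, restored, max_val=255.0):
    """Peak Signal-to-Noise Ratio in dB: 10 * log10(MAX^2 / MSE)."""
    mse = sum((r - s) ** 2 for r, s in zip(reference, restored)) / len(reference)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(max_val ** 2 / mse)

# Two toy "images" flattened to pixel lists; small errors give a high PSNR.
value = psnr([0.0, 255.0, 128.0], [5.0, 250.0, 128.0])
```

Higher PSNR means the restored image is numerically closer to the haze-free ground truth, which is why it is the headline index in the comparisons above.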
Abstract: Cultivated land cover, an important technical index reflecting the dynamics of human activities and the degree of land resource utilization, has been widely used in food security assessment and land management decision making. Existing information extraction methods ignore the differential characteristics of parcels and the rich information in edge details, which yields fragmented extraction results with fuzzy boundaries. Therefore, an improved model coupling a semantic segmentation network with edge enhancement is proposed to address the insufficient fitting of cultivated land edges and to fully utilize the rich semantic features and edge information in remote sensing images. An edge loss is designed accordingly to further improve training accuracy and model performance. We design an edge branch subnetwork formed by a CoT unit, gated convolution, and the SCSE attention mechanism to make edge and depth features complementary. We construct a joint edge enhancement loss function with constraints, called BE-Loss, to strengthen the model's attention to boundary information. On this basis, we build a cultivated land information extraction model, BECU-Net, by combining the EfficientNet backbone with a U-shaped framework. In the multi-feature input layer of this model, index and texture features of the preprocessed data are pre-extracted and the input structure is adjusted, improving the feature expression ability of the network. The extraction accuracy of cultivated land is 94.13%, and the F1-score is 95.17%. Compared with PANet, the extraction accuracy increases by 15.01% and the F1-score by 7.93%; compared with the DeeplabV3+ network, the extraction accuracy increases by 2.03% and the F1-score by 1.15%. The cultivated land edges extracted by the BECU-Net model are clear and close to the real edge shapes.
Few holes and islands are observed. Large parcels are extracted without omission, with sharp edges and corners; small parcels have clear outlines and little deformation. At various gaps and complex edges, the extraction effect on the GID dataset is significantly better than that of the five other models. The effect is notable for edge extraction, and the sawtooth and cavity artifacts of cultivated land patches are effectively suppressed as well. (1) The multi-feature input layer of the network, including exponential and texture features, can effectively reflect the characteristics of cultivated land. (2) The edge branch subnetwork focuses on shape information to better identify boundary details in cultivated land images. Its edge features complement the depth features of the EfficientNet encoder and can be cascaded with them to fully utilize shallow details. (3) The improved combined loss function BE-Loss with a regularization term alleviates the problems of unbalanced training sample categories and a loss dominated by non-edge pixels. Overall, the algorithm in this study provides a technical reference for further resolving fuzzy boundaries in cultivated land information extraction and offers theoretical support for the accurate delineation of complex boundaries.
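The problem named in conclusion (3), a loss dominated by abundant non-edge pixels, is commonly countered by up-weighting boundary pixels in a cross-entropy loss. The following is a minimal sketch of that general idea, not the paper's actual BE-Loss formulation; the function name, the weighting scheme, and the example values are all illustrative assumptions.

```python
import math

def edge_weighted_bce(pred, target, edge_mask, edge_weight=5.0, eps=1e-7):
    """Pixel-wise binary cross-entropy with extra weight on edge pixels.
    Up-weighting boundary pixels (edge_mask == 1) counters the dominance
    of the many non-edge pixels in the averaged loss. A simplified
    stand-in for the BE-Loss idea, not the published formulation."""
    total, norm = 0.0, 0.0
    for p, t, e in zip(pred, target, edge_mask):
        p = min(max(p, eps), 1.0 - eps)       # clamp to avoid log(0)
        w = edge_weight if e else 1.0
        total += -w * (t * math.log(p) + (1 - t) * math.log(1 - p))
        norm += w
    return total / norm

# Edge pixel predicted well (0.9), interior pixel predicted poorly (0.5).
loss = edge_weighted_bce([0.9, 0.5], [1, 1], [1, 0])
```

Because the edge pixel carries five times the weight, its accurate prediction pulls the weighted average loss below the unweighted mean.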
Keywords: remote sensing; edge enhancement; cultivated land extraction; semantic segmentation; U-Net; high resolution remote sensing image
Abstract: Ecosystems are the foundation of human survival and sustainable development. Ecological quality evaluation provides a scientific basis for formulating ecological protection and restoration policies, which further promote the construction of an ecological civilization. We constructed a remote sensing evaluation framework for ecological quality consisting of three dimensions: function, stability, and stress. We then used remote sensing data, such as vegetation parameters and land use, to monitor and evaluate terrestrial ecological quality from 2000 to 2018. The overall ecological quality of the country was in good condition in 2018, with approximately 36.98% of the area above average in ecological quality and only 4.33% rated poor. Fujian, Hainan, Guangxi, and other provinces showed the best ecological quality. From 2000 to 2018, 53.97% of the area showed an improving trend in ecological quality, and the improvement became more obvious after 2011. The ecological quality in some regions nationwide remains severe: these areas have low ecosystem function, poor ecosystem stability, and high ecosystem stress. Summarizing the national assessment results shows that the ecological quality evaluation framework is scientific, simple, fast, and economically feasible, and it can thus meet the requirements of rapid assessment of regional or national ecosystem quality.
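One simple way to read the "function-stability-stress" framework is as an aggregation of three normalized scores into a single index, where stress enters negatively. The abstract does not publish its aggregation rule or weights, so the following is purely an illustrative sketch under the assumptions of 0-1 normalized inputs and equal weights.

```python
def ecological_quality(function, stability, stress, weights=(1/3, 1/3, 1/3)):
    """Combine normalized function, stability, and stress scores (each in [0, 1])
    into one quality index. Stress degrades quality, so it enters as (1 - stress).
    Equal weights are an assumption; the paper does not state its weighting."""
    wf, ws, wp = weights
    return wf * function + ws * stability + wp * (1.0 - stress)

# High function and stability with low stress yields a high quality index.
index = ecological_quality(0.9, 0.8, 0.1)
```

Under such a scheme, the "severe" regions described above (low function, poor stability, high stress) would score low on all three terms simultaneously.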
Abstract: Quantitatively estimating the fractional cover of photosynthetic vegetation, non-photosynthetic vegetation (NPV), and bare soil plays an important role in establishing carbon dynamics models. Accurately obtaining the fractional cover of NPV provides important information for studying land desertification and vegetation transformation mechanisms. Although previous studies have made some progress in retrieving NPV fractional cover (fNPV) from optical remote sensing, many interfering factors and difficulties remain. We therefore combined microwave and optical remote sensing information to estimate NPV fractional cover and further improve estimation accuracy. In this study, Minqin County in Gansu Province was the research area, and Sentinel-1B IW GRD and Sentinel-2A served as data sources. Using the control variable method, the experiments applied a linear index model and a Random Forest Regression (RFR) model to estimate NPV fractional cover from microwave and optical remote sensing data. The estimated endmember fractions were then validated against field fraction measurements, with the Root Mean Square Error (RMSE) and relative RMSE (RMSE%) as accuracy indicators. Results show that (1) combining Sentinel-1 and Sentinel-2 remote sensing data to estimate the fractional cover of NPV effectively improves accuracy compared with using Sentinel-2 data alone. (2) The RFR model is an effective method for estimating the fractional cover of sparse NPV, and its accuracy is higher than that of the linear index model: the validation RMSEs of the RFR model and the linear index model are 0.0149 and 0.0153, respectively, so fNPV estimation accuracy increases by 1.4% when the RFR model is used instead of the linear index model.
(3) The VH and VV polarization bands of Sentinel-1 data can effectively detect NPV characteristics. In particular, the VH band is more sensitive to NPV, and its estimation accuracy is 5.1% higher than that of the VV band. (4) The accuracy of fNPV estimation improves when a soil index is included in each model, which shows that incorporating soil characteristic information is important for NPV extraction. Overall, combining Sentinel-1 and Sentinel-2 remote sensing data with the RFR model effectively improves the accuracy of NPV fractional cover estimation. The VV and VH polarization modes are sensitive to NPV detection, especially VH. Accuracy can be further improved by considering a soil index that reflects soil characteristics. Therefore, combining microwave and optical remote sensing data is an effective way to improve fNPV estimation accuracy, and incorporating polarization information carrying vegetation structure together with soil parameters carrying soil characteristics is important for improving the accuracy of NPV fractional cover estimation.
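The two validation indicators used throughout this abstract, RMSE and RMSE%, have standard definitions that can be sketched directly; the example values below are illustrative, not the paper's data.

```python
import math

def rmse(observed, predicted):
    """Root Mean Square Error between measured and estimated fractions."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def rmse_percent(observed, predicted):
    """Relative RMSE: RMSE expressed as a percentage of the mean observation."""
    return 100.0 * rmse(observed, predicted) / (sum(observed) / len(observed))

# Toy validation set of fractional covers (illustrative numbers only).
err = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
rel = rmse_percent([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```

On this scale, the reported difference between the RFR model (RMSE 0.0149) and the linear index model (RMSE 0.0153) is small in absolute terms but meaningful relative to the sparse NPV fractions being estimated.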
Keywords: non-photosynthetic vegetation; Sentinel-1; Sentinel-2; linear index model; random forest regression model; VV and VH polarization; Minqin County in Gansu Province