WU Tianjun, LUO Jiancheng, LI Ziqi, HU Xiaodong, WANG Lingyu, FANG Zhiyang, LI Manjia, LU Xuanzhi, ZHANG Jing, ZHAO Xin, MIN Fan
Corrected Proof
DOI:10.11834/jrs.20254115
Abstract: As a new trend in the development of artificial intelligence (AI), the revolutionary impact of large models (LMs) on scientific research paradigms, production methods, and industrial models cannot be underestimated, and investing in LM research is an inevitable choice. In the field of geographic artificial intelligence (GeoAI), there is still a long way to go between the scientific design and the practical application of LMs. This article adheres to the principle of deconstructing complex land surface systems and solving for precise land parameters, and proposes land spatial object-oriented modeling supported by multi-source and multimodal observation data. On this basis, we outline the land spatial parameter system and its solution framework, which integrates five categories of land parameters covering land use, land cover change, land soil, land resources, and land type/application. Furthermore, an intelligent computing remote sensing LM is designed for large-scale parameter solving by integrating three core systems, namely a symbol system, a perception system, and a control system. A preliminary experiment is conducted using the solution of land use parameters in agricultural production spaces as an application case. The experiment shows that the proposed framework has great potential for improving the accuracy of large-scale parameter calculation in land space. The proposed model helps serve the intelligent customization of refined land information products and deepens the understanding of land space. Finally, prospects for LM research on land spatial parameter calculation are presented from the perspectives of model adaptability/robustness and interpretability/credibility of results.
Keywords: large model; geospatial artificial intelligence (GeoAI); land spatial object-oriented modeling; land parameter solving; attention mechanism; deep learning network; agricultural production space
SONG Baogui, SHAO Pan, SHAO Wen, ZHANG Xiaodong, DONG Ting
Corrected Proof
DOI:10.11834/jrs.20253549
Abstract: Objective Remote sensing (RS) image building extraction is one of the research hotspots in the RS field and is of great significance for urban planning, illegal building detection, natural disaster assessment, and so on. With the rapid development of deep learning technology, it has been introduced into RS image building extraction and has achieved significant results. Building extraction faces two main challenges: 1) buildings in RS images vary in scale and shape, and 2) building boundaries are difficult to extract accurately. Method To address these two challenges, this paper proposes a two-branch building extraction network integrating main body and edge separation and multi-scale information extraction. First, a main body and edge separation branch (MBESB) is designed for feature decomposition based on the decoupling idea and the optical flow estimation technique. MBESB generates the main body and edge features of buildings separately, thereby enhancing the representation of building boundaries. Then, to fully extract the features of buildings at different scales, a lightweight multi-scale information extraction branch (LMIEB) is constructed based on dilated convolution, depthwise separable convolution, and an attention mechanism. Finally, to improve the training of the building extraction network, a body-edge-feature-enhanced loss function is presented with the help of the generated main body and edge features. Result Experiments were carried out on two public building extraction datasets, namely the Inria and WHU datasets, to evaluate the performance of the proposed MMT-Net method. Five deep learning methods were used for comparison. Quantitative analysis of the building extraction results was conducted with four evaluation metrics, namely precision, recall, F1, and IoU. For the Inria and WHU datasets, the F1/IoU values of the proposed MMT-Net are 0.8894/0.8008 and 0.9567/0.9170, respectively, which are superior to those of the five comparative methods. Conclusion Experimental results on two commonly used public building extraction datasets show that the proposed building extraction network is feasible and effective. In addition, the ablation results indicate that the MBESB, the LMIEB, and the loss function with auxiliary enhancement of the main body and edge features proposed in this work all enhance building extraction performance effectively.
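For reference, a minimal sketch (not the authors' code) of the four mask-level evaluation metrics mentioned above, computed from a predicted and a ground-truth binary building mask:

```python
# Precision, recall, F1 and IoU for binary building masks (illustrative only).
import numpy as np

def segmentation_metrics(pred: np.ndarray, truth: np.ndarray, eps: float = 1e-12):
    pred, truth = pred.astype(bool), truth.astype(bool)
    tp = np.logical_and(pred, truth).sum()
    fp = np.logical_and(pred, ~truth).sum()
    fn = np.logical_and(~pred, truth).sum()
    precision = tp / (tp + fp + eps)
    recall = tp / (tp + fn + eps)
    f1 = 2 * precision * recall / (precision + recall + eps)
    iou = tp / (tp + fp + fn + eps)
    return precision, recall, f1, iou

# Example with random masks
rng = np.random.default_rng(0)
pred = rng.integers(0, 2, (256, 256))
truth = rng.integers(0, 2, (256, 256))
print(segmentation_metrics(pred, truth))
```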
Keywords: remote sensing image; building extraction; deep learning; U-Net; main body and edge separation; two-branch; multi-scale; lightweight
ZHAO Zisheng, HAO Xiaohua, REN Hongrui, LUO Siqiong, DAI Liyun, SHAO Donghang, FENG Tianwen, ZHAO Qin, JI Wenzheng, LIU Yan
Corrected Proof
DOI:10.11834/jrs.20253540
Abstract: High-spatiotemporal-resolution snow depth data are crucial for hydrological modeling and disaster forecasting. Currently, high-temporal-resolution snow depth data are typically derived from passive microwave measurements, but their coarse spatial resolution cannot meet the needs of regional hydrology and disaster research. Based on passive microwave brightness temperature data combined with high-resolution optical remote sensing data, this paper develops a high-precision downscaled snow depth retrieval algorithm to provide snow depth data with high spatial and temporal resolution for regional-scale hydrology and climate research. The proposed downscaling algorithm is based on multi-source remote sensing data, including passive microwave and optical observations, and couples a deep learning model (FT-Transformer) with the snow microwave radiative transfer (SMRT) model. Deep learning is used to map the complex nonlinear relationship between snow depth and features such as the AMSR2 brightness temperature deviation (TBD), snow cover days (SCD), and snow cover fraction (SCF). At the same time, to account for the physical properties of snow, the coupled SMRT model is fitted to obtain the effective snow grain size (ESG), which characterizes the spatiotemporally dynamic snow properties and is input into the deep learning model to achieve downscaled snow depth retrieval. The algorithm was used to obtain downscaled snow depth data at 500 m spatial resolution in northern Xinjiang. Model training and validation were conducted using observations from 39 stations in northern Xinjiang. The validation results show that snow cover days (SCD) can effectively represent the snow accumulation process. Independent validation showed an 18% improvement in RMSE, indicating enhanced spatial generalization capability of the model. Including the ESG feature significantly improved the overall accuracy of the deep-learning-based downscaled snow depth retrieval, yielding an RMSE of 6.82 cm, a 15% improvement over the model without the ESG feature. The ESG feature also greatly mitigated the underestimation of deep snow (>40 cm), leading to a 35% improvement in RMSE under such conditions. Furthermore, a time-series analysis of the retrieval using the ESG feature demonstrated that it follows the observed snow depth variations, thereby constraining and stabilizing the output of the FT-Transformer model. Finally, compared with existing snow depth products such as AMSR2, ERA5-Land, and SDDsd, the downscaled snow depth data from this study exhibited superior validation accuracy, with an RMSE of 6.51 cm, and a more refined spatial distribution, in particular capturing the complex snow depth heterogeneity in the mountainous regions of northern Xinjiang. This study explored the feasibility of combining the SMRT model with deep learning for downscaled snow depth retrieval and obtained a high-accuracy downscaled snow depth product for northern Xinjiang, supporting the demand for high-spatiotemporal-resolution snow depth data at the regional scale.
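As a purely illustrative sketch of the tabular feature-to-snow-depth mapping described above (the paper trains an FT-Transformer coupled with an SMRT-derived ESG feature; here a scikit-learn gradient-boosting regressor stands in, and all feature values are synthetic):

```python
# Stand-in for the FT-Transformer regression step; feature names follow the abstract,
# numbers are synthetic and only illustrate the workflow.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(42)
n = 2000
X = np.column_stack([
    rng.normal(15, 5, n),     # tbd: AMSR2 brightness temperature deviation (K)
    rng.integers(0, 180, n),  # scd: snow cover days
    rng.uniform(0, 1, n),     # scf: snow cover fraction
    rng.uniform(0.1, 1.5, n), # esg: effective snow grain size fitted with SMRT (mm)
])
y = 0.8 * X[:, 0] + 0.1 * X[:, 1] + 10 * X[:, 2] + 5 * X[:, 3] + rng.normal(0, 3, n)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
model = GradientBoostingRegressor().fit(X_tr, y_tr)
rmse = mean_squared_error(y_te, model.predict(X_te)) ** 0.5
print(f"station-validation RMSE: {rmse:.2f} cm")
```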
WU Xiaodan, WEN Jianguang, XIAO Qing, LIN Xingwen, YOU Dongqin, YIN Gaofei, LIU Qinhuo
Corrected Proof
DOI:10.11834/jrs.20244296
Abstract: (Objective) Ground observation is the foundation of remote sensing scientific research, providing important data support for the construction of quantitative remote sensing models, accurate and efficient inversion of remote sensing information, and validation of remote sensing products. In particular, with the advent of the artificial intelligence era, ground observations have been combined with satellite data to drive deep learning models, generating remarkable research results in the field of remote sensing. However, when satellite data are combined with ground observations, uncertainty is unavoidably introduced into the subsequent results and analysis. This results from the representativeness errors of ground observations, which arise partly from the scale differences between ground observations and satellite pixels and partly from the complex spatial heterogeneity of the land surface itself. A ground observation only represents the true value of the measured object at the observation time and in the space it represents, and cannot be directly used as the true value at the scale of a satellite pixel. (Method) How to improve the spatiotemporal representativeness of ground observations at the satellite pixel scale and obtain the closest representation of reality has always been a key issue in the field of remote sensing experiments. The acquisition of pixel-scale ground truth involves the selection of sample areas, evaluation of spatial heterogeneity, optimization of the ground sample layout, ground observation, and scale conversion. Although a large amount of research has been carried out on each aspect, there are still cases of conceptual ambiguity and insufficient understanding in each link, resulting in significant uncertainty in obtaining pixel-scale ground truth. How to constrain and control the uncertainty of the pixel-scale ground truth during acquisition and how to obtain the pixel-scale ground "truth" with minimum uncertainty are currently bottleneck problems that urgently need to be solved. This article discusses the current challenges and possible solutions in obtaining pixel-scale ground "truth", aiming to provide new insights and theoretical guidance for remote sensing field observation experiments. (Result) Large spatial heterogeneity does not necessarily mean poor spatial representativeness of ground observations, because the representativeness error is related not only to spatial heterogeneity but also to factors such as the number, location, and observation scale of ground stations. Spatial heterogeneity is the dominant factor affecting the representativeness error of ground observations when sampling is not optimized, but it is almost unrelated to the spatial representativeness error when sampling is optimized. It is noteworthy that spatial heterogeneity shows a strong dependence on spatial scale. At smaller scales, spatial heterogeneity caused by random factors cannot be ignored; as the sub-pixel scale increases, spatial heterogeneity is mainly influenced by structural factors. The influence of geolocation mismatch needs to be fully considered, and its effect can be eliminated by developing methods to identify the exact spatial extent of the validation pixel. (Conclusion) High-quality ground observation data and effective scale conversion methods are essential prerequisites for obtaining high-quality ground "truth" at the pixel scale. However, there is still a lack of high-precision scale conversion methods, especially for complex terrains such as mountainous regions.
In terms of ground observations, it is not only necessary to establish a high-quality observation network but also to ensure effective collaboration among different networks, instruments, observation techniques, and data managers to construct a ground observation dataset with a unified quality standard. In terms of scale conversion, there is a need to develop more universal and accurate scale conversion models, aiming to fully utilize ground observation data globally to construct high-quality remote sensing pixel "truth" datasets.
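As a toy illustration of the representativeness issue discussed above (purely synthetic numbers, not from the paper), the gap between a single point observation and the area mean of the fine-scale field inside one satellite pixel can be computed directly:

```python
# Representativeness error of a point observation vs. the pixel-scale "truth",
# with the sub-pixel coefficient of variation as a simple heterogeneity indicator.
import numpy as np

rng = np.random.default_rng(1)
pixel = rng.gamma(shape=2.0, scale=0.15, size=(50, 50))  # fine-scale field inside one coarse pixel

pixel_truth = pixel.mean()                               # area-weighted pixel-scale "truth"
station = pixel[25, 25]                                  # a single point observation
representativeness_error = station - pixel_truth

cv = pixel.std() / pixel.mean()                          # sub-pixel spatial heterogeneity
print(f"error = {representativeness_error:.3f}, sub-pixel CV = {cv:.3f}")
```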
Abstract: Real hyperspectral images (HSI) are vulnerable to high-intensity mixed noise, and accurately modeling this noise is very important for subsequent processing tasks. Asymmetric Laplacian noise modeling has achieved good results in removing mixed noise and has been widely studied and applied in the field of HSI denoising. This approach takes into account the heavy tail and asymmetry of noise and models different noises in different bands. However, these methods ignore the inherent distribution characteristics of the HSI gradient basis space and cannot retain the edge information and details of the image well, resulting in poor restoration. Considering that both the noise and the gradient basis space exhibit heavy tails and asymmetry, an asymmetric Laplacian (AL) model of the noise and the gradient basis space is established, and a basis-space asymmetric Laplacian total variation (BSALTV) hyperspectral image denoising model is proposed. The gradient basis space fully retains the prior information of the original HSI gradient map, better reflects the sparse prior distribution characteristics of the gradient, and shows a distinct asymmetric distribution in different bands. In addition, by exploring the asymmetric distributions of the gradient basis and the noise, the global low-rank information and the noise distribution characteristics of different bands are accurately mined to avoid over-smoothing, and the correlation between the spatial and spectral dimensions is utilized to improve information retention during denoising. The alternating direction method of multipliers was used to solve the model, and experiments were carried out on simulated datasets (Pavia and DC) and a real dataset (Urban) to verify the effectiveness of the proposed method in hyperspectral image denoising. To assess its performance, five existing HSI denoising methods were selected for quantitative and visual comparison. In the quantitative comparison, the PSNR and SSIM values obtained by the proposed method on the simulated datasets are optimal in most cases, which fully demonstrates the robustness of the proposed method in the HSI denoising task. In the visual comparison, the recovered images and spectral curves show that the proposed method not only retains clearer structures and sharper edges but also achieves more coherent spectral reconstruction and better preservation of local details. In summary, a BSALTV model for HSI mixed noise removal is proposed. By mining the deep structural information of the HSI gradient basis space and the different noise patterns in different bands, the sparse prior distribution characteristics of the gradients are better reflected, over-smoothing is avoided, image edges and details are preserved, and the local smoothness of the HSI is improved while sparsity is ensured. The proposed method is superior to the compared methods on both synthetic and real data.
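For reference, one textbook parameterization of the asymmetric Laplace density on which this kind of band-wise noise/gradient modeling builds (not necessarily the exact form adopted in the paper) is:

```latex
% Asymmetric Laplace density: m = location, \lambda > 0 = scale, \kappa > 0 = asymmetry.
% In a band-wise model, \lambda and \kappa would be estimated separately for each band.
f(x \mid m,\lambda,\kappa) \;=\; \frac{\lambda}{\kappa+\kappa^{-1}}
\begin{cases}
\exp\!\bigl(-\lambda\kappa\,(x-m)\bigr), & x \ge m,\\[4pt]
\exp\!\bigl(\lambda\,(x-m)/\kappa\bigr), & x < m.
\end{cases}
```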
Abstract: According to the second law of geography, spatial heterogeneity or non-stationarity in spatial data and relationships has drawn increasing attention in spatial statistics. To explore this fundamental phenomenon, place- or location-specific methods and local statistical techniques that assume data relationships to be spatially variant have been extensively developed. In line with the principle of spatial dependence depicted by the first law of geography, the GW regression technique was first proposed to incorporate spatial weights into location-wise regression model calibrations, highlighting spatial heterogeneity in data relationships by outputting spatially varying coefficient estimates. With this distance-decaying scheme for calculating spatial weights, a series of geographically weighted (GW) models have emerged for fine-scaled spatial analysis in descriptive, explanatory, interpretive, and predictive scenarios, including GW descriptive statistics, basic GW regression and its extensions, GW discriminant analysis, GW principal component analysis, GW machine learning, and GW artificial neural networks. These GW models form a continually evolving technical framework for identifying spatially non-stationary features or patterns in a variety of disciplines and fields, including geography, social science, biology, public health, and environmental science. In this study, we systematically sort out the theoretical and technical framework of GW models. First, we summarize the essence of and rules for applying the family of GW models, i.e., catering for spatially heterogeneous or non-stationary features and relationships in geographic variables and outputting location-dependent metrics or estimates via a spatial weight matrix calculated according to the distance-decaying principle of spatial dependence expressed by Tobler's First Law of Geography. As common and fundamental parts of the GW models, we introduce the hypothesis tests of spatial heterogeneity or non-stationarity, the general definition of distance metrics in geography, the calculation of spatial weights, and bandwidth optimization. Regarding descriptive, explanatory, interpretive, and predictive scenarios, the potential usages of each individual GW model are also discussed at four analysis levels. We recommend univariate GW descriptive statistics, e.g., the GW average, GW quantile, GW standard deviation, and GW skewness, to help users grasp the spatially heterogeneous distribution of a geographic variable. For exploratory analysis of multivariate spatial data, the GW correlation coefficient and GW principal component analysis are preferable. GW regression and its rich extensions, specifically multiscale GW regression, provide powerful tools for interpretive analysis and have been widely applied. With data relationships studied comprehensively, accurate prediction usually appears as an ultimate target in data analytics. The usages of GW regression and geographically and temporally weighted regression are straightforward for prediction, and the prediction accuracy is further improved when artificial intelligence technologies are incorporated, e.g.,
GW machine learning, GW artificial neural networks, and geographically neural network weighted regression. The increasing popularity of GW models has resulted in the development of several software packages, standalone programs, and toolkits, including the R package GWmodel and GWmodelS, a new, free, user-friendly, and high-performance standalone software that incorporates spatial data management and mapping tools as well as the GW model functions. However, there is still a long way to go before GW models become an all-around quantitative analytical framework for spatial heterogeneity, owing to drawbacks in their theoretical foundation, technical completeness and complementarity, and their extension to spatio-temporal dimensions.
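A minimal sketch of the basic GW regression calibration described above (assumptions: a Gaussian distance-decay kernel and a fixed bandwidth, which in practice would be optimized by cross-validation; this is not the GWmodel implementation):

```python
# Geographically weighted regression: one weighted least-squares fit per location,
# with weights decaying with distance, so coefficients vary over space.
import numpy as np

def gwr_coefficients(coords, X, y, bandwidth):
    """Return one coefficient vector (intercept first) per calibration location."""
    X1 = np.column_stack([np.ones(len(y)), X])
    betas = []
    for u in coords:
        d = np.linalg.norm(coords - u, axis=1)
        w = np.exp(-0.5 * (d / bandwidth) ** 2)          # Gaussian distance-decay weights
        W = np.diag(w)
        beta = np.linalg.solve(X1.T @ W @ X1, X1.T @ W @ y)
        betas.append(beta)
    return np.array(betas)

rng = np.random.default_rng(0)
coords = rng.uniform(0, 10, (200, 2))
x = rng.normal(size=200)
y = (1 + 0.3 * coords[:, 0]) * x + rng.normal(scale=0.1, size=200)  # spatially varying slope
print(gwr_coefficients(coords, x[:, None], y, bandwidth=2.0)[:3])
```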
Abstract: The leaching mining of ion-adsorbed rare earth ore primarily employs in-situ leaching, pile leaching, and pool leaching methods, which result in significant soil pollution. This pollution presents serious environmental challenges, particularly affecting the growth and survival rates of reclaimed vegetation in rare earth mining areas. The restoration of reclaimed vegetation is crucial for mitigating environmental damage and restoring ecological balance. However, the application of intelligent technology to monitor and manage the health and growth of reclaimed vegetation in these mining areas encounters substantial challenges due to the complexities of the natural environment. Unmanned aerial vehicle (UAV) remote sensing imagery has emerged as a promising tool for monitoring and evaluating ecological restoration efforts in rare earth mining areas. UAVs can rapidly capture high-resolution images over large areas, facilitating efficient monitoring of reclaimed vegetation growth in these regions. However, the uneven spatial distribution, varying shapes, and diverse overall characteristics of reclaimed vegetation present significant challenges for high-precision automatic recognition from UAV images, and relying solely on traditional image processing techniques for vegetation detection and classification proves inadequate. To address these challenges and enhance the automatic recognition and localization of individual reclaimed vegetation in UAV images, this paper proposes YOLOv8-AS, a reclaimed vegetation detection method for rare earth mining areas that integrates global features into YOLOv8n. This method represents an innovative improvement over YOLOv8n: first, the downsampling module ADown is introduced to optimize the feature convolution operation, thereby reducing feature loss during deep model training; second, the SPPF-GFP (Spatial Pyramid Pooling Fast - Global Feature Pool) module is employed for feature extraction, significantly enhancing the detection of reclaimed vegetation with substantial variations in overall features. The results showed that, on the self-constructed rare earth mining reclamation vegetation dataset, YOLOv8-AS outperforms YOLOv8n by 1.6% and 2.4% in terms of mAP@0.5 and mAP@0.5-0.95, respectively. Compared with YOLOv8n, the model size, number of parameters, and floating-point computation of YOLOv8-AS decreased by 11%, 10%, and 9%, respectively. The mAP@0.5 and mAP@0.5-0.95 of the YOLOv8-AS algorithm are 91.1% and 46.8%, respectively. Compared with the SSD, Faster R-CNN, RT-DETR, YOLOv5, YOLOv7, and YOLOv7-TINY models in terms of mAP@0.5, YOLOv8-AS shows improvements of 14.07%, 23.32%, 1.2%, 2.3%, 3.3%, 2.9%, and 1.2%, respectively. According to the comparative experiments between YOLOv8-AS and YOLOv8 across three scenarios (dominated by small targets, simple, and complex), the mAP@0.5-0.95 of YOLOv8-AS increased by 2.3%, 1.2%, and 3%, respectively, relative to the baseline YOLOv8. Furthermore, we applied YOLOv8-AS to the reclamation vegetation detection task in a larger scene within a rare earth mining area. The visualization results indicate that, whether the scene features numerous small targets, is simple, or is complex, the method significantly enhances the ability to identify and accurately locate individual plants in the reclaimed vegetation.
This finding further substantiates its efficacy in accurately detecting reclaimed vegetation across various conditions. Such advancements are crucial for effectively monitoring the progress of ecological restoration in mining areas and provide essential support for achieving sustainable mining development.
LIU Wangjun, CHEN Yiping, WANG Chaolei, ZHANG Wuming, WANG Cheng
Corrected Proof
DOI:10.11834/jrs.20244054
Abstract: “Objective” Tropical mangroves are among the most productive and biodiverse forest resources but face several challenges, such as monoculture planting structures, low ecosystem quality, low survival rates of planted mangroves, and threats from extreme weather and pests. Surveys of mangrove forest resources provide essential data for the scientific management and conservation of these resources, and accurate segmentation of individual trees is a prerequisite for such forest resource inventories. Terrestrial laser scanning (TLS) can provide massive, high-precision, and high-resolution 3D point cloud data. However, the point cloud data are characterized by irregularity, density variation with distance, and incompleteness due to occlusion. Furthermore, the mangrove scene is complex, with large and small trees interlaced and occluding one another, making precise individual tree segmentation a considerable challenge. Traditional methods such as local maximum detection based on Canopy Height Models (CHM) have demonstrated good performance in simple plots, but in the complex, interwoven canopy environments of mangroves, where upper-canopy features are weak, these methods are less effective. There is currently a lack of research on individual tree segmentation algorithms for mangroves based on TLS point clouds. To address these issues, we propose an individual tree segmentation algorithm applicable to complex mangrove scenes. “Method” This study combines deep learning and traditional algorithms to propose a high-precision individual tree segmentation framework for TLS point clouds in complex mangrove scenes. The framework first employs the deep learning network RandLA-Net for ground filtering and wood-leaf separation. Mangrove main stems are then segmented using a connected-component segmentation method. Finally, individual tree segmentation is achieved through a multiple-treetop constraint module. “Results” To assess the accuracy of the algorithm, we use three measures: completeness, correctness, and accuracy, and conduct a comparative analysis with two classical algorithms. The experimental results demonstrate that the completeness of the proposed method across different mangrove plots is greater than 0.85, with an average of 0.90; its correctness exceeds that of the two classical algorithms in four plots; and its mean accuracy across the sample plots reaches 0.87, significantly higher than that of the two classical algorithms, proving the effectiveness and reliability of our method. “Conclusion” This paper proposes an individual tree segmentation framework for TLS point clouds in complex mangrove scenes. Seven sample plots with various data characteristics were annotated to assess accuracy. The experimental results show that, compared with other algorithms, the proposed method achieved the highest accuracy. Despite the differing characteristics of the sample plots, the overall accuracy of the proposed method exceeded 0.8, demonstrating its effectiveness and robustness.
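As a toy illustration of the stem-grouping step (the paper uses connected-component segmentation after RandLA-Net wood-leaf separation; DBSCAN on a synthetic stem slice is used here only as a simple stand-in):

```python
# Group stem points into individual trees by density connectivity (illustrative stand-in).
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(0)
# Synthetic horizontal slice of stem points from three trees
centers = np.array([[0.0, 0.0], [3.0, 1.0], [1.5, 4.0]])
points = np.vstack([c + rng.normal(scale=0.15, size=(200, 2)) for c in centers])

labels = DBSCAN(eps=0.5, min_samples=10).fit_predict(points)
print("detected stems:", len(set(labels) - {-1}))
```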
Keywords: tropical mangroves; terrestrial laser scanning; point cloud; individual tree segmentation; deep learning
Zhang Bei, Hu XiuQing, Zhou WeiWei, Sha Jin, Chen Lin
Corrected Proof
DOI:10.11834/jrs.20243528
Abstract: Objective The Advanced Geostationary Radiation Imager (AGRI) on the FY-4A satellite has been in orbit for six years, and the radiometric performance of some reflective channels has degraded significantly, affecting the accuracy of quantitative remote sensing product applications. On-orbit vicarious calibration methods based on deep convective cloud (DCC) targets can track and correct the attenuation of the radiometric response of spaceborne optical sensors. Because this method relies on large-sample statistics, it is important to conduct sensitivity studies on the factors influencing its calibration accuracy and stability and to establish an optimal scheme. Method The procedure of the fundamental DCC calibration and tracking method is as follows: first, extract DCC target pixels from FY-4A/AGRI Level 1 data, calculate the reflectance of the target pixels, and apply anisotropic correction using a DCC angular distribution model (ADM); then, construct daily or monthly probability density functions (PDF) of DCC reflectance and track the trend of the peak reflectance (the mode) or the mean reflectance to monitor and evaluate the radiometric performance of the FY-4A/AGRI instrument. To improve calibration accuracy and stability, a sensitivity research scheme was designed for the infrared brightness temperature threshold, the pixel uniformity conditions, and the DCC angular distribution model (ADM). Lastly, the DCC model is corrected and an optimal scheme is established according to the results of the sensitivity analysis. Result The results indicate that, for the infrared brightness temperature threshold, the sensitivity of the DCC mean reflectance is lower than that of the PDF peak reflectance in the visible channels, while in the short-wave infrared channel the sensitivity of the DCC PDF peak reflectance is slightly lower than that of the mean reflectance. In the visible-near-infrared bands, the CERES ADM corrects the effect of DCC reflectance anisotropy better and is significantly superior to the Hu model; however, neither ADM has an obvious correction effect in the short-wave infrared band. Based on these sensitivity studies, the threshold selection and ADM correction strategy in the DCC method were determined, and the radiometric response of the FY-4A/AGRI reflective bands from March 2017 to April 2023 was tracked and evaluated. The results show that the radiometric response of the 0.47 μm, 0.65 μm, and 2.25 μm channels degrades significantly, with total attenuation of 45.55%, 26.22%, and 6.362%, respectively. This result provides a reference for updating the AGRI operational calibration coefficients. Conclusion This paper conducted a sensitivity analysis of the key factors in the DCC-based radiometric calibration tracking method for satellite optical sensors and enhanced calibration accuracy and stability by establishing an optimal scheme. Using the optimized method, the variations in the radiometric response of the FY-4A/AGRI reflective bands were quantitatively evaluated, providing a valuable reference for updating the operational calibration coefficients of this instrument.
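A toy sketch of the statistical core of this approach (all thresholds and values below are illustrative, not the paper's settings): select DCC pixels with an infrared brightness-temperature threshold, histogram the ADM-corrected reflectance, and track the PDF mode and mean:

```python
# Track the mode and mean of the DCC reflectance PDF from synthetic inputs.
import numpy as np

def dcc_mode_and_mean(reflectance, tb11, tb_threshold=205.0):
    """Return (PDF mode, mean) of the anisotropy-corrected DCC reflectance."""
    dcc = reflectance[tb11 < tb_threshold]          # keep only very cold (DCC) pixels
    hist, edges = np.histogram(dcc, bins=np.arange(0.6, 1.2, 0.005))
    k = np.argmax(hist)
    mode = 0.5 * (edges[k] + edges[k + 1])
    return mode, dcc.mean()

rng = np.random.default_rng(0)
tb11 = rng.normal(230, 20, 100_000)                 # 11-um brightness temperatures (K)
refl = rng.normal(0.92, 0.05, 100_000)              # ADM-corrected TOA reflectance
print(dcc_mode_and_mean(refl, tb11))
```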
Keywords: remote sensing and sensors; radiometric calibration; deep convective cloud; advanced geostationary radiation imager; angular distribution model; top of atmosphere reflectance; reflective solar bands
CAI Shangshu, KONG Dan, SI Lin, ZHANG Keshu, LIU Qingwang, ZHANG Qingjun, LI Zhen, QI Zhiyong, SUN Hua, PANG Yong
Corrected Proof
DOI:10.11834/jrs.20244172
Abstract: Sharing multi-platform light detection and ranging (LiDAR) point clouds of forests is of great significance for LiDAR remote sensing research and applications in forestry. To this end, the Institute of Forest Resource Information Techniques, Chinese Academy of Forestry, has constructed a multi-platform LiDAR point cloud dataset for forest plots in subtropical regions, featuring airborne laser scanning, unmanned aerial vehicle laser scanning, and terrestrial laser scanning (TLS) point clouds, along with forest inventory data. The dataset was collected at Gaofeng Forest Farm in Guangxi, China, covering 25 plots of three tree species: Eucalyptus, Chinese fir, and Pinus massoniana. The field inventory data include plot locations, tree positions, diameter at breast height, tree height, height to the first live branch, and crown width. The dataset enables analysis of the three-dimensional forest structural information captured by LiDAR from different platforms and evaluation of automated processing algorithms such as point cloud registration and individual tree segmentation, providing important references for forest research at the regional, plot, and tree levels. Additionally, this study developed a ground survey method guided by TLS data, which uses tree stem point clouds to mark tree positions and measures individual trees according to tree maps, improving operational efficiency.
Abstract: “Objective” Photovoltaic power stations, which generate electricity from solar radiation, are an important form of clean energy. However, current research generally lacks in-depth investigation into the long-term land use evolution of photovoltaic power stations and has not systematically predicted photovoltaic construction under different policy scenarios. Therefore, it is of significant scientific importance to thoroughly explore the changes in area, type conversion, and spatial distribution characteristics of photovoltaic power stations in Ordos from 2000 to 2023 and, on this basis, to predict future land use types under different policy scenarios. “Methodology” To achieve these objectives, this study applied visual interpretation to satellite imagery from Landsat 5 and GF-2 covering the years 2000 to 2023. From these images, land use type maps were generated, which enabled us to track changes in land use over time. Specifically, we examined the spatiotemporal characteristics of photovoltaic power stations, conducting an analysis every five years to determine shifts in spatial patterns. This was done using Gaussian projection ellipses, a method that allowed us to capture the spatial distribution trends of these stations. In addition to this spatial analysis, we employed the PLUS (Patch-generating Land Use Simulation) model, which integrates both natural and socio-economic driving factors, to predict future land use patterns under different policy conditions. Key driving factors included population growth, surface temperature, soil heat flux, precipitation, and changes in policy, which are critical elements in understanding the evolution of photovoltaic stations over time and their future development. “Results” The findings of this study are multifaceted and provide valuable insights into the evolution of photovoltaic power stations in Ordos. (1) The overall spatial pattern of land use remained relatively consistent between 2000 and 2011, as well as between 2011 and 2023. However, a noticeable shift occurred starting in 2011, when certain land types, such as desert sand and grassland, began to be converted into photovoltaic power station sites. (2) From 2011 to 2023, there was a clear shift in the spatial distribution of these stations, with the main area of photovoltaic development moving from the northwest to the northeast of Ordos. Additionally, the types of land being used for these constructions evolved, with an increasing trend of converting grassland areas for photovoltaic station use. (3) The analysis using the PLUS model revealed that several key factors were driving these land use changes, including population growth, surface temperature, soil heat flux, precipitation, and, most notably, policy decisions. Policy, in particular, emerged as one of the strongest determinants of the development and expansion of photovoltaic stations in the region. (4) Projections of land use changes in 2030 under three different policy scenarios show that, regardless of the specific scenario, areas allocated to buildings, forests, water bodies, arable land, grassland, and photovoltaic power stations will likely continue to expand. These findings provide important insights into the future changes of photovoltaic power stations in Ordos.
“Conclusion” This study sheds light on the spatial and temporal dynamics of photovoltaic power station development in Ordos, highlighting the complex interplay between population growth, environmental factors, and policy decisions in shaping land use changes. The results not only demonstrate how land use has evolved over the past two decades but also provide predictive insights into how it may continue to change by 2030 under different policy scenarios.
Keywords: photovoltaic power stations; spatial pattern; spatio-temporal variability; driving factors; scenario simulation
Chen Jiyi, Li Guoyuan, Peng Jun, Liu Zhao, Zhou Xiaoqing
Corrected Proof
DOI:10.11834/jrs.20244186
Abstract: The Terrestrial Ecosystem Carbon Inventory Satellite (TECIS) is China's first remote sensing satellite with a spaceborne LiDAR as its main payload. It aims at quantitative monitoring of terrestrial ecosystem carbon storage, forest resources, and forest productivity, serving the goals of "carbon peaking and carbon neutrality" and the monitoring and evaluation of major projects for the protection and restoration of important ecosystems in China. In this paper, the ability of different relative height (RH) metrics to characterize forest canopy height was evaluated in detail for the full waveform data of the multi-beam LiDAR onboard TECIS. Canopy heights were calculated using RH0, RH5, RH10, RH15, RH20, and the last peak location as the ground benchmark. Percentile canopy heights determined by RH100, RH95, RH90, RH85, and RH80 were obtained and compared with the canopy height model (CHM) generated from airborne LiDAR data to explore the correlation between the RH percentile canopy heights and the CHM. The canopy height detection ability of fixed-gain and variable-gain full waveforms was compared, and the influence of slope on canopy height extraction was analyzed. Six tracks of TECIS multi-beam LiDAR L2B data products passing over the test area of temperate coniferous-broadleaved mixed forest in Quebec, Canada, were selected for analysis. The results show that canopy height is significantly overestimated when RH0 is used as the benchmark, and although the accuracy is improved by raising the background noise threshold with reference to the noise standard deviation, the effect is limited. For the fixed-gain waveforms, using RH5 as the benchmark and setting the background noise threshold at 6 times the background noise standard deviation, the different percentile forest canopy heights achieve RMSEs between 3.58 m and 4.23 m, MEs of less than 1.0 m, and MAEs between 2.52 m and 3.21 m. The accuracy of canopy heights retrieved from variable-gain and fixed-gain full waveform data is comparable, but the multiple of the background noise standard deviation used to set the noise threshold is smaller for the variable gain than for the fixed gain. Starting from RH5 and using an appropriate background noise threshold, the canopy heights retrieved by RH100, RH98, RH95, RH90, and RH85 from both variable-gain and fixed-gain full waveform data were close to the 100%, 95%, 90%, 85%, and 80% canopy heights from the CHM products within the footprint range, respectively. Moreover, using RH5 as the canopy height benchmark is significantly better than using the last peak position derived from waveform decomposition and is less affected by terrain slope. The configuration of variable and fixed gains is beneficial for enhancing data effectiveness in forest areas. These conclusions will be helpful for the application of TECIS laser altimetry data in forest canopy height retrieval.
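A hedged sketch of how RH metrics can be derived from a full waveform (a GEDI-style convention with a synthetic waveform, not the TECIS processing chain): RHx is the height at which x% of the cumulative noise-subtracted waveform energy, accumulated from the lowest elevation bin upward, is reached, and benchmarking at RH5 rather than RH0 reduces sensitivity to residual noise:

```python
# Relative-height (RH) metrics from a synthetic full waveform.
import numpy as np

def rh_heights(heights, waveform, noise_floor, pcts=(0, 5, 85, 90, 95, 98, 100)):
    order = np.argsort(heights)                       # sort bins from low to high elevation
    h = heights[order]
    e = np.clip(waveform[order] - noise_floor, 0, None)
    cum = np.cumsum(e) / e.sum()                      # cumulative energy fraction, bottom-up
    out = {}
    for p in pcts:
        idx = min(np.searchsorted(cum, p / 100.0), len(h) - 1)
        out[p] = h[idx]
    return out

# Synthetic waveform: ground return near 0 m plus a canopy return near 18 m
z = np.linspace(-5, 30, 701)
wf = np.exp(-0.5 * (z / 1.0) ** 2) + 0.6 * np.exp(-0.5 * ((z - 18) / 2.0) ** 2) + 0.02
rh = rh_heights(z, wf, noise_floor=0.02)
canopy_height = rh[95] - rh[5]                        # RH95 relative to the RH5 benchmark
print(round(canopy_height, 2))
```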
DONG Wenqian, WANG Hao, QU Jiahui, HOU Shaoxiong, LI Yunsong
Corrected Proof
DOI:10.11834/jrs.20244200
Abstract: Objective Hyperspectral image classification, which aims to assign a category to each pixel in a hyperspectral image, is an important application in the field of remote sensing. In recent years, contrastive learning has been widely used in hyperspectral image classification tasks due to its good ability to mine key features of the data. However, most current self-supervised contrastive learning paradigms train the network in a two-stage scheme, and it is difficult to avoid defining objects of the same class as negative samples in the pre-training stage, which often widens the intra-class gap. In addition, contrastive learning algorithms generally use data augmentation methods such as cropping and rotation to generate positive samples, so the diversity of the generated positive samples is limited. This paper proposes a hyperspectral image classification network based on multi-scale supervised contrastive learning to solve the above problems. The method extracts multiscale spatial features and spectral features level by level and adaptively fuses the features to generate the final results. Method This paper proposes a Multiscale Supervised Contrastive Learning Network (MSCLN) for hyperspectral image classification, which includes two parts: a multiscale contrastive feature learning network and a spatial-spectral hybrid probability-directed fusion classification network. In the multiscale contrastive feature learning network, a spectral-guided branch and a spatial feature extraction branch with an attention mechanism are designed to extract spectral-spatial features level by level. Then, the two multi-scale spatial features of the same object are constructed as positive samples by introducing label information. Specifically, 2n views can be generated for n objects, in which all views of objects of the same class are positive samples of each other and the rest are negative samples. Finally, in the spatial-spectral hybrid probability-directed fusion classification network, learnable parameters are set to integrate the spectral-spatial features and obtain the final classification probability. By co-training the two networks, more accurate classification results can be obtained. Result On three public hyperspectral datasets, Houston 2013, WHU-Hi-LongKou, and Pavia University, 50, 80, and 50 labeled samples were randomly selected from each category as training sets, respectively. The overall classification accuracy of the proposed algorithm reached 96.20%, 99.20%, and 98.96%, respectively, and its classification performance was better than that of the comparison methods. Conclusion The method extracts discriminative spectral-spatial features hierarchically with MSCLN and introduces label information to construct the two multiscale spatial features of the same object as positive samples, which makes samples of the same class more compact while pushing apart samples of different classes. Finally, the spectral-spatial features are adaptively fused to obtain an accurate classification map.
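A minimal PyTorch sketch (not the MSCLN code) of the supervised contrastive idea described above, where views that share a class label are treated as positives rather than negatives:

```python
# Supervised contrastive loss: same-label views are pulled together in embedding space.
import torch
import torch.nn.functional as F

def supervised_contrastive_loss(features, labels, temperature=0.1):
    """features: (N, D) embeddings; labels: (N,) class ids."""
    z = F.normalize(features, dim=1)
    sim = z @ z.T / temperature                                   # cosine similarities
    eye = torch.eye(len(z), device=z.device)
    mask_pos = (labels[:, None] == labels[None, :]).float() - eye # positives, excluding self
    logits = sim - 1e9 * eye                                      # drop self from denominator
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    pos_per_anchor = mask_pos.sum(1).clamp(min=1)
    return -((mask_pos * log_prob).sum(1) / pos_per_anchor).mean()

z = torch.randn(8, 64)                         # e.g. 2n views of n = 4 labeled objects
y = torch.tensor([0, 0, 1, 1, 2, 2, 3, 3])
print(supervised_contrastive_loss(z, y))
```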
ZHANG Shurong, FU Bolin, GAO Ertao, JIA Mingming, SUN Weiwei, WU Yan, ZHOU Guoqing
Corrected Proof
DOI:10.11834/jrs.20243515
Abstract: (Objective) Mangroves are among the most biodiverse and productive marine ecosystems, and the fine classification of mangrove communities by combining high-resolution remote sensing images and deep learning has become a hot and difficult topic in current research. (Methods) In this paper, we propose a novel deep learning classification network, SSAFormer (Swin-Segmentation-Atrous-Transformer), for the fine classification of mangrove communities. SSAFormer uses Swin Transformer, a variant of the Vision Transformer, as the backbone network. The Atrous Spatial Pyramid Pooling (ASPP) module from the Convolutional Neural Network (CNN) architecture is added to the backbone to extract feature information at more scales, and a Feature Pyramid Network (FPN) structure is embedded in the lightweight decoder to fuse the rich semantic feature information of the low and high layers. Three active and passive feature datasets were constructed based on GF-7 multispectral imagery and UAV-LiDAR point clouds, and the classification results of the improved Swin Transformer and SegFormer algorithms were compared and analyzed to further demonstrate the classification performance of the SSAFormer algorithm for mangrove communities. (Result) The results revealed that: (1) compared with the improved Swin Transformer and SegFormer algorithms, SSAFormer achieved fine classification of mangroves, with the overall accuracy (OA) increased by 1.77% to 5.3%, Kappa up to 0.8952, and the mean intersection over union (MIoU) improved by 7.68%; (2) on the GF-7 multispectral dataset, the SSAFormer algorithm achieved the highest overall accuracy (OA) of 91%, and on the UAV-LiDAR dataset with spectral features included, its mean intersection over union (MIoU) improved to 57.68%, a mean improvement of 1.48%; (3) the UAV-LiDAR data showed a maximum improvement of 5.35% in MIoU compared with the GF-7 multispectral data, a mean improvement of 1.81% in overall accuracy (OA), and an improvement of 2.6% in classification accuracy (F1-score) for the UAV-LiDAR data with spectral features included; (4) based on the SSAFormer algorithm, the highest classification accuracy (F1-score) of 97.07% was achieved for Avicennia marina, the F1-score of Aegiceras corniculatum reached 91.99%, the F1-score of Sporobolus alterniflorus reached 93.64%, and the average F1-score of Aegiceras corniculatum reached its highest value of 86.91% on the SSAFormer model. (Conclusion) The above results prove that the proposed model can effectively improve the classification accuracy of mangrove communities.
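A compact sketch of the ASPP idea that the paper attaches to the Swin Transformer backbone (dilation rates, channel sizes, and the input shape below are illustrative assumptions, not the paper's configuration):

```python
# Atrous Spatial Pyramid Pooling: parallel dilated 3x3 convolutions capture multiple scales.
import torch
import torch.nn as nn

class ASPP(nn.Module):
    def __init__(self, in_ch, out_ch, rates=(1, 6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r, bias=False),
                nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True))
            for r in rates
        ])
        self.project = nn.Conv2d(out_ch * len(rates), out_ch, 1)

    def forward(self, x):
        return self.project(torch.cat([b(x) for b in self.branches], dim=1))

feat = torch.randn(1, 96, 64, 64)     # e.g. one backbone stage output (shape assumed)
print(ASPP(96, 64)(feat).shape)       # -> torch.Size([1, 64, 64, 64])
```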
Keywords: mangrove; GF-7 multispectral; UAV-LiDAR point clouds; SSAFormer; deep learning; active and passive image combination; feature selection; fine classification of communities
SHANG Jian, DOU Fangli, LIU Lixia, YUAN Mei, YIN Honggang, SUN Ling, HU Xiuqing
Corrected Proof
DOI:10.11834/jrs.20242677
Abstract: The Wind Radar (WindRAD) onboard the Fengyun-3E (FY-3E) meteorological satellite is the first active remote sensing instrument in China's Fengyun satellite series and the first spaceborne dual-frequency, dual-polarization scatterometer in the world. The spaceborne scatterometer is an important remote sensing instrument for measuring meteorological and ocean parameters, obtaining geophysical parameters such as wind speed and wind direction over the global ocean surface through backscattering measurements of the Earth system. WindRAD uses an advanced fan-beam conical scanning system, aiming mainly at all-weather, all-day measurement of the sea surface wind vector with high precision and high resolution. In addition, WindRAD can also measure soil moisture, sea ice, and other geophysical parameters. This paper gives a preliminary evaluation of the in-orbit performance of WindRAD. The observation principle, signal characteristics, and main performance indicators of WindRAD are introduced, and the data preprocessing method is described in detail, that is, the Level 1 processing that generates backscattering coefficients of the global land and sea surface. Based on WindRAD's in-orbit test after the launch in 2021, the performance of the instrument is preliminarily analyzed. Key telemetry parameters, including rotation speed, internal calibration values, and the temperatures of important components, are analyzed. Azimuth resolution, range resolution, observation swath width, radiometric resolution, and internal calibration accuracy are evaluated using actual WindRAD remote sensing data as well as parameters measured before launch. The analysis results show that WindRAD works stably in orbit, all performance indicators meet expectations, and the instrument can provide high-quality backscattering coefficient data in both the C and Ku bands for product retrieval. This work paves the way for WindRAD remote sensing applications, assimilation applications, and weather forecasting. WindRAD observation data are received and processed in the FY-3E satellite ground system. The operational data are public to users worldwide and can be obtained from the FENGYUN Satellite Data Center of the National Satellite Meteorological Center, China Meteorological Administration (http://satellite.nsmc.org.cn/PortalSite/Data/DataView.aspx).
Abstract: Multispectral remote sensing images have rich spectral information that reflects ground features, but their spatial resolution is low and their texture information is relatively insufficient. By contrast, panchromatic remote sensing images have high spatial resolution and rich texture information but lack rich spectral information. In practice, the two kinds of images can be fused into a single one to combine their complementary advantages, so that the fused image better meets the needs of downstream tasks. To this end, this article proposes an unsupervised method for fusing panchromatic and multispectral images using a dual-branch generative adversarial network combined with a Transformer. Specifically, the source panchromatic and multispectral images are first decomposed into base and detail components using guided filtering, where the base component mainly captures the main body of the source image and the detail component mainly represents its texture and detail information. Next, the decomposed base components of the panchromatic and multispectral images are concatenated, as are the decomposed detail components, and the concatenated base and detail components are fed into the base and detail branches of the dual-branch generator, respectively. According to the different characteristics of the base and detail components, a Transformer is used to extract global spectral information in the base branch and a CNN is used to extract local texture information in the detail branch. The model is then trained adversarially between the generator and the dual discriminators (a base-layer discriminator and a detail-layer discriminator), finally yielding a fused image with rich spectral information and high spatial resolution. Extensive experiments on a public dataset demonstrate that the proposed method outperforms state-of-the-art methods both in qualitative visual effects and in quantitative evaluation metrics. In summary, this article proposes an unsupervised fusion method for panchromatic and multispectral remote sensing images using a dual-branch generative adversarial network combined with a Transformer. The superiority of the proposed method was verified via qualitative and quantitative comparisons with multiple representative methods, and the ablation studies further confirm the effectiveness of the designed network structure.
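A hedged sketch of the base/detail decomposition step described above: a self-guided filter (the classical box-mean formulation) smooths each band into a base layer, and the detail layer is the residual. The radius and eps values are illustrative; the paper's exact settings are not given in the abstract.

```python
# Guided-filter decomposition of a panchromatic band into base + detail components.
import numpy as np
from scipy.ndimage import uniform_filter

def guided_filter(I, p, radius=8, eps=1e-3):
    size = 2 * radius + 1
    mean_I, mean_p = uniform_filter(I, size), uniform_filter(p, size)
    corr_Ip, corr_II = uniform_filter(I * p, size), uniform_filter(I * I, size)
    a = (corr_Ip - mean_I * mean_p) / (corr_II - mean_I ** 2 + eps)
    b = mean_p - a * mean_I
    return uniform_filter(a, size) * I + uniform_filter(b, size)

rng = np.random.default_rng(0)
pan = rng.random((128, 128)).astype(np.float64)       # panchromatic band scaled to [0, 1]
base = guided_filter(pan, pan)                        # self-guided: guide == input
detail = pan - base                                   # texture/detail component
print(base.shape, float(detail.std()))
```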
WANG Yifan, HUANG Xian, WANG Jianlin, ZHOU Tong, ZHOU Wenjun, PENG Bo
Corrected Proof
DOI:10.11834/jrs.20243358
Abstract: (Objective) Shadows in remote sensing imagery play a crucial role in image interpretation and feature extraction but are also known to introduce significant challenges in image analysis. Traditional methods often struggle with complex shadow scenarios, leading to missed or false detections. This paper introduces a novel approach that enhances shadow detection accuracy and reliability in high-resolution remote sensing images. (Method) The proposed dual-branch network synergistically combines the strengths of Transformers and Convolutional Neural Networks (CNNs) to tackle the challenges of shadow detection. The network leverages a Transformer branch to capture global contextual relationships and a CNN branch to emphasize local textural details. This architecture is designed to exploit the complementary nature of global and local information, providing a comprehensive feature representation. The method also introduces a shadow prediction module that integrates these features for effective shadow segmentation. A joint loss function, comprising a primary loss and auxiliary losses, is utilized to refine learning and accelerate convergence, thereby enhancing detection accuracy. (Result) The proposed method was rigorously tested on the Aerial Imagery Shadow Dataset (AISD), demonstrating substantial improvements in shadow detection metrics. It achieved a shadow detection accuracy of 97.112% and significantly reduced the false detection rate, with a balance error rate (BER) decrease of 0.389. These results not only validate the effectiveness of the dual-branch architecture but also showcase the advantages of integrating global and local features through the proposed network design. (Conclusion) The dual-branch network provides a robust solution to the perennial challenges of shadow detection in remote sensing imagery. By effectively minimizing missed and false detections, the network holds significant promise for enhancing the interpretability and utility of high-resolution satellite images in various applications, such as urban planning and environmental monitoring. Future work will focus on optimizing the network architecture and exploring its applicability to other complex imaging conditions.
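A small sketch of the balance error rate (BER) used to score shadow detection: it averages the error rates of the shadow and non-shadow classes, so it is not dominated by the usually much larger non-shadow area (synthetic masks only):

```python
# Balance error rate (BER) for a binary shadow mask.
import numpy as np

def balance_error_rate(pred, truth):
    pred, truth = pred.astype(bool), truth.astype(bool)
    tp = np.sum(pred & truth); tn = np.sum(~pred & ~truth)
    fp = np.sum(pred & ~truth); fn = np.sum(~pred & truth)
    return 0.5 * (fn / (tp + fn) + fp / (tn + fp)) * 100   # in percent

rng = np.random.default_rng(0)
truth = rng.random((512, 512)) < 0.2                        # ~20% shadow pixels
pred = truth ^ (rng.random((512, 512)) < 0.05)              # flip 5% of labels
print(round(balance_error_rate(pred, truth), 3))
```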
JIA Wen, PANG Yong, LI Zengyuan, KONG Dan, LIANG Xiaojun
Corrected Proof
DOI:10.11834/jrs.20244240
Abstract: Objective This research examines the key factors influencing the accuracy of tree species classification using airborne hyperspectral data combined with Light Detection and Ranging (LiDAR) in forest environments. Accurate identification of individual tree species is essential for effective forest resource monitoring, management, ecosystem assessment, and biodiversity conservation. While many small-scale studies have explored tree species classification in forests with diverse species compositions and complex age structures, achieving this over larger areas remains a significant challenge. This study focuses on evaluating the effects of spectral consistency correction, canopy height information, and individual tree canopy segmentation on classification accuracy. Saihanba Mechanical Forest Farm, a large-scale artificial plantation, was selected as the study site. Method To assess the impact of different factors on tree species classification accuracy, the research used a Random Forest classification algorithm and developed four classification strategies. The first strategy used vegetation indices derived from multi-flightline images without applying Bidirectional Reflectance Distribution Function (BRDF) correction. The second strategy incorporated BRDF correction into the multi-flightline images before deriving vegetation indices. The third approach integrated canopy height information, specifically the Canopy Height Model (CHM), with the BRDF-corrected vegetation indices. The fourth and final strategy combined BRDF-corrected vegetation indices, the CHM, and individual tree canopy segmentation data. The classification accuracy of each strategy was systematically compared to quantify the contribution of each factor toward improving tree species classification precision. Result The results indicate that individual tree canopy segmentation significantly reduced misclassification errors arising from the mixing of multiple species within a single canopy, leading to a notable 10.74% improvement in classification accuracy. According to the Random Forest model's feature importance ranking, individual tree segmentation emerged as the most critical factor, followed by BRDF correction and then the canopy height model. Although BRDF correction reduced the spectral reflectance variability caused by differing sun-observation geometries across flight strips, it led to only a modest improvement in classification accuracy of 3.48%. The introduction of the Canopy Height Model (CHM) yielded minimal gains of just 0.67%, particularly in areas with uniform vertical forest structure or species spanning multiple age cohorts. Conclusion This study demonstrates that integrating airborne hyperspectral data with LiDAR holds substantial promise for enhancing tree species classification in large-scale artificial plantations. Among the factors analyzed, individual tree segmentation proved the most impactful in improving accuracy, whereas the relatively minor influence of BRDF correction and canopy height features underscores the need for further refinement and optimization. Overall, the findings emphasize the importance of considering multiple factors in remote sensing workflows to enhance the efficiency and accuracy of forest resource monitoring, management, and other forestry-related applications, especially in expansive forest environments.
These insights provide a valuable theoretical foundation and practical recommendations for future forest management and ecological monitoring efforts.
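A hedged sketch of the classification core described above: a Random Forest on per-tree features (vegetation indices, CHM height, segmentation-derived statistics), with feature importances used to rank each factor's contribution. Feature names are placeholders and the data are synthetic, illustrating the workflow only:

```python
# Random Forest tree species classification with feature importance ranking.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 1200
X = np.column_stack([
    rng.normal(0.6, 0.1, n),   # NDVI-like vegetation index
    rng.normal(0.3, 0.1, n),   # red-edge index
    rng.normal(14, 4, n),      # CHM height (m)
    rng.normal(25, 8, n),      # crown area from individual-tree segmentation (m^2)
])
species = (X[:, 0] + 0.05 * X[:, 2] + rng.normal(0, 0.1, n) > 1.3).astype(int)

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, species)
for name, imp in zip(["NDVI", "red-edge", "CHM", "crown_area"], rf.feature_importances_):
    print(f"{name}: {imp:.3f}")
```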
Keywords: tree species classification; airborne hyperspectral data; BRDF correction; LiDAR data; individual tree segmentation; Random Forest; vegetation indices; Saihanba mechanized forest farm
Abstract: Global warming has become an increasingly prominent issue in recent years, and extreme meteorological events are now common in urban environments, exemplified by intense summer heat. Against the background of global warming and rapid urbanization, the urban thermal environment has become a research focus. At present, the Local Climate Zone (LCZ) scheme is the principal classification method employed in urban thermal environment research. Compared with the traditional urban-rural dichotomy, this approach further subdivides the city on the basis of the physical characteristics of buildings and natural ground cover features. Based on the LCZ system, this paper investigated the summer thermal environment characteristics of the main urban area of Nanjing from two perspectives, interclass and intraclass, using land surface temperature (LST) retrieved from Landsat imagery. The local climate zone classification divided the study area into 12 categories, of which 8 are building types and 4 are land cover types. The proportion of building types within the study area was greater than that of land cover types. The building types exhibited a high proportion of open high-rise (LCZ 4) and compact mid-rise (LCZ 2), which were predominantly concentrated in the central urban areas, and the largest land cover type was bare soil and sand (LCZ F). The results show that, first, the thermal environments of the LCZ classes differed considerably: higher building densities had higher mean LSTs, and mean LSTs tended to rise gradually as building height decreased. The time-series trend of the mean temperature of the various LCZ types was highly consistent with the overall mean temperature trend of the study area. In addition, large low-rise (LCZ 8) consistently presented high average surface temperatures during the summer months, reaching a maximum of 53.2 °C. Second, the average surface temperature of each building type was higher than that of the study area as a whole, and the average surface temperature of each natural ground cover type, except bare soil or sand, was lower than that of the study area as a whole. The mean surface temperatures of compact mid-rise (LCZ 2), compact low-rise (LCZ 3), large low-rise (LCZ 8), and heavy industry (LCZ 10) were higher than the overall mean temperature of the study area. Furthermore, this study presented an intraclass analysis of the different LCZ types using relative rates of change in LST; increased sensitivity to temperature fluctuations may have adverse effects on human well-being and economic productivity. Another important finding is that the intra-LCZ thermal environment analyses indicate a heightened sensitivity to temperature fluctuations in the following categories: compact mid-rise (LCZ 2), compact low-rise (LCZ 3), heavy industry (LCZ 10), and bare soil and sand (LCZ F). The findings of this study can serve as a valuable reference and provide insights for further research in urban planning, mitigation of the urban heat island effect, and improvement of the urban thermal environment.
SUN Xidong, FU Bolin, LI Huajian, JIA Mingming, SUN Weiwei, WU Yan, SONG Yiji
Corrected Proof
DOI:10.11834/jrs.20243431
Abstract: Accurate time-series monitoring of vegetation and water conditions by hyperspectral remote sensing is the key and foundation for accurate assessment and comprehensive monitoring of karst wetland ecosystems. However, the spatial resolution of existing satellite hyperspectral images is low and can hardly capture the complex spatial details of wetland vegetation, while ultra-high-resolution UAV images can hardly support time-series monitoring of large-scale wetland scenes, and existing fusion methods cannot achieve lossless fusion of the spatial and spectral features of hyperspectral images from these two platforms. To solve these problems, this paper proposes a cross-sensor multiscale image feature mapping module (CMIFM). This module unifies the spatial scale of UAV hyperspectral images (AHSI) and satellite hyperspectral images (SHSI), maps AHSI and SHSI into the same spectral feature space according to measured ASD (Analytical Spectral Devices) data, and integrates the fused spatial and spectral features of AHSI and SHSI to construct image feature datasets. High-quality reconstruction of SHSI is then achieved by training super-resolution networks (ESRGAN and SwinIR) on these feature datasets. Meanwhile, this study used the latest deep learning (DATFuse) and traditional (GS) fusion methods to compare the spatial and spectral quality of vegetation and water between the reconstructed and fused images in wetland scenes. This study highlights that: (1) the CMIFM-based super-resolution networks achieve cross-platform enhancement of the spatial detail of wetland vegetation and water in SHSI by learning AHSI features, outperforming the GS image fusion method in both visual perception and quantitative indexes, with average PSNR and SSIM of the reconstructed images of 11.06 and 0.3102, respectively; (2) based on the measured ASD data, the spectral features of three typical wetland vegetation communities (Cynodon dactylon, Cladium chinense Nees, and Miscanthus) and of wetland water in the reconstructed images exhibit higher stability and fidelity, and the average RMSE and R2 accuracies of the spectral bands are higher than those of the DATFuse and GS fusion images; (3) the CMIFM+ESRGAN and CMIFM+SwinIR methods show strong generalization in spatial and spectral reconstruction and can reconstruct images of wetland scenes not covered by AHSI, with average PSNR and R2 of 12.74 and 0.1897, respectively, close to the accuracy range of the AHSI-covered area; (4) this paper verifies the feasibility of CMIFM-based super-resolution technology for hyperspectral image reconstruction of complex wetlands.
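For reference, a small sketch of the two image-quality metrics reported for the reconstructed SHSI bands, computed here with scikit-image on a synthetic band pair (illustration only, not the paper's evaluation code):

```python
# PSNR and SSIM between a reference band and its reconstruction.
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

rng = np.random.default_rng(0)
reference = rng.random((256, 256))                        # e.g. one AHSI-derived band
reconstructed = np.clip(reference + rng.normal(0, 0.05, reference.shape), 0, 1)

psnr = peak_signal_noise_ratio(reference, reconstructed, data_range=1.0)
ssim = structural_similarity(reference, reconstructed, data_range=1.0)
print(f"PSNR = {psnr:.2f} dB, SSIM = {ssim:.4f}")
```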