Application of an improved CenterNet in remote sensing images object detection
- Vol. 27, Issue 12, Pages: 2706-2715(2023)
Published: 07 December 2023
DOI: 10.11834/jrs.20231638
Object detection methods based on deep learning are widely used in the interpretation of remote sensing images. Anchor-based methods usually require the anchor boxes to be designed first, which adds detection steps and time cost. This study proposes an object detection method for remote sensing images based on an improved CenterNet. The method simplifies the object detection process and improves efficiency.
CenterNet uses a fully convolutional network to directly predict the heat map of the center points, the widths and heights of the corresponding objects, and the position offsets of the center points. The heat maps give the rough positions of the objects, and the offsets fine-tune these positions to make them more accurate. The widths and heights determine the shape of the object boxes. Each object category has its own heat map, so the heat map in which a peak appears determines the category. On the basis of CenterNet, the proposed method first adopts ResNet with transposed convolution as the backbone network. The transposed convolution enlarges the output feature maps, and ResNet reduces the number of parameters in the backbone network compared with the Hourglass network. Second, the proposed method defines the length of the Gaussian kernel under three limit conditions between the predicted and real boxes in CenterNet. The Gaussian kernel is applied to generate the heat map label, which is used for network training. Finally, a multi-head attention mechanism is introduced into the backbone network to learn the importance of each element in the feature maps. The weights assigned to the elements reflect their effectiveness, which concentrates the effective features in the regions of the object key points as much as possible.
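The pipeline described above, from a Gaussian heat map label to a decoded box, can be illustrated with a minimal pure-Python sketch. It is not the implementation of this study: the function names `gaussian_heatmap` and `decode_peaks`, the output stride of 4, the score threshold of 0.5, and the rule sigma = (2 * radius + 1) / 6 are assumptions borrowed from common CenterNet practice, and the kernel length proposed in the paper under the three limit conditions is not reproduced here (the radius is simply passed in).

```python
import math


def gaussian_heatmap(height, width, center, radius):
    """Heat map label with a Gaussian peak at the integer cell center = (cx, cy).

    Assumption: sigma = (2 * radius + 1) / 6, a common CenterNet convention.
    The radius itself is an input; the paper's kernel-length rule is not modeled.
    """
    cx, cy = center
    sigma = (2 * radius + 1) / 6.0
    heat = [[0.0] * width for _ in range(height)]
    for y in range(max(0, cy - radius), min(height, cy + radius + 1)):
        for x in range(max(0, cx - radius), min(width, cx + radius + 1)):
            d2 = (x - cx) ** 2 + (y - cy) ** 2
            heat[y][x] = math.exp(-d2 / (2 * sigma ** 2))
    return heat


def decode_peaks(heat, size, offset, threshold=0.5, stride=4):
    """Decode one category's heat map into boxes.

    A cell counts as a peak if its score reaches `threshold` and no cell in
    its 3x3 neighbourhood scores higher. size[y][x] = (w, h) and
    offset[y][x] = (dx, dy) are given in heat-map cells (an assumption).
    Returns (x1, y1, x2, y2, score) tuples in input-image pixels.
    """
    h, w = len(heat), len(heat[0])
    boxes = []
    for y in range(h):
        for x in range(w):
            score = heat[y][x]
            if score < threshold:
                continue
            neighbourhood = [heat[j][i]
                             for j in range(max(0, y - 1), min(h, y + 2))
                             for i in range(max(0, x - 1), min(w, x + 2))]
            if score < max(neighbourhood):
                continue
            dx, dy = offset[y][x]      # sub-cell correction of the center
            bw, bh = size[y][x]        # predicted box width and height
            cx, cy = x + dx, y + dy
            boxes.append(((cx - bw / 2) * stride, (cy - bh / 2) * stride,
                          (cx + bw / 2) * stride, (cy + bh / 2) * stride,
                          score))
    return boxes


# Toy example: one object centered at cell (5, 7) on a 16 x 16 heat map.
H, W = 16, 16
heat = gaussian_heatmap(H, W, center=(5, 7), radius=3)
size = [[(4.0, 2.0)] * W for _ in range(H)]      # constant size map
offset = [[(0.25, 0.5)] * W for _ in range(H)]   # constant offset map
boxes = decode_peaks(heat, size, offset)
print(boxes)
```

In this toy case the label is used directly as the "prediction", so the only peak is the labeled center; in a trained network the three maps would come from the prediction heads, with one heat map per category.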
The experiments use mean Average Precision (mAP) to evaluate the detection results across multiple categories. All experiments are conducted on the DIOR dataset. The results show that the mAP of CenterNet using ResNet with transposed convolution is 1.4% higher than that of CenterNet using Hourglass. The proposed calculation of the Gaussian kernel length increases the mAP by 1.1%. Adding the attention mechanism further improves the mAP by 1.5%. At the same time, the proposed method reduces the time cost by 31.9% compared with the conventional method.
The experimental results show that the proposed method can improve detection accuracy without sacrificing detection speed. The ablation experiments on the different parts also show that the ResNet with transposed convolution, the designed calculation of the Gaussian kernel length, and the attention mechanism each effectively improve the mAP. The comparison with other methods also shows that the proposed method is practical.
remote sensing image; object detection; deep learning; CenterNet; attention mechanism
