Multi-object tracking by detecting small objects in satellite video
Vol. 28, Issue 7, Pages: 1812-1821 (2024)
Received: 07 March 2022
Published: 07 July 2024
DOI: 10.11834/jrs.20232098
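Multi-object tracking in remote sensing satellite video faces challenges such as small, weak targets and diverse scenes. To address them, this paper proposes a multi-object tracking algorithm for high-resolution remote sensing satellite video. In the detection stage, a small-object detector is built: a Transformer in the backbone network first captures global context information, an attention mechanism then enhances object features, and finally a branch for predicting small objects is added. In the trajectory association stage, the detected small objects are matched to existing trajectories using an association algorithm that pays attention to low-confidence detections. High-resolution remote sensing satellite videos were selected for testing, and the results show that the proposed method reaches a MOTA of 63.1% on a multi-object tracking dataset of remote sensing satellite videos, an improvement of 13.5% over the baseline model, and can significantly improve multi-object tracking performance in remote sensing satellite video.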
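For readers unfamiliar with the MOTA figure reported above, the following is a minimal sketch of how the metric is usually computed under the standard CLEAR MOT definition: one minus the ratio of accumulated errors (missed objects, false alarms, identity switches) to the total number of ground-truth objects. The `FrameCounts` class, its field names, and the sample counts are hypothetical and chosen only for illustration; they are not taken from the paper's evaluation.

```python
from dataclasses import dataclass


@dataclass
class FrameCounts:
    """Per-frame error counts (hypothetical container, for illustration only)."""
    false_negatives: int   # ground-truth objects with no matched detection
    false_positives: int   # detections with no matched ground-truth object
    id_switches: int       # matched objects whose track identity changed
    ground_truth: int      # number of ground-truth objects in the frame


def mota(frames):
    """MOTA = 1 - (FN + FP + IDSW) / GT, with all counts summed over frames."""
    total_gt = sum(f.ground_truth for f in frames)
    if total_gt == 0:
        raise ValueError("MOTA is undefined without ground-truth objects")
    errors = sum(
        f.false_negatives + f.false_positives + f.id_switches for f in frames
    )
    return 1.0 - errors / total_gt


# Made-up counts: 8 errors over 20 ground-truth objects gives a MOTA of 0.6.
frames = [FrameCounts(3, 1, 0, 10), FrameCounts(2, 1, 1, 10)]
score = mota(frames)
print(f"MOTA = {score:.1%}")
```

Because the errors are summed before dividing, frames with many objects weigh more than sparse frames, and the score can become negative when errors outnumber ground-truth objects.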
Multi-object tracking determines the positions of objects and estimates their trajectories in remote sensing satellite videos. The task has attracted considerable interest, and its application to security monitoring, motion analysis, and intelligent transportation has been explored. Compared with surveillance videos, remote sensing satellite videos contain smaller objects and a larger background, so foreground objects are difficult to detect. In addition, remote sensing satellite videos are extremely large, requiring massive computation and storage, and multi-object tracking in such videos has high real-time requirements. To address these problems, this paper proposes a multi-object tracking method for remote sensing satellite videos that adopts the tracking-by-detection paradigm. First, in the detection stage, a transformer was added to the backbone to capture global context information, enabling the detector to distinguish between objects and background. Then, an attention mechanism was used to enhance object features, enabling the method to focus on object regions. Finally, an extra prediction branch was added to the network to generate a high-resolution feature map, which retained the details of small objects and benefited small-object detection. Owing to the small objects and occlusion in remote sensing satellite videos, the confidence of hard positive samples was quite low. In the data association stage, an association strategy was therefore adopted that considered high- and low-confidence detections simultaneously and associated the detected small objects with existing trajectories. To verify the effectiveness of the proposed method, ablation and comparison experiments were carried out on a remote sensing satellite video dataset. The proposed method achieved 63.1% MOTA and 78.0% IDF1 and showed the best performance in these experiments, which reflects its suitability for multi-object tracking in remote sensing satellite videos. It also ranked second in the multi-object tracking challenge of the 2021 Gaofen Challenge. The method targets the difficulty of small-object tracking in remote sensing satellite videos and combines several techniques that help small-object tracking. Experimental results showed that it can improve the performance of multi-object tracking in remote sensing satellite videos.
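The association stage described in the abstract uses both high- and low-confidence detections. One common way to do this is a two-pass scheme: existing trajectories are first matched to high-confidence detections, and the trajectories left over are then matched to low-confidence detections, so that dim or partly occluded objects can still extend a track. The sketch below illustrates that idea only. Whether it matches the paper's exact procedure is an assumption, and the thresholds, the IoU similarity, the function names, and the greedy matcher (a simple stand-in for an optimal assignment solver) are all illustrative choices.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_match(track_boxes, det_boxes, min_iou):
    """Pair tracks with detections in descending IoU order.

    Returns {track_id: detection_index}. Greedy matching is an assumed
    simplification, not an optimal assignment.
    """
    pairs = sorted(
        ((iou(tb, db), tk, dk)
         for tk, tb in track_boxes.items()
         for dk, db in det_boxes.items()),
        key=lambda p: p[0],
        reverse=True,
    )
    matches, used_dets = {}, set()
    for score, tk, dk in pairs:
        if score < min_iou:
            break  # every remaining pair overlaps even less
        if tk in matches or dk in used_dets:
            continue
        matches[tk] = dk
        used_dets.add(dk)
    return matches


def associate(tracks, detections, high_thresh=0.6, low_thresh=0.1, min_iou=0.3):
    """tracks: {track_id: box}; detections: list of (box, score) tuples.

    All threshold values are hypothetical defaults.
    """
    high = {i: box for i, (box, s) in enumerate(detections) if s >= high_thresh}
    low = {i: box for i, (box, s) in enumerate(detections)
           if low_thresh <= s < high_thresh}

    # Pass 1: match all tracks against high-confidence detections.
    first = greedy_match(tracks, high, min_iou)
    # Pass 2: tracks still unmatched may be continued by low-confidence detections.
    remaining = {t: b for t, b in tracks.items() if t not in first}
    second = greedy_match(remaining, low, min_iou)

    matches = {**first, **second}
    # Only unmatched high-confidence detections are allowed to start new tracks.
    new_track_dets = [i for i in high if i not in first.values()]
    lost_tracks = [t for t in tracks if t not in matches]
    return matches, new_track_dets, lost_tracks


tracks = {1: (0, 0, 10, 10), 2: (50, 50, 60, 60)}
detections = [
    ((1, 1, 11, 11), 0.9),          # confident, overlaps track 1
    ((51, 50, 61, 60), 0.2),        # low confidence, overlaps track 2
    ((100, 100, 110, 110), 0.8),    # confident, overlaps nothing
    ((200, 200, 210, 210), 0.05),   # below low_thresh, discarded
]
matches, new_track_dets, lost_tracks = associate(tracks, detections)
print(matches, new_track_dets, lost_tracks)
```

In this toy example track 2 survives only because of the second pass: its sole overlapping detection has a score of 0.2 and would be discarded by a tracker that keeps high-confidence detections alone.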