Title Techniques for Improving Video-based Object Detection Performance using Continuous Past Information
Authors Editorial Department (Editor)
DOI https://doi.org/10.5573/ieie.2025.62.2.58
Page pp.58-66
ISSN 2287-5026
Keywords Object detection; Multi-object tracking; Video object detection; Temporal information
Abstract Deep learning-based object detection on single images has been actively studied in recent years. However, because a single image offers less usable information than continuous video, object detection that relies solely on single images has several limitations. This paper proposes techniques to enhance video-based object detection performance by exploiting continuous video information that cannot be obtained from single images. Specifically, we introduce three methods: 1) a Re-Assess Score (RAS) module that re-assesses scores based on past information, 2) an Adaptive Template Matching (AT) module that supplements missed detections, and 3) a Label Voting (LV) module that corrects misclassifications. These techniques aim to reduce the misclassifications and missed detections that can arise with single images and also to improve overall detection performance. The proposed methods are designed as a plug-in, tracking-based module that can be applied to any state-of-the-art (SOTA) object detection model. Experiments with the CNN-based YOLOX and the Transformer-based RT-DETRv2 models showed increases of 1.4% and 3.6% in mean Average Precision (mAP) compared with the existing results obtained using only single images. In addition, generalization performance was validated on a self-constructed dataset to demonstrate the robustness of the proposed method.