Title A Study on Filter Pruning Method for Accelerating Object Detection Network Inference in Embedded Systems
Authors 전지훈(Jihun Jeon) ; 김재명(Jaemyung Kim) ; 강진구(Jin-Ku Kang) ; 김용우(Yongwoo Kim)
DOI https://doi.org/10.5573/ieie.2022.59.3.69
Page pp.69-77
ISSN 2287-5026
Keywords Deep learning; Embedded system; Object detection; Filter pruning; FLOPs
Abstract Convolutional neural networks (CNNs), which perform very well in computer vision, have recently attracted considerable attention. This progress was made possible by the growth of available data and by improvements in hardware such as GPUs, but as networks have become deeper and wider in pursuit of higher performance, the number of parameters and the amount of computation have increased exponentially. As a result, it has become harder to deploy large networks in embedded environments, where memory, computational performance, and power are limited. To address this problem, pruning techniques that remove insignificant parameters while maintaining the accuracy of the CNN model are being actively studied. However, most existing pruning studies report only the reduction in parameters and the reduction in FLOPs that follows from removing those parameters. In this paper, we propose a filter pruning method that reduces the number of parameters together with the FLOPs by a desired ratio, producing a network with faster inference. To evaluate the proposed method, we used the VisDrone dataset and YOLOv5, and measured the inference speed of the pruned lightweight network on the NVIDIA Jetson Xavier NX platform. When the parameters and FLOPs were pruned by 30%, 40%, and 50%, the mAP (0.5:0.95) decreased by 0.6%, 0.9%, and 1.2%, respectively, compared to the reference object detection network, whereas inference time improved by 16.60%, 25.79%, and 30.72%, respectively.
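To make the link between removed filters, parameters, and FLOPs concrete, the following is a minimal sketch, not the method proposed in the paper. It assumes a bias-free convolution, counts computation as multiply-accumulate operations, and ranks filters by L1 norm, which is a common baseline criterion; the abstract does not state which criterion the authors use. The names `conv_cost` and `keep_indices` are hypothetical.

```python
def conv_cost(c_in, c_out, k, h_out, w_out):
    """Return (parameters, multiply-accumulates) of a bias-free k x k convolution."""
    params = c_in * c_out * k * k
    macs = params * h_out * w_out  # every weight is applied once per output position
    return params, macs


def keep_indices(filters, prune_ratio):
    """Return the sorted indices of the filters kept after pruning.

    `filters` is a list of flattened filter weights. Filters are ranked by
    L1 norm (an assumed criterion) and the smallest ones are dropped.
    """
    n_keep = max(1, round(len(filters) * (1.0 - prune_ratio)))
    norms = [sum(abs(w) for w in f) for f in filters]
    ranked = sorted(range(len(filters)), key=lambda i: norms[i], reverse=True)
    return sorted(ranked[:n_keep])


# Toy layer with four filters; prune half of them.
toy_filters = [[0.9, -0.8], [0.01, 0.02], [0.5, 0.4], [-0.03, 0.0]]
kept = keep_indices(toy_filters, 0.5)

# Cost of an illustrative 3x3 layer before and after halving its output filters.
full_params, full_macs = conv_cost(64, 128, 3, 40, 40)
pruned_params, pruned_macs = conv_cost(64, 64, 3, 40, 40)
print(kept, pruned_params / full_params, pruned_macs / full_macs)
```

Removing output filters from one layer also shrinks the input channels of the next layer, so the network-wide reduction in parameters and FLOPs generally differs from the per-layer pruning ratio. This is one reason a method may need to target both quantities explicitly.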