Title A Study on the Performance Evaluation of Edge Device according to the Compression Method of Deep Learning Model
Authors 최진욱(Jin-Wook Choi) ; 최수혁(Soo-hyuck Choe) ; 정은성(Eun-Sung Jung)
DOI https://doi.org/10.5573/ieie.2021.58.6.50
Page pp.50-60
ISSN 2287-5026
Keywords Compression; Edge device; Knowledge Distillation; Pruning; Quantization
Abstract Backed by powerful GPU-based computing resources, many high-performance deep learning models have emerged. There is a growing need for model compression techniques to apply these models in IoT/edge device environments that lack sufficient computing resources. In this paper, we retrained ResNet50 and VGG19 models, pretrained on the ImageNet dataset, with the CIFAR-10 dataset using transfer learning in the TensorFlow 2 framework. We then applied three model compression techniques (knowledge distillation, weight pruning, and quantization) for a comparative study. In knowledge distillation, the model size and inference execution time were determined by the student model, whereas the amount of training had a large impact on accuracy. In weight pruning, the weight sparsity determined both accuracy and model size. Quantization achieved a high file-size compression rate while maintaining accuracy, but its inference time was significantly longer than that of the other compression techniques. If sufficient accuracy can be achieved by applying only post-training quantization, quantization-aware training may be unnecessary and may even lead to decreased accuracy.
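The sketch below illustrates the kind of workflow the abstract describes: fine-tuning an ImageNet-pretrained ResNet50 on CIFAR-10 in TensorFlow 2, then applying post-training quantization via the TFLite converter (no quantization-aware training step). It is a minimal illustration, not the authors' actual code; the model head, hyperparameters, and output filename are assumptions.

```python
# Minimal sketch: transfer learning + post-training quantization in TensorFlow 2.
# Assumptions (not from the paper): classifier head, epochs, optimizer, filenames.
import tensorflow as tf

# CIFAR-10 data, preprocessed the way the ResNet50 backbone expects.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()
x_train = tf.keras.applications.resnet50.preprocess_input(x_train.astype("float32"))
x_test = tf.keras.applications.resnet50.preprocess_input(x_test.astype("float32"))

# ImageNet-pretrained backbone with a new 10-class head (transfer learning).
base = tf.keras.applications.ResNet50(
    include_top=False, weights="imagenet", input_shape=(32, 32, 3), pooling="avg")
model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=5, validation_data=(x_test, y_test))

# Post-training quantization: convert the trained Keras model to a TFLite
# flatbuffer with the default optimization (dynamic-range quantization).
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

with open("resnet50_cifar10_quant.tflite", "wb") as f:
    f.write(tflite_model)
```

Weight pruning and knowledge distillation would follow a similar pattern (e.g., the TensorFlow Model Optimization toolkit provides `tfmot.sparsity.keras.prune_low_magnitude` for pruning); the paper compares these alternatives in terms of model size, accuracy, and inference time.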