Title Performance Analysis of Deep Learning Accelerator for Edge Inference
Authors Editorial Department (Editor)
DOI https://doi.org/10.5573/ieie.2024.61.1.23
Page pp.23-26
ISSN 2287-5026
Keywords Deep learning; Deep learning accelerator; Inference; Edge device; Performance analysis
Abstract Inference accelerators are currently used for deep learning inference on edge devices. Deep learning inference accelerators can improve computational performance and energy efficiency. However, optimal performance cannot be achieved if the model structure and settings (e.g., hyperparameters) are not optimized for the accelerator, which can introduce overheads such as frequent memory access. This study analyzes the inference performance of the Graphics Processing Unit (GPU) and the Deep Learning Accelerator (DLA) on the NVIDIA Jetson platform with pre-trained MobileNet v2 and ResNet50 v1 models. Our experiments show that running non-optimized models on the DLA results in up to 5.1 times longer inference time than on the GPU. Profiling reveals that the increase in inference time is caused by the overhead of GPU fallback for operations not supported by the DLA.
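The DLA-with-GPU-fallback setup described in the abstract is typically configured through TensorRT. As a rough sketch only (the paper does not state its exact commands; the model filename is hypothetical), a Jetson experiment of this kind could be run with `trtexec`, which can also emit the per-layer profile used to attribute time to fallback layers:

```shell
# Sketch of a DLA inference run on NVIDIA Jetson via TensorRT's trtexec.
# Assumptions: TensorRT is installed, and mobilenet_v2.onnx is a placeholder
# for the pre-trained model file (not specified in the paper).

trtexec \
  --onnx=mobilenet_v2.onnx \   # load the pre-trained network
  --useDLACore=0 \             # schedule supported layers on DLA core 0
  --allowGPUFallback \         # run DLA-unsupported layers on the GPU instead
  --fp16 \                     # DLA requires FP16 or INT8 precision
  --dumpProfile                # per-layer timing, exposing fallback overhead
```

Layers that the DLA does not support (and the associated memory transfers between DLA and GPU) show up in the dumped profile, which is the kind of evidence the paper uses to explain the up-to-5.1x slowdown.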