Title Profile-based Optimization for Deep Learning on Heterogeneous Multi-core CPUs
Authors 차주형(Joo Hyoung Cha) ; 권용인(Yongin Kwon) ; 이제민(Jemin Lee)
DOI https://doi.org/10.5573/ieie.2023.60.7.40
Page pp.40-49
ISSN 2287-5026
Keywords Deep learning; Heterogeneous multi-core; Optimization; big.LITTLE; Embedded
Abstract Demand for running deep learning in embedded environments has been growing. In these resource-constrained environments, heterogeneous multi-core CPU architectures such as Arm's big.LITTLE are widely used to carry out deep learning computations efficiently. Although Arm provides the Arm Compute Library (ACL) for optimized deep learning operations, ACL does not fully exploit the potential of hardware with a big.LITTLE structure. This paper proposes a profile-based search method that automatically determines the optimal execution kernel and schedule for each hardware platform. Experiments were conducted on Tinker Edge R, Odroid N+, and Snapdragon 865 HDK boards using the AlexNet, VGG16, MobileNetV2, and GoogleNet models. In all cases, the proposed method improved performance, by up to 266% compared to existing methods. We expect these results to enable cost-effective, low-power, and high-performance execution of deep learning on embedded devices.
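As a rough illustration of what a profile-based search can look like, the Python sketch below tries every combination of kernel and core cluster for each layer and keeps the fastest one. It is a minimal sketch under stated assumptions, not the method of the paper: the function name `search_best_config`, the `measure` callback, and the kernel and cluster labels are all hypothetical, and the paper's actual search space and scheduling strategy are not described in the abstract.

```python
from itertools import product


def search_best_config(layers, kernels, core_sets, measure):
    """Pick, for each layer, the (kernel, core_set) pair with the lowest cost.

    `measure(layer, kernel, core_set)` is a hypothetical callback that returns
    a profiled latency. On a real board it would run the layer with the given
    kernel while pinned to the given cores and time it.
    """
    plan = {}
    for layer in layers:
        # Profile every candidate configuration for this layer.
        timings = {
            (kernel, cores): measure(layer, kernel, cores)
            for kernel, cores in product(kernels, core_sets)
        }
        # Keep the configuration with the smallest measured latency.
        plan[layer] = min(timings, key=timings.get)
    return plan


if __name__ == "__main__":
    # Made-up latencies (milliseconds) standing in for on-device profiling.
    fake_latency = {
        ("conv1", "gemm", "big"): 3.0,
        ("conv1", "gemm", "little"): 5.0,
        ("conv1", "winograd", "big"): 2.0,
        ("conv1", "winograd", "little"): 4.0,
    }
    plan = search_best_config(
        ["conv1"],
        ["gemm", "winograd"],
        ["big", "little"],
        lambda layer, kernel, cores: fake_latency[(layer, kernel, cores)],
    )
    print(plan)
```

Passing the measurement in as a callback keeps the search logic separate from the device-specific timing code, so the same loop can be reused across boards with different core layouts.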