Title |
Profile-based Optimization for Deep Learning on Heterogeneous Multi-core CPUs |
Authors |
차주형(Joo Hyoung Cha) ; 권용인(Yongin Kwon) ; 이제민(Jemin Lee) |
DOI |
https://doi.org/10.5573/ieie.2023.60.7.40 |
Keywords |
Deep learning; Heterogeneous multi-core; Optimization; big.LITTLE; Embedded |
Abstract |
Demand for running deep learning in embedded environments has been growing. In resource-constrained embedded systems, heterogeneous multi-core CPU architectures such as Arm's big.LITTLE are widely used to carry out deep learning computations efficiently. Although Arm provides the Arm Compute Library (ACL) for optimized deep learning operations, ACL does not fully exploit the potential of hardware with a big.LITTLE structure. This paper proposes a profile-based search method that automatically determines the optimal execution kernel and schedule for each hardware platform. Experiments were conducted on Tinker Edge R, Odroid N+, and Snapdragon 865 HDK boards using the AlexNet, VGG16, MobileNetV2, and GoogLeNet models. In all cases, the proposed method improved performance over existing methods, by up to 266%. We expect these results to enable cost-effective, low-power, high-performance execution of deep learning on embedded devices. |
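
The abstract does not spell out the search procedure, so the following is only a minimal sketch of what a profile-based search could look like, assuming an exhaustive per-layer search over candidate kernels and core schedules. All names here (`make_measure`, `search_best_config`, `run_layer`, and the example kernel and schedule labels) are hypothetical and are not taken from the paper or from ACL.

```python
import time
from itertools import product


def make_measure(run_layer, repeats=5):
    """Wrap a hypothetical run_layer(layer, kernel, schedule) callable into a
    latency-measuring function that returns the best wall-clock time in seconds."""
    def measure(layer, kernel, schedule):
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            run_layer(layer, kernel, schedule)
            best = min(best, time.perf_counter() - start)
        return best
    return measure


def search_best_config(layers, kernels, schedules, measure):
    """Profile every (kernel, schedule) pair for each layer and keep the fastest.

    This is an assumed exhaustive search; the paper's actual search strategy
    may differ.
    """
    plan = {}
    for layer in layers:
        timings = {
            (kernel, schedule): measure(layer, kernel, schedule)
            for kernel, schedule in product(kernels, schedules)
        }
        # Pick the configuration with the lowest measured latency.
        plan[layer] = min(timings, key=timings.get)
    return plan
```

On a real board, `run_layer` would execute one layer with the chosen kernel while pinned to the chosen set of big and LITTLE cores; here it is left as a caller-supplied function so that the search logic stays independent of any particular runtime.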