Title |
Efficient Workload Division for Binarized-Convolutional-Neural-Network-Based Inference |
Authors |
최경찬 (Gyungchan Choi); 김태환 (Tae-Hwan Kim)
DOI |
https://doi.org/10.5573/ieie.2020.57.6.19 |
Keywords |
Binary convolutional neural networks; workload division; inference; acceleration; VLSI |
Abstract |
This paper reviews previous workload-division methods for inference based on binarized convolutional neural networks and proposes a novel output-channel-wise workload-division method. The proposed method, which divides the weights along the output-channel direction, reduces off-chip memory access without entailing any additional computation. Experimental results for the ImageNet classification task show that, compared with the previous method, the proposed method reduces the on-chip memory size and off-chip memory access by 89% and 88%, respectively, while incurring no additional computation.
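As a rough illustration of the idea described in the abstract, the sketch below partitions the weights of a binarized convolutional layer along the output-channel axis, so that each partition can be processed with only its own weight slice resident on-chip while the same input feature map is reused. The layer shapes, the `num_partitions` parameter, and the `binarized_conv2d` helper are illustrative assumptions, not the paper's implementation.

```python
# Illustrative sketch (not the paper's implementation): output-channel-wise
# workload division for a binarized convolutional layer, in plain NumPy.
import numpy as np

def binarized_conv2d(x, w):
    """Naive binarized convolution.
    x: {+1,-1} input of shape (C_in, H, W)
    w: {+1,-1} weights of shape (C_out, C_in, K, K)
    Returns an output of shape (C_out, H-K+1, W-K+1)."""
    c_out, _, k, _ = w.shape
    _, h, wd = x.shape
    out = np.zeros((c_out, h - k + 1, wd - k + 1), dtype=np.int32)
    for oc in range(c_out):
        for i in range(h - k + 1):
            for j in range(wd - k + 1):
                # Hardware would use XNOR-popcount; plain multiply-add here.
                out[oc, i, j] = np.sum(x[:, i:i + k, j:j + k] * w[oc])
    return out

def run_output_channel_partitions(x, w, num_partitions):
    """Divide the workload along the output-channel axis: each partition needs
    only its own weight slice on-chip, and the total amount of computation is
    unchanged because every output channel is still computed exactly once."""
    partitions = np.array_split(np.arange(w.shape[0]), num_partitions)
    outputs = []
    for idx in partitions:
        w_slice = w[idx]  # only this slice of the weights is kept on-chip
        outputs.append(binarized_conv2d(x, w_slice))
    return np.concatenate(outputs, axis=0)  # identical to the undivided result

# Example with assumed shapes: 8 output channels, 3 input channels, 3x3 kernel.
rng = np.random.default_rng(0)
x = rng.choice([-1, 1], size=(3, 8, 8)).astype(np.int8)
w = rng.choice([-1, 1], size=(8, 3, 3, 3)).astype(np.int8)
assert np.array_equal(run_output_channel_partitions(x, w, 4),
                      binarized_conv2d(x, w))
```

The assertion at the end checks the property the abstract relies on: dividing the workload by output channel changes only how the weights are staged, not the computed result or the number of operations.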