Title Efficient Workload Division for Binarized-Convolutional-Neural-Network-Based Inference
Authors 최경찬(Gyungchan Choi) ; 김태환(Tae-Hwan Kim)
DOI https://doi.org/10.5573/ieie.2020.57.6.19
Page pp.19-27
ISSN 2287-5026
Keywords Binary convolutional neural networks; workload division; inference; acceleration; VLSI
Abstract This paper reviews previous workload-division methods for inference based on binary convolutional neural networks and proposes a novel method based on output-channel-wise workload division. The proposed method, which divides the weights along the output-channel direction, reduces off-chip memory access without entailing additional computations. Experimental results for the ImageNet classification task show that the proposed method reduces the on-chip memory size and the off-chip memory access by 89% and 88%, respectively, compared to the previous method.
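The idea of dividing the weights in the output-channel direction can be illustrated with a minimal sketch. It is not the authors' implementation: the layer shape, the group size, the sign activation, and the names `binary_conv`, `conv_by_output_channel`, and `group_size` are all assumptions made for illustration. The sketch only shows that processing the filters one output-channel group at a time gives the same result as processing them all at once, with the same number of multiply-accumulate operations.

```python
import random


def binary_conv(x, w):
    """Valid 2-D convolution with +1/-1 inputs and weights (illustrative only).

    x: [C_in][H][W], w: [C_out][C_in][K][K].
    Returns (output [C_out][H-K+1][W-K+1], number of multiply-accumulates).
    """
    c_in, height, width = len(x), len(x[0]), len(x[0][0])
    macs = 0
    out = []
    for filt in w:  # one filter per output channel
        k = len(filt[0])
        plane = []
        for i in range(height - k + 1):
            row = []
            for j in range(width - k + 1):
                acc = 0
                for c in range(c_in):
                    for di in range(k):
                        for dj in range(k):
                            acc += x[c][i + di][j + dj] * filt[c][di][dj]
                            macs += 1
                # Assumed binarization: sign of the accumulated sum.
                row.append(1 if acc >= 0 else -1)
            plane.append(row)
        out.append(plane)
    return out, macs


def conv_by_output_channel(x, w, group_size):
    """Process the filters in groups of `group_size` output channels.

    Each step uses only one slice of the weights; the per-group outputs
    are concatenated along the output-channel axis.
    """
    out, macs = [], 0
    for start in range(0, len(w), group_size):
        part, part_macs = binary_conv(x, w[start:start + group_size])
        out.extend(part)
        macs += part_macs
    return out, macs


# Hypothetical layer: 3 input channels, 6x6 feature map, 8 filters of size 3x3.
random.seed(0)
x = [[[random.choice((-1, 1)) for _ in range(6)] for _ in range(6)]
     for _ in range(3)]
w = [[[[random.choice((-1, 1)) for _ in range(3)] for _ in range(3)]
      for _ in range(3)] for _ in range(8)]

full, full_macs = binary_conv(x, w)
split, split_macs = conv_by_output_channel(x, w, group_size=2)
```

Because each output channel depends on a single filter, the groups need no overlapping input regions and no partial sums to be combined afterwards, which is consistent with the claim that no additional computations are entailed. The memory savings reported in the abstract depend on the hardware architecture and are not modeled here.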