Title |
Hardware Implementation of Convolution Layer for Quantized Deep Neural Networks in Image Classification |
Authors |
이종윤(Jong-Youn Lee) ; 서정윤(Jeong-Yun Seo) ; 박성준(Sung-Jun Park) ; 이하림(Harim Lee) ; 이용환(Yong-Hwan Lee) |
DOI |
https://doi.org/10.5370/KIEE.2025.74.4.644 |
Keywords |
Convolution layer; Hardware; Verilog HDL; Deep learning |
Abstract |
In this paper, we introduce the design of an artificial intelligence (AI) hardware accelerator for a convolutional neural network (CNN) for classification tasks. The AI hardware architecture for a CNN model consists of two modules: a striding-padding (SP) module and a convolution layer module. Because classification requires fully-connected layers after the CNN, we also briefly describe the architecture of the fully-connected layer. In this architecture, the SP module manages access to the memories holding feature maps and weights, accounting for the striding and padding operations of a convolution layer, while the convolution layer module performs convolution operations on the feature maps and weights read from those memories. To verify the AI hardware design through RTL simulation, CNN models are first trained in software on the MNIST and Fashion-MNIST datasets. Post-training quantization and quantization-aware training, two representative quantization techniques, are then used to extract hardware-appropriate weights from the trained CNN models. The quantized weights are loaded into the memories, and the validation data from the training datasets are fed to the designed CNN hardware. Finally, the inference results of the hardware are compared with those of the software-trained CNN models, confirming the correctness of the hardware design.
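The abstract's pipeline of weight quantization followed by integer convolution with padding and striding can be sketched in software as below. This is a minimal illustration under stated assumptions (8-bit symmetric quantization, a single per-kernel scale, zero padding), not the paper's actual RTL design or quantization scheme; all function names are hypothetical.

```python
# Sketch of symmetric post-training quantization of a conv kernel and an
# integer-MAC convolution with zero padding and striding, mimicking what a
# convolution layer module fed by a striding-padding (SP) module would do.
# Assumptions (not from the paper): int8 symmetric quantization, one scale
# per kernel, dequantization applied once to the accumulated sum.

def quantize_symmetric(weights, n_bits=8):
    """Map a 2D float kernel to signed integers with a single scale factor."""
    qmax = 2 ** (n_bits - 1) - 1  # e.g. 127 for 8-bit
    max_abs = max(abs(w) for row in weights for w in row)
    scale = max_abs / qmax if max_abs else 1.0
    q = [[round(w / scale) for w in row] for row in weights]
    return q, scale

def conv2d_int(fmap, q_kernel, scale, stride=1, pad=1):
    """Integer MACs over a zero-padded feature map; rescale the sums once."""
    k = len(q_kernel)
    h, w = len(fmap), len(fmap[0])
    # Zero padding, as the SP module would supply when addressing memory
    padded = [[0] * (w + 2 * pad) for _ in range(h + 2 * pad)]
    for i in range(h):
        for j in range(w):
            padded[i + pad][j + pad] = fmap[i][j]
    out_h = (h + 2 * pad - k) // stride + 1
    out_w = (w + 2 * pad - k) // stride + 1
    out = []
    for oi in range(out_h):
        row = []
        for oj in range(out_w):
            acc = 0  # integer accumulator, as in a hardware MAC unit
            for ki in range(k):
                for kj in range(k):
                    acc += padded[oi * stride + ki][oj * stride + kj] \
                           * q_kernel[ki][kj]
            row.append(acc * scale)  # dequantize the accumulated sum
        out.append(row)
    return out
```

With an identity kernel, the quantize-then-convolve path reproduces the input feature map up to rounding, which mirrors how the paper validates the hardware by comparing its inference results against the software model.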