IEIE - Journal of the Institute of Electronics and Information Engineers

Title	A Low Area GEMM Accelerator Architecture for Edge Devices
Authors	전영황(Young-Hwang Jeon) ; 김희탁(Hee-Tak Kim) ; 김병수(Byung-Soo Kim) ; 황태호(Tae-Ho Hwang)
DOI	https://doi.org/10.5573/ieie.2024.61.7.43
Page	pp.43-50
ISSN	2287-5026
Keywords	Deep neural network; Edge AI; Edge device; GEMM; Hardware accelerator
Abstract	The main drawback of adopting a systolic array for a GEMM accelerator is that the number of compute units grows quadratically with the degree of parallelism: a systolic array requires N² compute units to process N input data in parallel. In this article, we therefore propose an adder-tree-based GEMM accelerator that requires a total of 2N-1 compute units (N multipliers and N-1 adders) to process N data in parallel, far fewer than a systolic array. Furthermore, we propose both an algorithm that reduces external memory access by reusing data within the accelerator as much as possible and a pipelined hardware architecture that enables high throughput. The proposed accelerator uses floating-point units and was synthesized in a 40 nm CMOS process, achieving an area of 49831.59 µm² and a maximum frequency of 580 MHz.
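
The compute-unit counts quoted in the abstract can be checked with a small software model. The sketch below is illustrative only and is not the authors' hardware design: the function names `adder_tree_dot` and `systolic_units` are hypothetical, and the model ignores pipelining, data reuse, and the floating-point format. It multiplies N input pairs in one stage, sums the products pairwise in a tree, and counts the multipliers and adders used along the way.

```python
def adder_tree_dot(a, b):
    """Dot product of two length-N vectors using N multipliers and an adder tree.

    Returns (result, multipliers_used, adders_used). Hypothetical software
    model for illustration; it does not describe the paper's actual RTL.
    """
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")

    # Multiplier stage: N independent products, one multiplier each.
    level = [x * y for x, y in zip(a, b)]
    multipliers = len(level)
    adders = 0

    # Adder tree: each level sums adjacent pairs; an odd leftover passes through.
    while len(level) > 1:
        nxt = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        adders += len(nxt)
        if len(level) % 2:
            nxt.append(level[-1])
        level = nxt

    return level[0], multipliers, adders


def systolic_units(n):
    """Compute units in an N x N systolic array, as stated in the abstract."""
    return n * n


if __name__ == "__main__":
    n = 8
    result, mults, adds = adder_tree_dot(list(range(1, n + 1)), [1] * n)
    print("adder tree units:", mults + adds)       # N multipliers + (N-1) adders
    print("systolic array units:", systolic_units(n))
```

For any N, every adder reduces the number of partial sums by one, so the tree always uses N-1 adders and the total is 2N-1 units, compared with N² for the systolic array.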
