Title Performance Analysis of Neural Processing Units with Emerging Memory Technologies
Authors 최상운(Sangun Choi) ; 박성준(Seongjun Park) ; 박재용(Jaeyong Park) ; 홍석인(Seokin Hong) ; 윤명국(Myung Kuk Yoon) ; 오윤호(Yunho Oh)
DOI https://doi.org/10.5573/ieie.2023.60.7.30
Page pp.30-39
ISSN 2287-5026
Keywords Computer architecture; Neural processing unit; Deep neural network; Memory system
Abstract Recent deep neural networks (DNNs) contain an increasing number of parameters. To supply these parameters to neural processing units (NPUs), off-chip memory requires larger capacity and higher bandwidth. Conventional NPUs employ DRAM as the off-chip memory, but DRAM density cannot continue to scale sustainably. To overcome this challenge, prior work has investigated emerging memory technologies as alternatives to DRAM. However, emerging memory technologies often exhibit lower bandwidth and longer latency than DRAM. As such, designing neural network acceleration systems with NPUs and emerging memory technologies requires a detailed design space exploration in terms of performance and area. This paper evaluates performance per area with various memory technologies while running neural network inference workloads.