Title A Convolutional Layer Compression Method Combining Factorization and Pruning for Neuromorphic Systems
Authors 정필영(Pilyeong Jeong) ; 정재용(Jaeyong Chung)
DOI https://doi.org/10.5573/ieie.2019.56.3.42
Page pp.42-51
ISSN 2287-5026
Keywords Deep Learning ; Deep Neural Networks ; Neuromorphic Computing ; Sparse Networks ; Compressing Parameters
Abstract The size of deep learning models continues to increase because the predictive performance of deep learning usually scales well with model size when enough training data is provided. To improve the efficiency of large models in terms of computation time, memory footprint, storage size, etc., model compression techniques can be applied. Most existing techniques have been developed in the context of traditional processors such as GPUs and CPUs. This paper proposes a model compression technique for neuromorphic computing systems. The proposed method targets convolutional layers and combines two well-known methods, low-rank approximation and pruning, taking advantage of both. Our experimental results show that the proposed method can reduce the number of parameters in AlexNet by up to 10X without significant loss of accuracy.
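The combination the abstract describes can be illustrated with a minimal NumPy sketch: factorize a convolutional layer's weight matrix with a truncated SVD (low-rank approximation), then apply magnitude pruning to the resulting factors. The layer shape, target rank, and sparsity level below are illustrative assumptions, not the paper's actual configuration or algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical conv kernel: 64 output channels, 32 input channels, 3x3 spatial.
W = rng.standard_normal((64, 32, 3, 3))

# Flatten to 2-D (out_channels x in_channels*k*k) so SVD applies.
M = W.reshape(64, -1)                      # 64 x 288
U, s, Vt = np.linalg.svd(M, full_matrices=False)

rank = 16                                  # assumed target rank
A = U[:, :rank] * s[:rank]                 # 64 x 16 factor
B = Vt[:rank, :]                           # 16 x 288 factor

def prune(x, keep=0.5):
    """Zero out the smallest-magnitude entries, keeping a `keep` fraction."""
    thresh = np.quantile(np.abs(x), 1.0 - keep)
    return np.where(np.abs(x) >= thresh, x, 0.0)

A_sparse, B_sparse = prune(A), prune(B)

orig_params = M.size
kept_params = np.count_nonzero(A_sparse) + np.count_nonzero(B_sparse)
print(f"original parameters: {orig_params}")
print(f"nonzero after factorize+prune: {kept_params}")
print(f"compression ratio: {orig_params / kept_params:.1f}x")
```

Factorization alone caps the parameter count at `rank * (64 + 288)`; pruning the dense factors on top of that is what pushes the ratio further, which mirrors the abstract's claim that the two techniques are complementary.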