Title |
Speculative Activation Pruning for Efficient LSTM Inference Processing |
Authors |
이준형 (Jun-Hyung Lee); 오선희 (Seon-Hee Oh); 김태환 (Tae-Hwan Kim)
DOI |
https://doi.org/10.5573/ieie.2024.61.3.60 |
Keywords |
Deep learning; Long short-term memory; Inference; Activation pruning; Computation amount |
Abstract |
This paper presents a novel method for efficient processing of long short-term memory (LSTM) inference, called speculative activation pruning. The proposed method predicts the activation function results and prunes the operations that become redundant given those results. The number of operations that can be pruned is increased by employing non-zero thresholds in the prediction. The proposed method reduces the operations by 11.44% to 37.37% and 1.07% to 38.52% in LSTM inference for the sequential MNIST and IMDB sentiment analysis tasks, respectively, while the classification accuracy is maintained within 3.41% and 2.54% of the baseline, respectively.
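To make the idea in the abstract concrete, the following is a minimal NumPy sketch under one possible interpretation of the method: gate pre-activations that fall beyond a (possibly non-zero) threshold are speculated to saturate, and the element-wise operations that would consume those gate values are skipped. The function names, the threshold parameter theta, and the exact pruning criterion are illustrative assumptions, not the authors' formulation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step_pruned(x, h_prev, c_prev, W, U, b, theta=2.0):
    """One LSTM step with speculative pruning of gate-dependent operations.

    Pre-activations below -theta are speculated to yield a gate value of 0,
    so the multiplication (and any tanh) that consumes that gate value is
    skipped. theta = 0 recovers the plain zero-threshold case.
    """
    n = h_prev.shape[0]
    z = W @ x + U @ h_prev + b              # all four gate pre-activations
    zi, zf, zg, zo = np.split(z, 4)

    pruned = 0
    c = np.empty(n)
    for j in range(n):
        # Forget-gate term: skip f * c_prev when f is speculated to be 0.
        if zf[j] < -theta:
            f_term = 0.0
            pruned += 1
        else:
            f_term = sigmoid(zf[j]) * c_prev[j]
        # Input-gate term: skip i * tanh(g) when i is speculated to be 0.
        if zi[j] < -theta:
            i_term = 0.0
            pruned += 1
        else:
            i_term = sigmoid(zi[j]) * np.tanh(zg[j])
        c[j] = f_term + i_term

    h = np.empty(n)
    for j in range(n):
        # Output term: skip o * tanh(c) when o is speculated to be 0.
        if zo[j] < -theta:
            h[j] = 0.0
            pruned += 1
        else:
            h[j] = sigmoid(zo[j]) * np.tanh(c[j])
    return h, c, pruned

# Toy usage with random weights (hidden size 8, input size 4).
rng = np.random.default_rng(0)
H, D = 8, 4
W = rng.standard_normal((4 * H, D))
U = rng.standard_normal((4 * H, H))
b = rng.standard_normal(4 * H)
h, c, pruned = lstm_step_pruned(rng.standard_normal(D), np.zeros(H), np.zeros(H), W, U, b)
print("pruned operations in this step:", pruned)
```

In this sketch a larger theta prunes more operations at the cost of more speculation error, which mirrors the accuracy-versus-computation trade-off reported in the abstract.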