Title Design and Implementation of High-speed DTW Accelerator for Time-series Data Classification
Authors 최원영(Wonyoung Choi) ; 조재찬(Jaechan Cho) ; 정윤호(Yunho Jung)
DOI https://doi.org/10.5573/ieie.2021.58.3.51
Page pp.51-58
ISSN 2287-5026
Keywords computational complexity; dynamic time warping; similarity measure; time series data
Abstract In this paper, we propose a hardware architecture that reduces the computational complexity of dynamic time warping (DTW), an effective algorithm for measuring the similarity between time series data, and present its implementation and experimental results. To handle the time-dependent characteristics of two time series, DTW aligns them along the time axis and finds the optimal alignment among all possible alignments, which allows the similarity to be measured accurately. Various methods have been proposed to reduce the high computational complexity of DTW, and recent studies pursue further speed-ups by combining existing complexity-reduction methods. Hardware implementation is also being studied as a way to improve speed further. The best hardware architecture proposed so far for accelerating DTW complies with the DTW arithmetic rule, under which each element of the DTW calculation matrix depends on the values of previously computed neighboring elements, and reduces execution time by applying an optimized calculation sequence; however, some applications with large constraints still require long execution times. Therefore, in this paper, we propose a hardware architecture that further reduces execution time by modifying the existing DTW arithmetic rule and compensating for the results of the modified arithmetic with an iterative computation technique. In experiments with an FPGA implementation, the proposed DTW accelerator reduced execution time by about 61.3% on average compared to the existing DTW accelerator, and it operates at 86 MHz using about 3,584 slices and 1,090 bits of memory.
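
The dependency described in the abstract, where each element of the DTW calculation matrix is derived from previously computed neighboring elements, can be illustrated with a minimal software sketch of the conventional (textbook) DTW recurrence. This is not the modified arithmetic rule or the iterative compensation proposed in the paper, whose details the abstract does not give. The function name `dtw_distance` and the choice of absolute difference as the local cost are assumptions made for illustration only.

```python
def dtw_distance(a, b):
    """Conventional DTW distance between two numeric sequences.

    Illustrative sketch only: the local cost |a[i] - b[j]| and the
    function name are assumptions, not taken from the paper.
    """
    n, m = len(a), len(b)
    inf = float("inf")

    # D[i][j] holds the cost of the best alignment of a[:i] with b[:j].
    # Row 0 and column 0 are padding so the recurrence needs no edge cases.
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Each element depends on three already-computed neighbors:
            # the one above, the one to the left, and the diagonal one.
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])

    return D[n][m]


if __name__ == "__main__":
    # A sequence and a time-stretched copy of it align with zero cost.
    print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

Because every element waits on its upper, left, and diagonal neighbors, the full matrix costs O(nm) operations, and a hardware design that keeps this rule can only run independent elements in parallel. This is the data dependency that the abstract says the proposed architecture modifies.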