Title |
Cubic Spline Interpolation Square-root Compute Unit for Cost-efficient Batch-normalization Calculation of Accurate DNN Training |
DOI |
https://doi.org/10.5573/ieie.2025.62.2.19 |
Keywords |
Square-root; Brain floating point; Batch normalization; Standard deviation calculation |
Abstract |
This paper proposes a Cubic Spline Interpolation based Square-Root (CSISR) compute unit for the standard deviation computation required in the batch-normalization layers of Deep Neural Network (DNN) training. The proposed non-linear square-root computation method based on cubic spline interpolation reduces resource consumption and achieves higher accuracy than linear interpolation methods. In particular, it reduces hardware area by eliminating the need for lookup tables, which often occupy large on-chip memory. We implemented the proposed CSISR unit in Verilog HDL using the Brain floating point 16-bit (bfloat16) data format and applied it to the batch-normalization layers of the YOLOv2 object detection DNN. The training results show accuracy similar to that of the original PyTorch model trained on a GPU, with a low average error rate of 0.1915%. We also implemented the proposed bfloat16-based CSISR compute unit in a TSMC 180 nm process. The implementation occupies a total chip area of 889.2 μm², which is 86% smaller than the previous circuit [15] (SRT algorithm, 65 nm process, 6450.84 μm² area, 0.764 mW power), and consumes 0.1572 mW of power, a 79% reduction compared with [15]. |
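The following is a minimal software sketch of the general idea behind spline-based square-root evaluation described above: split the input into mantissa and exponent, approximate sqrt over the normalized mantissa range with a piecewise cubic spline, and halve the exponent. The knot count, interval choice, and use of SciPy's CubicSpline are illustrative assumptions only; they are not taken from the paper's CSISR hardware design or its bfloat16 datapath.

    # Illustrative sketch only: sqrt(x) via mantissa/exponent split plus a cubic spline.
    # Knot placement and count are assumptions for demonstration, not the paper's design.
    import math
    import numpy as np
    from scipy.interpolate import CubicSpline

    # One spline for sqrt(m) on m in [1, 4): this range covers both even and odd
    # exponents after normalization, so a single coefficient set suffices.
    knots = np.linspace(1.0, 4.0, 9)          # 9 knots -> 8 cubic segments (assumed)
    spline = CubicSpline(knots, np.sqrt(knots))

    def spline_sqrt(x: float) -> float:
        """Approximate sqrt(x) for x > 0 using the cubic spline over the mantissa."""
        m, e = math.frexp(x)                  # x = m * 2**e, with m in [0.5, 1)
        m *= 2.0; e -= 1                      # renormalize so m is in [1, 2)
        if e % 2:                             # make the exponent even so it halves exactly
            m *= 2.0; e -= 1                  # now m is in [1, 4)
        return float(spline(m)) * 2.0 ** (e // 2)

    if __name__ == "__main__":
        for x in (0.07, 1.0, 2.0, 123.456, 1e6):
            approx, exact = spline_sqrt(x), math.sqrt(x)
            print(f"{x:>10}: spline={approx:.6f} exact={exact:.6f} "
                  f"rel.err={abs(approx - exact) / exact:.2e}")

In a hardware realization, the spline segment coefficients would be fixed constants selected by a few mantissa bits rather than evaluated through a library call, which is how a spline-based unit can avoid a large lookup table while still achieving sub-percent error. |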