Title [REGULAR PAPER] A 9.52 ms Latency, and Low-power Streaming Depth-estimation Processor with Shifter-based Pipelined Architecture for Smart Mobile Devices
Authors Sungpill Choi; Kyuho Jason Lee; Youngwoo Kim; Hoi-Jun Yoo
DOI https://doi.org/10.5573/JSTS.2020.20.3.255
Page pp.255-270
ISSN 1598-1657
Keywords Energy-efficient digital circuit; depth-estimation; stereo vision; low power; high throughput ASIC; image processing; memory-efficient design
Abstract The 3D hand gesture interface (HGI) for virtual reality and mixed reality on smart mobile devices depends strongly on robust depth estimation with low latency and low power consumption. However, conventional depth-estimation hardware, such as active depth sensors and stereo matching accelerators, cannot realize an always-on, natural 3D HGI on mobile platforms: active depth sensors consume too much power, and stereo matching accelerators require heavy computation and massive external memory bandwidth. To overcome these limits, we propose a depth-estimation processor that realizes an always-on, natural 3D HGI through algorithm and hardware co-optimization. The processor features: 1) shifter-based adaptive support weight aggregation, which replaces complex floating-point operations with integer operations to reduce power and bandwidth by 92.2% and 69.1%, respectively; 2) a line-streaming 7-stage pipeline architecture with aggregation pipeline reordering optimization to achieve 94% utilization and a 43.9% memory reduction; and 3) shifting register-based pipeline buffer optimization to reduce area by 29.8%. The proposed depth-estimation processor realizes a real-time 3D HGI with 9.52 ms of latency under QVGA stereo inputs. It reduces external memory bandwidth to 18.93 MB/s with 15.56 mW power and 2.8 mm² area, which are 4.1x and 6.9x more efficient than the state of the art [9, 10], respectively.
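The idea behind feature 1), replacing floating-point support weights with integer shifts, can be illustrated with a minimal sketch. It assumes each support weight is quantized to a power of two, so that weighting a matching cost becomes a right shift. The function names and the parameters `gamma_shift` and `max_shift` are hypothetical and chosen for illustration only; the abstract does not specify the processor's actual weight function or bit widths.

```python
def shift_weight_exponent(center, neighbor, gamma_shift=3, max_shift=7):
    """Return s such that the support weight is approximated by 2**-s.

    Assumption: the weight decays with the absolute intensity difference,
    and the decay rate is set by a shift (gamma_shift) instead of a
    floating-point exponential.
    """
    diff = abs(center - neighbor)
    return min(diff >> gamma_shift, max_shift)


def aggregate_cost(intensities, costs, center_idx, gamma_shift=3, max_shift=7):
    """Aggregate matching costs over a support window using only integer ops.

    Returns (numerator, denominator) in fixed point, both scaled by
    2**max_shift. The weighted-average cost is numerator / denominator;
    the division is left to the caller.
    """
    center = intensities[center_idx]
    numerator = 0
    denominator = 0
    for intensity, cost in zip(intensities, costs):
        s = shift_weight_exponent(center, intensity, gamma_shift, max_shift)
        # cost * 2**-s, kept in fixed point with max_shift fractional bits
        numerator += (cost << max_shift) >> s
        denominator += (1 << max_shift) >> s
    return numerator, denominator


# Hypothetical 4-pixel support window: intensities and per-pixel matching costs.
num, den = aggregate_cost([100, 100, 108, 200], [10, 20, 30, 40], center_idx=0)
```

In this sketch, pixels whose intensity is close to the center pixel keep nearly full weight, while dissimilar pixels are shifted toward zero, and no multiplier or floating-point unit is involved.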