Title |
DCM: Bins Optimization Module for Elaborate Depth Estimation on Single Image using Monocular Camera |
Authors |
이창엽(Chang Yeop Lee) ; 김동주(Dong Ju Kim) ; 서영주(Young Joo Suh) ; 황도경(Do Kyung Hwang) |
DOI |
https://doi.org/10.5573/ieie.2024.61.10.139 |
Keywords |
Monocular depth estimation; Dense prediction; Group convolution; Deep learning; Computer vision |
Abstract |
This paper proposes a new optimization module for more precise depth estimation in Monocular Depth Estimation (MDE), the task of predicting depth from a single image captured by a monocular camera. Traditionally, depth has been estimated either by a regression approach, in which a single image is fed into the network and the depth image is predicted directly, or by a classification approach, which pre-defines discretized depth value classes (Bins) and estimates the probability of each bin for every pixel. Recently, research has moved beyond conventional classification methods towards learnable Bins. The key to these approaches is the Bins Optimization Module, which estimates the optimal Bins; various such modules have been proposed and have achieved significant advances. However, existing modules are limited in how well they can estimate optimal Bins: they either use the output representations of a network layer only in a limited way (e.g., Splitter) or estimate Bins indirectly through a tool (e.g., Attractor) rather than having the output representations estimate Bins directly. To overcome these limitations, we propose the Deformable Cardinality Module (DCM), which performs 1:1 matching between output representations and Bins. With this 1:1 matching, the layer's output representations are matched and transformed independently for each bin, and the representations matched to each bin are estimated in the optimal direction, which improves performance on MDE tasks. In addition, more precise Bins can be estimated by using the Split-Transform-Conversion-Merge (STCM) strategy, which was recently proposed to improve MDE performance. Finally, comparative experiments confirm quantitative improvements across various evaluation metrics as well as visually sound depth estimation. |
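The classification approach mentioned in the abstract can be illustrated with a minimal sketch. It is not the DCM module itself, only the general bins-based formulation under stated assumptions: per-pixel bin probabilities come from a softmax, and the final depth is the probability-weighted sum of bin centers, a readout that is common in bins-based methods but is not specified in the abstract. The function names `bin_centers` and `pixel_depth`, the depth range, and the example widths are all hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of per-bin logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def bin_centers(widths, d_min, d_max):
    """Convert relative bin widths into bin-center depths in [d_min, d_max].

    Assumption: widths are positive and are normalized here so that the
    bins exactly tile the depth range.
    """
    total = sum(widths)
    edges = [d_min]
    for w in widths:
        edges.append(edges[-1] + (d_max - d_min) * w / total)
    return [(lo + hi) / 2 for lo, hi in zip(edges[:-1], edges[1:])]


def pixel_depth(logits, centers):
    """Depth of one pixel as the expectation of bin centers under softmax(logits)."""
    probs = softmax(logits)
    return sum(p * c for p, c in zip(probs, centers))


# Hypothetical example: three bins over a 0-8 m range, the last bin twice as wide.
centers = bin_centers([1.0, 1.0, 2.0], d_min=0.0, d_max=8.0)
# Equal logits give uniform probabilities, so the depth is the mean of the centers.
depth = pixel_depth([0.0, 0.0, 0.0], centers)
```

In a learnable-Bins setting, the `widths` would be predicted by the network per image instead of being fixed in advance; how DCM produces them through 1:1 matching is described in the paper, not in this sketch.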