IEIE - Journal of the Institute of Electronics and Information Engineers


Title	An Improved SW-MSA using Vanishing Point Position Information for Monocular Depth Estimation
Authors	조용석(Yongseok Jo) ; 김나라(Nara Kim) ; 박호현(Ho-Hyun Park)
DOI	https://doi.org/10.5573/ieie.2024.61.3.68
Page	pp.68-77
ISSN	2287-5026
Keywords	Deep learning; Depth estimation; Monocular depth estimation; Swin transformer; Vanishing point
Abstract	This paper proposes a Swin Transformer-based model for monocular depth estimation that combines vanishing point detection with an improved SW-MSA. Given an input image, the model locates the vanishing point, identifies the image type from the vanishing point's location, and passes this information to the depth estimation model. To infer the vanishing point position, edges are first extracted from the image with the Canny edge detector, and only the line components are retained through the Hough transform. These lines are then extended, and the vanishing point is taken as the area where the most line intersections occur. Vanishing point positions are classified into three types, and the self-attention mechanism of SW-MSA varies according to the type. Experimental results show that the proposed model outperforms existing monocular depth estimation models. By estimating monocular depth through the geometric feature of vanishing points, the approach identifies intrinsic characteristics of an image rather than relying on training data. The paper contributes to the field of depth estimation by showing the potential of applying the concept of vanishing points from perspective to depth estimation.
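The intersection-voting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes line segments have already been obtained upstream (for example with OpenCV's `cv2.Canny` followed by `cv2.HoughLinesP`), and the function names, the grid-based voting, and the `cell` size are hypothetical choices made only for illustration.

```python
from collections import defaultdict
from itertools import combinations


def line_through(segment):
    """Homogeneous line (a, b, c), with a*x + b*y + c = 0, through a segment."""
    x1, y1, x2, y2 = segment
    return (y1 - y2, x2 - x1, x1 * y2 - x2 * y1)


def intersection(l1, l2, eps=1e-9):
    """Intersection of two extended lines, or None if they are (near) parallel."""
    a1, b1, c1 = l1
    a2, b2, c2 = l2
    w = a1 * b2 - a2 * b1
    if abs(w) < eps:
        return None
    return ((b1 * c2 - b2 * c1) / w, (c1 * a2 - c2 * a1) / w)


def estimate_vanishing_point(segments, width, height, cell=16):
    """Return the mean intersection inside the grid cell that collects the
    most pairwise line intersections, or None if no intersection falls
    inside the image. `cell` (pixels) is an assumed, illustrative value."""
    lines = [line_through(s) for s in segments]
    cells = defaultdict(list)
    for l1, l2 in combinations(lines, 2):
        p = intersection(l1, l2)
        if p is None:
            continue
        x, y = p
        if 0 <= x < width and 0 <= y < height:
            cells[(int(x // cell), int(y // cell))].append(p)
    if not cells:
        return None
    best = max(cells.values(), key=len)  # cell with the most intersections
    n = len(best)
    return (sum(p[0] for p in best) / n, sum(p[1] for p in best) / n)


# Example: four segments whose extensions meet at (320, 240), plus one outlier.
segments = [
    (0, 0, 160, 120),
    (640, 0, 480, 120),
    (0, 240, 100, 240),
    (320, 0, 320, 100),
    (0, 400, 640, 410),
]
vp = estimate_vanishing_point(segments, width=640, height=480)
```

Voting on a coarse grid treats the vanishing point as an area rather than a single pixel, matching the abstract's wording; the cell size trades robustness to noisy lines against localization accuracy.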

Copyright © IEIE All rights reserved

This is an Open-Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.