Title An Improved SW-MSA using Vanishing Point Position Information for Monocular Depth Estimation
Authors 조용석(Yongseok Jo) ; 김나라(Nara Kim) ; 박호현(Ho-Hyun Park)
DOI https://doi.org/10.5573/ieie.2024.61.3.68
Page pp.68-77
ISSN 2287-5026
Keywords Deep learning; Depth estimation; Monocular depth estimation; Swin transformer; Vanishing point
Abstract This paper proposes a Swin Transformer-based model for monocular depth estimation that combines vanishing point detection with an improved SW-MSA. Given an input image, the model locates the vanishing point, classifies the image by the position of the vanishing point, and passes this information to the depth estimation model. To infer the vanishing point position, edges are first extracted from the image with the Canny edge detector, and only the straight-line components are retained through the Hough transform. These lines are then extended, and the vanishing point is taken as the area where the most line intersections occur. Vanishing point positions are classified into three types, and the self-attention mechanism of SW-MSA varies according to the type. Experimental results show that the proposed model performs better than existing monocular depth estimation models. By estimating monocular depth from the geometric feature of vanishing points, the approach identifies intrinsic characteristics of an image without relying on training data. The paper contributes to the field of depth estimation by demonstrating the potential of applying the concept of vanishing points from perspective to depth estimation.
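
The intersection-voting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the line segments have already been obtained (for example with OpenCV's `cv2.Canny` followed by `cv2.HoughLinesP`), and the function names, the grid-cell voting scheme, and the `cell_size` parameter are hypothetical choices made for illustration only.

```python
import math
from collections import Counter
from itertools import combinations


def line_intersection(seg_a, seg_b, eps=1e-9):
    """Intersect the infinite lines through two segments (x1, y1, x2, y2).

    Returns None when the lines are (nearly) parallel.
    """
    x1, y1, x2, y2 = seg_a
    x3, y3, x4, y4 = seg_b
    denom = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    if abs(denom) < eps:
        return None
    det_a = x1 * y2 - y1 * x2
    det_b = x3 * y4 - y3 * x4
    px = (det_a * (x3 - x4) - (x1 - x2) * det_b) / denom
    py = (det_a * (y3 - y4) - (y1 - y2) * det_b) / denom
    return px, py


def estimate_vanishing_point(segments, cell_size=20.0):
    """Vote pairwise line intersections into square grid cells.

    Returns the mean of the intersections in the most-voted cell,
    or None if no pair of lines intersects. The grid voting is an
    assumed stand-in for "the area where the most intersections occur".
    """
    votes = Counter()
    points_in_cell = {}
    for seg_a, seg_b in combinations(segments, 2):
        point = line_intersection(seg_a, seg_b)
        if point is None:
            continue
        cell = (math.floor(point[0] / cell_size),
                math.floor(point[1] / cell_size))
        votes[cell] += 1
        points_in_cell.setdefault(cell, []).append(point)
    if not votes:
        return None
    best_cell, _ = votes.most_common(1)[0]
    pts = points_in_cell[best_cell]
    mean_x = sum(x for x, _ in pts) / len(pts)
    mean_y = sum(y for _, y in pts) / len(pts)
    return mean_x, mean_y


# Hypothetical input: four segments whose extensions meet near (110, 55),
# plus one unrelated horizontal segment acting as an outlier.
segments = [
    (0, 0, 50, 25),
    (0, 110, 50, 85),
    (200, 145, 150, 95),
    (110, 200, 110, 150),
    (0, 10, 10, 10),
]
vanishing_point = estimate_vanishing_point(segments)
```

Voting over a coarse grid keeps the estimate robust to a few stray lines, since outlier intersections scatter across many cells while the converging lines concentrate their votes in one. How the resulting position maps to the three types mentioned in the abstract is not specified there, so it is left out of this sketch.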