IEIE - Journal of the Institute of Electronics and Information Engineers


Title	Efficient Prompt Fusion for RGB-D Semantic Segmentation
Authors	Editorial Department (Editor)
DOI	https://doi.org/10.5573/ieie.2025.62.7.56
Page	pp.56-62
ISSN	2287-5026
Keywords	Multimodality; RGB-D; Segmentation; Prompt learning
Abstract	RGB-D semantic segmentation incorporates depth data to address scene-understanding challenges that are difficult to solve with RGB information alone. This study applies prompt learning to RGB-D semantic segmentation, improving performance by adding only a small number of parameters while leaving the original model structure unchanged. In particular, the post-fusion prompt method is a simple yet effective approach that minimizes information loss and maximizes interaction between the two modalities. Experiments on the NYUv2 and SUN RGB-D datasets validated the superiority of the post-fusion approach over the pre-fusion method. On the NYUv2 dataset, our method outperformed MultiMAE (Multimodal Multitask Masked Autoencoders), a representative multimodal learning approach, by approximately 2.2% in mIoU. These findings suggest new possibilities for prompt learning in the fusion of RGB and depth information.
