| Title |
Amodal Instance Segmentation via Dual Decoders with Positional Relation Encoding |
| Authors |
조은남(Eunnam Cho) ; 김승욱(Seung Wook Kim) ; 홍성은(Sungeun Hong) |
| DOI |
https://doi.org/10.5573/ieie.2026.63.2.124 |
| Keywords |
Amodal instance segmentation; Occlusion; Relation encoding |
| Abstract |
Amodal instance segmentation aims to reconstruct not only the visible regions of objects but also their occluded parts, predicting each object's full shape for a more comprehensive understanding of the scene. While humans can naturally infer the complete form of partially occluded objects, existing computer vision models often struggle under occlusion, relying primarily on visible cues. In this paper, we propose an extended DETR-based transformer framework that incorporates a dual-decoder design to infer modal and amodal regions separately, and encodes physical spatial relationships between objects as positional biases injected into the self-attention operations. This design allows the model to generate more coherent and realistic amodal masks by considering not only appearance features but also global spatial context. Experimental results on the KINS, D2SA, and COCOA-cls benchmarks demonstrate that our method outperforms previous state-of-the-art methods, particularly under challenging occlusion scenarios. We believe our approach can serve as a strong foundation for advancing amodal segmentation in future complex scene understanding tasks. |
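| Note |
The abstract's core mechanism, adding pairwise spatial-relation scores as a bias to the self-attention logits, can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, shapes, and the random `relation_bias` stand in for whatever relation encoding the authors actually use (the paper is the authority on how the bias is computed from object positions).

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def biased_self_attention(queries, keys, values, relation_bias):
    """Scaled dot-product self-attention with an additive positional bias.

    relation_bias[i, j] is a hypothetical spatial-relation score between
    object queries i and j (e.g. derived from box-center offsets); it is
    added to the attention logits before the softmax, so the attention
    pattern reflects spatial layout as well as appearance similarity.
    """
    d = queries.shape[-1]
    logits = queries @ keys.T / np.sqrt(d) + relation_bias
    weights = softmax(logits, axis=-1)
    return weights @ values, weights

# Toy example: 4 object queries with 8-dim features.
rng = np.random.default_rng(0)
q = rng.standard_normal((4, 8))
bias = rng.standard_normal((4, 4))  # placeholder for a learned relation encoding
out, attn = biased_self_attention(q, q, q, bias)
```

In practice such a bias would be produced by a small network over pairwise geometric features and added inside each decoder self-attention layer; the sketch only shows where the bias enters the computation. |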