Title Mitigating Korean Semantic Ambiguity and Improving Classification Performance via Cross-attention-based Fusion of English Multi-representations
Authors 이태윤(Tae-Yoon Lee) ; 이승호(Seung-Ho Lee)
DOI https://doi.org/10.5573/ieie.2026.63.4.104
Page pp.104-109
ISSN 2287-5026
Keywords Agglutinative language; Supervised contrastive learning; Multilingual representation; Morphological awareness; Natural language processing
Abstract In this paper, we propose a method for mitigating Korean semantic ambiguity and improving classification performance via cross-attention-based fusion of English multi-representations. Despite advances in NLP, a performance gap persists for Korean because its complex ending inflections and structural ambiguity are poorly handled by English-centric models. To address this, our approach uses a machine-translated parallel corpus and reinforces fine-grained morphological detail by extracting Jamo-level and character-level N-gram TF-IDF as auxiliary features. These heterogeneous features are fused via cross-attention, leveraging English's clear sentence constituents as cues that complement Korean's semantic uncertainty. Furthermore, combining Supervised Contrastive Loss with Cross-Entropy Loss during training improves model robustness by increasing vector cohesion under input variations. Experimental results on the K-MHaS dataset show that the proposed model achieves an F1-weighted score of 0.9317 and an accuracy of 0.9212, a significant improvement over existing monolingual models.
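One of the auxiliary signals mentioned in the abstract, character-level N-gram TF-IDF, can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the bigram window, and the smoothed-IDF variant are all choices made here for clarity.

```python
import math
from collections import Counter

def char_ngrams(text, n=2):
    """Extract character-level n-grams from a string (whitespace kept)."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def tfidf_vectors(corpus, n=2):
    """Compute char n-gram TF-IDF weights for each document in a corpus.

    Returns (sorted vocabulary, list of {ngram: weight} dicts).
    Uses a smoothed IDF, log((1 + N) / (1 + df)) + 1, so that n-grams
    appearing in every document still get a nonzero weight.
    """
    docs = [Counter(char_ngrams(doc, n)) for doc in corpus]
    vocab = sorted({g for d in docs for g in d})
    num_docs = len(corpus)
    # document frequency: in how many documents each n-gram occurs
    df = {g: sum(1 for d in docs if g in d) for g in vocab}
    vectors = []
    for d in docs:
        total = sum(d.values())
        vec = {}
        for g, count in d.items():
            tf = count / total
            idf = math.log((1 + num_docs) / (1 + df[g])) + 1
            vec[g] = tf * idf
        vectors.append(vec)
    return vocab, vectors
```

In the paper's setting these sparse weights would be computed over Korean text (alongside Jamo-level features) and fed to the fusion module as auxiliary inputs; here they rank an n-gram unique to one document above one shared by all documents.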