

Title Cross-modal Graphic Retrieval Optimization Method Based on Deep Learning and Hash Learning
Authors Lu Tan
DOI https://doi.org/10.5573/IEIESPC.2025.14.4.471
Pages 471-482
ISSN 2287-5255
Keywords Graph neural networks; Deep learning; Hash algorithms; Graph retrieval; Multimodality
Abstract This work proposes a novel approach to cross-modal graphic retrieval that leverages deep learning and hash learning. It addresses the limitations of current multimodal information retrieval methods in capturing fine-grained information within individual modalities. First, a deep-learning-based model is developed to extract features from the text and image modalities. To further refine modality-specific information, a cross-modal hashing retrieval model incorporating graphic features is proposed; it uses attention mechanisms and adversarial networks to optimize performance. Experimental results demonstrate the model's effectiveness: it achieves an average recall of 77.8% in graphic feature extraction, with top classification precisions of 0.637 and 0.712 on two separate datasets. Furthermore, the cross-modal hash retrieval model achieves a mean average precision (mAP) of 0.833 on the image-to-text retrieval task with 64-bit hash codes. These findings indicate that the proposed model surpasses comparable models on the precision-recall curve. The attention, inter-modal adversarial, and intra-modal adversarial modules each contribute significantly to the model's performance on image and text retrieval, with the attention module playing the largest role, followed by the inter-modal adversarial module. Consequently, this study's model is well suited to cross-modal graphic retrieval tasks.
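To make the abstract's pipeline concrete, the sketch below shows one plausible shape for a cross-modal hashing network of the kind described: per-modality projections, a cross-attention step, tanh-relaxed hash heads, and a discriminator for adversarial alignment. All dimensions, the shared attention module, and the discriminator head are illustrative assumptions, not the authors' implementation. The second function shows how mean average precision over Hamming ranking, the metric the abstract reports for 64-bit codes, is conventionally computed.

```python
# Minimal sketch of a cross-modal hashing network with attention and an
# adversarial discriminator. Dimensions and module choices are assumptions.
import torch
import torch.nn as nn
import numpy as np

class HashHead(nn.Module):
    """Maps a modality feature to a K-bit code via tanh relaxation."""
    def __init__(self, in_dim: int, n_bits: int = 64):
        super().__init__()
        self.fc = nn.Linear(in_dim, n_bits)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # tanh gives continuous codes for training; sign() binarizes at test time
        return torch.tanh(self.fc(x))

class CrossModalHasher(nn.Module):
    def __init__(self, img_dim=2048, txt_dim=768, hid=512, n_bits=64):
        super().__init__()
        self.img_proj = nn.Linear(img_dim, hid)
        self.txt_proj = nn.Linear(txt_dim, hid)
        # attention lets each modality attend to the other before hashing
        self.attn = nn.MultiheadAttention(hid, num_heads=4, batch_first=True)
        self.img_hash = HashHead(hid, n_bits)
        self.txt_hash = HashHead(hid, n_bits)
        # discriminator for adversarial alignment of the two code spaces
        self.disc = nn.Sequential(nn.Linear(n_bits, 128), nn.ReLU(),
                                  nn.Linear(128, 1))

    def forward(self, img_feat, txt_feat):
        v = self.img_proj(img_feat).unsqueeze(1)   # (B, 1, hid)
        t = self.txt_proj(txt_feat).unsqueeze(1)
        v_att, _ = self.attn(v, t, t)              # image attends to text
        t_att, _ = self.attn(t, v, v)              # text attends to image
        h_img = self.img_hash(v_att.squeeze(1))
        h_txt = self.txt_hash(t_att.squeeze(1))
        return h_img, h_txt

def hamming_map(query_codes, db_codes, query_labels, db_labels):
    """mAP for Hamming-ranking retrieval.
    Codes are {-1, +1} arrays of shape (n, n_bits); labels are ints."""
    aps = []
    for q, ql in zip(query_codes, query_labels):
        # Hamming distance for +/-1 codes: (n_bits - <q, db>) / 2
        dist = (db_codes.shape[1] - db_codes @ q) / 2
        order = np.argsort(dist)                    # nearest first
        rel = (db_labels[order] == ql).astype(float)
        if rel.sum() == 0:
            continue
        prec_at_k = np.cumsum(rel) / (np.arange(len(rel)) + 1)
        aps.append((prec_at_k * rel).sum() / rel.sum())
    return float(np.mean(aps))
```

At retrieval time, codes would be binarized with `torch.sign` and the database ranked by Hamming distance, as in `hamming_map`; the adversarial and attention pieces matter only during training, which is consistent with the abstract's ablation ordering.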