Title
Utilizing Context-aware Large Language Models for Speech Recognition Error Correction
Authors
임혜림(Hye-Lim Lim); 오창한(Changhan Oh); 강점자(Jeom-Ja Kang); 송화전(Hwa-Jeon Song); 박기영(Kiyoung Park)
DOI
https://doi.org/10.5573/ieie.2025.62.3.107
Keywords
ASR; Error correction; Context; LLM; LoRA
Abstract
The accuracy of speech recognition in multi-speaker meeting environments is crucial for understanding and summarizing the content of meetings. However, speech recognition systems often suffer performance degradation from acoustic and linguistic factors such as background noise and domain-specific terminology. To address these challenges, this study proposes a novel approach that enhances speech recognition performance by leveraging the rich linguistic knowledge of large language models (LLMs) to post-process recognition outputs containing errors, rather than relying solely on traditional methods optimized for specific domains. Specifically, we use the erroneous recognized text together with prior contextual information as input to the LLM, enabling the model to better predict what was actually spoken. Applying this approach to real-world speech data and speech recognition systems yielded a significant reduction in errors. Notably, experiments on an English lecture corpus showed an error reduction rate of up to 16.1% compared to baseline models. Additionally, this paper presents real-world examples where the LLM effectively corrected speech recognition errors in context.
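The core idea described in the abstract, feeding prior contextual sentences together with the erroneous ASR hypothesis to an LLM, can be sketched as a simple prompt-construction step. The function name, prompt wording, and context-window size below are illustrative assumptions for exposition, not the authors' actual template:

```python
def build_correction_prompt(context_sentences, asr_hypothesis, max_context=5):
    """Assemble a context-aware error-correction prompt for an LLM.

    context_sentences: previously recognized (or corrected) sentences
    asr_hypothesis: the current ASR output, possibly containing errors
    max_context: how many trailing context sentences to include (assumed value)
    """
    # Keep only the most recent context to stay within the model's input budget.
    context = "\n".join(context_sentences[-max_context:])
    return (
        "Prior context from the meeting transcript:\n"
        f"{context}\n\n"
        "The following ASR hypothesis may contain recognition errors. "
        "Using the context above, rewrite it as the sentence most likely spoken:\n"
        f"{asr_hypothesis}"
    )


# Example usage: the prompt would then be sent to a (possibly LoRA-fine-tuned) LLM.
prompt = build_correction_prompt(
    ["Today we discuss gradient descent.", "The learning rate controls step size."],
    "the learning rate also affects convergence of grade and descent",
)
```

The returned string bundles recent context and the hypothesis in one prompt, which is one straightforward way to give the LLM the contextual cues the paper argues are needed for resolving domain-specific terms.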