Title Slim Multi-resolutional Frequency Dynamic Convolution for Sound Event Detection
Authors Editorial Department (Editor)
DOI https://doi.org/10.5573/ieie.2025.62.7.47
Page pp.47-55
ISSN 2287-5026
Keywords Sound event detection; Convolutional neural network; Frequency-aware convolution; Data augmentation; Median filter
Abstract This study aims to improve acoustic event detection performance by comparing and analyzing the impact of various acoustic models, data augmentation techniques, and post-processing methods. It also proposes a slim multi-resolution frequency dynamic convolution (SMRFDC) model for acoustic event detection. The proposed model integrates Frequency Dynamic Convolution (FDC) and Multi-Resolution Kernel Convolution (MRKC) and incorporates model-lightweighting strategies to prevent overfitting. In experiments based on the DCASE2023 Task 4 baseline, the SMRFDC model achieved the best performance among existing models, with an event-based F1 score of 47.7% and an average PSDS of 52.06%. Compared to the baseline FDC model, the SMRFDC model improved the event-based F1 score by 2.9% and the average PSDS by 1.2%. When data augmentation and event-dependent post-processing were applied during model training, performance improved across all models. Although which of SMRFDC and FDC performed better varied with the type of acoustic event, the two were comparable in class-average event-based F1 score. In conclusion, the proposed SMRFDC model demonstrates the best overall performance in acoustic event detection and maintains robust performance without additional optimization of data augmentation and post-processing. Future research will focus on developing a more robust acoustic event detection system for real-world environments.