Title |
A Robust and Efficient Road-view CCTV Video Violence Detection Method |
DOI |
https://doi.org/10.5573/ieie.2023.60.6.73 |
Keywords |
Real-time surveillance; Violence detection; Deep learning; CCTV; Action recognition |
Abstract |
In recent years, the rapid increase in CCTV deployments has driven many companies and research institutes to improve public safety by managing these systems effectively. As a result, research is ongoing into detecting violence efficiently in constrained hardware environments such as CCTV edge modules. In this study, we propose a method for accurately detecting violence in real time in complex, performance-limited road-view CCTV environments by combining traditional algorithm-based violence detection approaches with deep learning techniques. The proposed method consists of three components: an object detection deep learning network that identifies human objects; an object tracking network that is robust to occlusions and assigns IDs to objects; and a violence detection algorithm that updates the interaction and action states of those IDs and applies them to make the final violence decision. For robustness and efficiency, we employ the YOLOv7-tiny model as the object detection network and ByteTrack as the object tracking network. We evaluate on the AIHub Anomalous Behavior CCTV dataset, in which multiple objects engage in various interactions. On this dataset, the proposed method outperforms both traditional approaches, such as Mahmoodi and Salajeghe's method based on optical flow intensity and directionality, and deep learning-based methods, such as those of Kang et al., Halder and Chatterjee, and Abdali and Tuma, which leverage spatial and temporal features of video data. The proposed method achieves a lower computational load (5.1 GFLOPs) and higher accuracy (77.6%). |
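The third stage described in the abstract, a rule-based step that updates per-ID interaction and action states on top of tracker output, can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify the rules, so the class name `ViolenceStateSketch`, the IoU-based interaction test, the center-displacement action test, and all thresholds are hypothetical. The detector (YOLOv7-tiny) and tracker (ByteTrack) are not called here; their output is assumed to arrive as a per-frame mapping from track ID to bounding box.

```python
from itertools import combinations
from typing import Dict, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical layout


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def center(b: Box) -> Tuple[float, float]:
    return ((b[0] + b[2]) / 2.0, (b[1] + b[3]) / 2.0)


class ViolenceStateSketch:
    """Hypothetical per-ID state update; thresholds are illustrative only."""

    def __init__(self, iou_thresh=0.1, motion_thresh=5.0, min_frames=3):
        self.iou_thresh = iou_thresh        # assumed "interaction" criterion
        self.motion_thresh = motion_thresh  # assumed "action" criterion (pixels)
        self.min_frames = min_frames        # consecutive frames before flagging
        self.prev: Dict[int, Box] = {}      # last seen box per track ID
        self.streak: Dict[Tuple[int, int], int] = {}  # per-pair frame count

    def update(self, tracks: Dict[int, Box]) -> bool:
        # Action state: an ID is "active" if its box center moved enough.
        active = set()
        for tid, box in tracks.items():
            if tid in self.prev:
                (cx, cy), (px, py) = center(box), center(self.prev[tid])
                if ((cx - px) ** 2 + (cy - py) ** 2) ** 0.5 >= self.motion_thresh:
                    active.add(tid)

        # Interaction state: a pair "interacts" if their boxes overlap enough.
        new_streak: Dict[Tuple[int, int], int] = {}
        for a, b in combinations(sorted(tracks), 2):
            interacting = iou(tracks[a], tracks[b]) >= self.iou_thresh
            if interacting and (a in active or b in active):
                new_streak[(a, b)] = self.streak.get((a, b), 0) + 1

        self.streak = new_streak  # pairs that stopped qualifying reset to zero
        self.prev = dict(tracks)
        return any(n >= self.min_frames for n in self.streak.values())
```

Keeping the decision logic as a small state update over tracker IDs, rather than a second neural network, is one way such a pipeline could stay within the compute budget of an edge module; the actual criteria used in the paper may differ.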