A Comparative Analysis of CNN and CRNN Models for Home Emergency Sound Detection
Abstract
The rise in single-person households creates a need for reliable, privacy-preserving home monitoring systems. This paper compares a Convolutional Neural Network (CNN) with a Convolutional Recurrent Neural Network (CRNN) for detecting domestic emergency sounds. The pipeline comprises the curation of a balanced dataset of normal and emergency sounds, data augmentation, and feature extraction using Mel-Frequency Cepstral Coefficients (MFCCs). Contrary to the theoretical expectation that CRNNs would excel at modeling temporal audio patterns, the CNN performed better in our experiments: it achieved 98% accuracy and a weighted F1-score of 0.98, compared with 95% accuracy for the CRNN. The CNN also converged faster, trained more stably, and generalized better. These findings indicate that for short-duration, spectrally distinct emergency sounds, the spatial feature extraction of a CNN is not only sufficient but more effective than the explicit temporal modeling of a CRNN. We conclude that, of the two architectures, the CNN is the better choice for developing efficient and reliable audio-based emergency detection systems for resource-constrained smart home environments.
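For readers unfamiliar with the weighted F1-score reported above, the following is a minimal sketch of how such a metric is commonly computed: the per-class F1 scores are averaged, each weighted by its class's share of the true labels. The function name `weighted_f1` and the toy labels are assumptions made for illustration only; they are not the study's code or data.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores (illustrative helper)."""
    support = Counter(y_true)  # number of true samples per class
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n / total) * f1  # weight by class share
    return score


# Hypothetical toy labels, NOT the paper's data: one normal clip is
# misclassified as an emergency.
y_true = ["normal"] * 5 + ["emergency"] * 5
y_pred = ["normal"] * 4 + ["emergency"] + ["emergency"] * 5

print(round(weighted_f1(y_true, y_pred), 3))  # 0.899
```

On a balanced dataset such as the one described, the weighted F1-score coincides with the macro-averaged F1-score, because every class carries the same weight.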