Deep learning approaches for speech emotion recognition
Abstract
This thesis addresses the challenge of speech emotion recognition, focusing on continuous emotion estimation using deep learning techniques. Emotion detection plays a vital role in various domains, including healthcare, human-computer interaction, and affective computing. However, traditional approaches often struggle to recognize emotions accurately in the presence of noise and reverberation, which limits their diagnostic accuracy and applicability. To overcome these limitations, our study proposes an approach that integrates deep-learning-based speech enhancement as a preprocessing step. Our experiments use the AVEC 2018 challenge datasets, which comprise audio/video recordings from diverse cultural backgrounds.

The experimental pipeline involves several key components, including feature extraction, model training, and data/speech enhancement techniques. We employ LSTM (Long Short-Term Memory) models for temporal dependency modeling and investigate the effect of different hyperparameters, such as batch size, learning rate, and optimizer choice, on emotion recognition performance. We also evaluate the effectiveness of the speech enhancement methods. The results of our experiments demonstrate promising performance improvements when leveraging data/speech enhancement techniques such as single Spectral Enhancement (SSE) and the Speech Enhancement Generative Adversarial Network (SEGAN), which show potential for capturing complex temporal relationships and contextual information, leading to enhanced emotion recognition capabilities.

Overall, this research contributes to advancing the field of speech emotion recognition by providing insights into the effectiveness of different deep learning techniques and hyperparameters.
By improving emotion detection accuracy, our work lays the groundwork for future developments in healthcare monitoring technologies and human-computer interaction systems, ultimately enhancing patient outcomes and user experiences.