SoC Theses
Recent Submissions
Item Deep learning approaches for speech emotion recognition (Wichita State University, 2024-05) Srinivasan, Sriram; Kshirsagar, Shruti
This thesis addresses the challenge of speech emotion recognition, focusing on continuous emotion estimation using deep learning techniques. Emotion detection plays a vital role in various domains, including healthcare, human-computer interaction, and affective computing. However, traditional approaches often struggle to recognize emotions accurately in the presence of noise and reverberation, leading to limited diagnostic accuracy and applicability. To overcome these limitations, our study proposes a novel approach that integrates speech enhancement as a preprocessing step using advanced deep learning techniques. Our experiments use the AVEC 2018 challenge datasets, comprising audio/video recordings from diverse cultural backgrounds. The experimental pipeline involves several key components, including feature extraction, model training, and data/speech enhancement techniques. We employ LSTM (Long Short-Term Memory) models for temporal dependency modeling and investigate the effectiveness of different hyperparameters, such as batch size, learning rate, and optimizer choice. We aim to evaluate the effectiveness of speech enhancement methods and explore the impact of various hyperparameters on emotion recognition performance. The results of our experiments demonstrate promising performance improvements when leveraging data/speech enhancement techniques: methods such as single Spectral Enhancement (SSE) and the Speech Enhancement Generative Adversarial Network (SEGAN) show potential for capturing complex temporal relationships and contextual information, leading to enhanced emotion recognition capabilities. Overall, this research contributes to advancing the field of speech emotion recognition by providing insights into the effectiveness of different deep learning techniques and hyperparameters.
By improving emotion detection accuracy, our work lays the groundwork for future developments in healthcare monitoring technologies and human-computer interaction systems, ultimately enhancing patient outcomes and user experiences.

Item Integrative framework for human-robot collaboration (Wichita State University, 2024-05) Jashti, Sai Lakshmi; Sawan, M. Edwin
Human-robot collaboration (HRC) attracts increasing interest, but it is challenging for robots to understand complex environments and unstructured commands from humans. This paper proposes an HRC platform that bridges human workers and robot teammates via an interactive surface that can be used to control robots, display robots' intentions, and comprehend human commands. The interactive surface is built from a projector and an RGB-D camera, which together can convert any surface into a tablet-like surface where users can control and communicate with the robot, creating a shared workspace that makes interaction intuitive. To help humans better understand the robot's intentions, we display the robot's simulation on the interactive surface. To help humans follow the robot's comprehension, the interactive surface can display the robot's semantic understanding of the environment and its logical representation of human commands. We evaluate the proposed platform in both simulated and real-world environments. We also evaluate the platform through a survey collected from different groups of people.
Keywords: Human-Robot collaboration (HRC), Scene semantics understanding, Digital twin, Natural language processing (NLP).

Item Understanding and mitigating bias in AI, an application to biometrics (Wichita State University, 2024-05) Upendran Nair, Anoop Krishnan; Bagai, Rajiv
Biometric systems, particularly those used in soft biometric analytics encompassing recognition and classification tasks, often exhibit biases that disproportionately affect specific demographic groups, such as those defined by gender and race. Mitigating these biases is paramount to ensuring fairness in algorithmic decision-making. However, prevailing bias mitigation techniques are constrained by limited generalizability, reliance on demographically annotated training data, and application-specific restrictions. Additionally, addressing bias typically involves a delicate balance between fairness and classification accuracy, where prioritizing fairness may reduce accuracy, especially for the best-performing demographic subgroups. To address this challenge, this dissertation investigates the presence of biases in soft biometric attribute classification algorithms and proposes innovative bias mitigation strategies aimed at enhancing fairness without compromising the overall performance of the system.

Item Passive and proactive methods for facial forgery-based deepfake detection (Wichita State University, 2024-05) Nadimpalli, Aakash Varma; Rattani, Ajita
Significant advances in deep learning have achieved hallmark accuracy rates for various computer vision applications. However, advances in deep generative models have also led to the generation of very realistic fake content, also known as deepfakes, posing a threat to privacy, democracy, and national security. Most current deepfake detection methods treat the task as a binary classification problem, distinguishing authentic images or videos from fake ones using two-class convolutional neural networks (CNNs).
These methods rely on identifying visual anomalies and temporal or color inconsistencies introduced by deep generative models. Nonetheless, their effectiveness diminishes notably when assessed across datasets. To address this challenge, this dissertation introduces both passive and proactive approaches for detecting deepfakes based on facial forgery. These methods aim to narrow the disparities observed during deepfake detection while simultaneously improving performance and imperceptibility.

Item Information-theoretic secret sharing: Fundamental limits and coding schemes via deep learning (Wichita State University, 2023-12) Rana, Vidhi; Chou, Rémi
This dissertation aims to study and design coding schemes for information-theoretic security. We focus on two models: the secret sharing model and the Gaussian wiretap channel model. The main contribution of this dissertation is to take practical constraints into account. We consider a rate-limited public communication channel to account for bandwidth constraints, and finite block length for practical applications requiring short packet length or low latency.
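
For readers unfamiliar with the secret sharing model named in the last abstract, the following is a minimal sketch of the classic Shamir threshold scheme. It is an illustration only and is not the coding scheme studied in the dissertation, which additionally considers rate-limited public communication and finite block length. The function names `split_secret` and `recover_secret` and the field size `PRIME` are assumptions made for this example.

```python
import secrets

# Field size for this sketch (an assumption): the Mersenne prime 2**127 - 1.
PRIME = 2**127 - 1


def split_secret(secret, threshold, num_shares, prime=PRIME):
    """Split `secret` into `num_shares` points; any `threshold` of them recover it."""
    if not 0 <= secret < prime:
        raise ValueError("secret must lie in [0, prime)")
    if not 1 <= threshold <= num_shares:
        raise ValueError("need 1 <= threshold <= num_shares")
    # Random polynomial of degree threshold - 1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(prime) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, num_shares + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation modulo the prime
            y = (y * x + c) % prime
        shares.append((x, y))
    return shares


def recover_secret(shares, prime=PRIME):
    """Recover the secret by Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % prime
                den = (den * (xi - xj)) % prime
        # pow(den, -1, prime) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, prime)) % prime
    return secret
```

In this sketch, fewer than `threshold` shares reveal nothing about the secret because every candidate secret is consistent with some polynomial through the observed points; this is the sense in which such schemes are information-theoretically secure.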