Investigating speech enhancement towards robust synthetic audio spoofing detection in the wild

Authors
Anacin, Angela
Advisors
Kshirsagar, Shruti
Avila, Anderson
Issue Date
2025-04-11
Type
Abstract
Citation
Anacin, A. 2025. Investigating speech enhancement towards robust synthetic audio spoofing detection in the wild. -- In Proceedings: 21st Annual Symposium on Graduate Research and Scholarly Projects. Wichita, KS: Wichita State University
Abstract

Logical access (LA) attacks use text-to-speech (TTS) or voice conversion (VC) techniques to generate spoofed speech data. Such attacks pose a serious threat to automatic speaker verification, since intruders can use them to bypass biometric security systems. In this study, we train a state-of-the-art model to distinguish between bona fide and spoofed speech samples, and we investigate its performance in the wild. To do so, we used the LA data provided in the ASVspoof 2019 Challenge in the presence of different types and levels of background noise. We also explored two enhancement algorithms, namely SEGAN and MetricGAN+, to mitigate the detrimental effects of noisy speech. Results show that applying enhancement prior to the LA task can improve performance in the more degraded scenarios. We also found that quality measures, such as PESQ, can serve as useful indicators of an enhancement algorithm's performance.
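
The abstract does not say how the noisy conditions were produced. As a minimal sketch only, assuming the common approach of scaling a noise recording so that the mixture reaches a target signal-to-noise ratio (SNR), the degradation step could look like the following. The function names `mix_at_snr` and `measured_snr_db`, the synthetic signals, and the 5 dB target are all hypothetical and are not taken from the study.

```python
import math
import random


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech, scaled so the mixture has the requested SNR in dB.

    Hypothetical helper for illustration; not the authors' pipeline.
    """
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(n * n for n in noise) / len(noise)
    if speech_power == 0 or noise_power == 0:
        raise ValueError("signals must have non-zero power")
    # Choose gain so that 10*log10(speech_power / (gain^2 * noise_power)) == snr_db.
    gain = math.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


def measured_snr_db(speech, mixture):
    """Recover the SNR of a mixture given the clean reference signal."""
    residual = [m - s for m, s in zip(mixture, speech)]
    speech_power = sum(s * s for s in speech) / len(speech)
    residual_power = sum(r * r for r in residual) / len(residual)
    return 10 * math.log10(speech_power / residual_power)


# Synthetic stand-ins: a 220 Hz tone as "speech" and Gaussian noise, 1 s at 16 kHz.
rng = random.Random(0)
sample_rate = 16000
speech = [math.sin(2 * math.pi * 220 * t / sample_rate) for t in range(sample_rate)]
noise = [rng.gauss(0.0, 1.0) for _ in range(sample_rate)]

noisy = mix_at_snr(speech, noise, snr_db=5.0)
```

In a setup like this, sweeping `snr_db` over several values and swapping in different noise recordings would give the "different levels and types" of degradation on which a detector, with or without an enhancement front end, could then be evaluated.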

Description
Presented to the 21st Annual Symposium on Graduate Research and Scholarly Projects (GRASP) held at the Rhatigan Student Center, Wichita State University, April 11, 2025.
Research completed in the School of Computing, Wichita State University and the Department of Computer Science, INRS-Canada.
Publisher
Wichita State University
Series
GRASP
v. 21