Low-complexity smartphone-based acoustic event detection to assist individuals with hearing impairments
Abstract
Hearing loss is the third most common chronic physical condition in the United States and is twice as prevalent as diabetes or cancer. Approximately 15% of American adults (37.5 million) aged 18 and over report some trouble hearing. This impairment significantly heightens the risk of accidents: those affected are nearly twice as likely to suffer accidental injuries as those with excellent or good hearing. To address these challenges, assistive technologies such as hearing aids and acoustic event detection devices play a pivotal role in preventing accidents and enhancing environmental awareness among individuals who are hard of hearing, thereby improving their overall quality of life. In recent years, advances in machine learning, particularly deep learning, have driven significant progress in acoustic event detection. These models, however, are often highly complex and rely on large neural networks stored either on the device or in the cloud. This can increase data consumption, which can become expensive, and energy consumption, draining batteries more quickly and reducing the lifespan of the smartphone. In this work, we describe the development of the Lisnen mobile app, which implements a low-complexity acoustic event detection algorithm targeted at individuals who are hard of hearing. Based on consultations with target users, six specific acoustic events have been deemed important: door knocks, doorbells, fire alarms, sirens, car horns, and baby cries. Our primary objective is to create event detectors characterized by low complexity, capable of running directly on smartphones with minimal energy consumption. Our approach leverages a Convolutional Neural Network (CNN) architecture, which is well suited to identifying sounds in a user's environment and triggering alerts as necessary. First, the audio signal is transformed into a suitable representation, with spectrograms and Mel-frequency cepstral coefficients (MFCCs) serving as the key preprocessing techniques. These features are then fed into the CNN model, which analyzes them to predict the class to which the audio sample belongs. CNNs excel at extracting hierarchical features from audio, particularly from time-frequency representations, enabling efficient acoustic event detection. By combining machine learning techniques with user-centric design principles, our research aims to empower individuals with hearing challenges, enhancing their safety and independence in navigating their surroundings.
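To make the pipeline concrete, the following is a minimal sketch of the MFCC-plus-CNN approach outlined in the abstract, written in Python using librosa and PyTorch. The class names, layer sizes, and hyperparameters here are illustrative assumptions for a low-complexity on-device classifier, not the actual Lisnen implementation.

# Illustrative sketch (not the Lisnen implementation): a small CNN that
# classifies one-second audio clips into the six target event classes
# using MFCC features. Assumes librosa and PyTorch are installed.
import librosa
import torch
import torch.nn as nn

EVENT_CLASSES = ["door_knock", "doorbell", "fire_alarm",
                 "siren", "car_horn", "baby_cry"]

def mfcc_features(audio, sr=16000, n_mfcc=40):
    """Convert a raw waveform into a normalized MFCC time-frequency map."""
    mfcc = librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=n_mfcc)
    # Normalization keeps the input scale consistent across devices and clips.
    mfcc = (mfcc - mfcc.mean()) / (mfcc.std() + 1e-8)
    return torch.tensor(mfcc, dtype=torch.float32).unsqueeze(0)  # (1, n_mfcc, frames)

class SmallEventCNN(nn.Module):
    """A deliberately small CNN: two conv blocks plus a linear classifier,
    sized so inference can run on a smartphone CPU."""
    def __init__(self, n_classes=len(EVENT_CLASSES)):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),
            nn.BatchNorm2d(16), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.BatchNorm2d(32), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # global pooling keeps the parameter count low
        )
        self.classifier = nn.Linear(32, n_classes)

    def forward(self, x):  # x: (batch, 1, n_mfcc, frames)
        x = self.features(x).flatten(1)
        return self.classifier(x)

# Example: classify one second of audio from a file (hypothetical "clip.wav").
audio, sr = librosa.load("clip.wav", sr=16000, duration=1.0)
model = SmallEventCNN().eval()
with torch.no_grad():
    logits = model(mfcc_features(audio, sr).unsqueeze(0))
print(EVENT_CLASSES[int(logits.argmax(dim=1))])

Using global average pooling in place of large fully connected layers keeps the parameter count small, which is the kind of design choice that reduces on-device memory, computation, and energy use.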
Description
Research completed in the Department of Computer Science, College of Engineering.
Series
v. 20