Evaluating machine learning architectures for sound event detection for signals with variable signal-to-noise-ratios in the Beaufort Sea

This paper explores the challenging polyphonic sound event detection problem using machine learning architectures applied to data recorded in the Beaufort Sea during the Canada Basin Acoustic Propagation Experiment. Four candidate architectures were investigated and evaluated on nine classes of sign...

Full description

Saved in:
Bibliographic Details
Published in:The Journal of the Acoustical Society of America Vol. 154; no. 4; pp. 2689 - 2707
Main Authors: Ibrahim, Malek, Sagers, Jason D., Ballard, Megan S., Le, Minh, Koutsomitopoulos, Vasilis
Format: Journal Article
Language:English
Published: 01-10-2023
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:This paper explores the challenging polyphonic sound event detection problem using machine learning architectures applied to data recorded in the Beaufort Sea during the Canada Basin Acoustic Propagation Experiment. Four candidate architectures were investigated and evaluated on nine classes of signals broadcast from moored sources that were recorded on a vertical line array of hydrophones over the course of the yearlong experiment. These signals represent a high degree of variability with respect to time-frequency characteristics, changes in signal-to-noise ratio (SNR) associated with varying signal levels as well as fluctuating ambient sound levels, and variable distributions, which resulted in class imbalances. Within this context, binary relevance, which decomposes the multi-label learning task into a number of independent binary learning tasks, was examined as an alternative to the conventional multi-label classification (MLC) approach. Binary relevance has several advantages, including flexible, lightweight model configurations that support faster model inference. In the experiments presented, binary relevance outperformed conventional MLC approach on classes with the most imbalance and lowest SNR. A deeper investigation of model performance as a function of SNR showed that binary relevance significantly improved recall within the low SNR range for all classes studied.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 23
ISSN:0001-4966
1520-8524
DOI:10.1121/10.0021974