Automated detection of Hainan gibbon calls for passive acoustic monitoring
Published in: Remote sensing in ecology and conservation, Vol. 7, no. 3, pp. 475-487
Main Authors: , , , , , , , , , , , , ,
Format: Journal Article
Language: English
Published: Oxford: John Wiley & Sons, Inc. (Wiley), 01-09-2021
Summary: Extracting species calls from passive acoustic recordings is a common preliminary step to ecological analysis. For many species, particularly those occupying noisy, acoustically variable habitats, the call extraction process continues to be largely manual, a time-consuming and increasingly unsustainable process. Deep neural networks have been shown to offer excellent performance across a range of acoustic classification applications, but are relatively underused in ecology. We describe the steps involved in developing an automated classifier for a passive acoustic monitoring project, using the identification of calls of the Hainan gibbon Nomascus hainanus, one of the world's rarest mammal species, as a case study. This includes preprocessing (selecting a temporal resolution, windowing and annotation), data augmentation, processing (choosing and fitting appropriate neural network models) and post-processing (linking model predictions to replace, or more likely facilitate, manual labelling). Our best model converted acoustic recordings into spectrogram images on the mel frequency scale, using these to train a convolutional neural network. Model predictions were highly accurate, with per-second false positive and false negative rates of 1.5% and 22.3%. Nearly all false negatives occurred at the fringes of calls, adjacent to segments where the call was correctly identified, so that very few calls were missed altogether. A post-processing step identifying intervals of repeated calling reduced an 8-h recording to, on average, 22 min for manual processing, and did not miss any calling bouts over 72 h of test recordings. Gibbon calling bouts were detected regularly in multi-month recordings from all selected survey points within Bawangling National Nature Reserve, Hainan. We demonstrate that passive acoustic monitoring incorporating an automated classifier represents an effective tool for remote detection of one of the world's rarest and most threatened species. Our study highlights the viability of using neural networks to automate or greatly assist the manual labelling of data collected by passive acoustic monitoring projects. We emphasize that model development and implementation should be informed and guided by ecological objectives, and we increase the accessibility of these tools with a series of notebooks that allow users to build and deploy their own acoustic classifiers.
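As a rough illustration of the three stages the summary describes (mel-spectrogram preprocessing, CNN classification and bout-linking post-processing), the Python sketch below shows one plausible implementation using librosa and TensorFlow/Keras. All concrete values here (sample rate, 10-second windows, the network architecture, the gap threshold) are illustrative assumptions, not the authors' published configuration; the paper reports per-second predictions and provides its actual implementation in the accompanying notebooks.

```python
# Minimal, hypothetical sketch of the pipeline summarized above.
# Sample rate, window length, network shape and bout-gap threshold are
# placeholders, NOT the authors' published configuration.
import numpy as np
import librosa
import tensorflow as tf
from tensorflow.keras import layers

SR = 9600        # assumed sample rate (Hz)
WINDOW_S = 10    # assumed window length (seconds)
N_MELS = 128     # mel bands per spectrogram


def audio_to_mel_windows(path):
    """Slice a recording into fixed-length windows and convert each window
    to a log-scaled mel spectrogram image."""
    y, sr = librosa.load(path, sr=SR)
    step = WINDOW_S * sr
    windows = []
    for start in range(0, len(y) - step + 1, step):
        mel = librosa.feature.melspectrogram(y=y[start:start + step],
                                             sr=sr, n_mels=N_MELS)
        windows.append(librosa.power_to_db(mel, ref=np.max))
    return np.expand_dims(np.array(windows), -1)  # (n, n_mels, frames, 1)


def build_cnn(input_shape):
    """A small binary CNN classifier: gibbon call vs. background."""
    model = tf.keras.Sequential([
        layers.Conv2D(16, 3, activation="relu", input_shape=input_shape),
        layers.MaxPooling2D(),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(32, activation="relu"),
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model


def merge_into_bouts(window_preds, threshold=0.5, max_gap=3):
    """Post-processing: link positive windows separated by short gaps into
    calling bouts, so only those intervals need manual review."""
    positives = [i for i, p in enumerate(window_preds) if p >= threshold]
    bouts, start = [], None
    for i, idx in enumerate(positives):
        if start is None:
            start = idx
        if i == len(positives) - 1 or positives[i + 1] - idx > max_gap:
            bouts.append((start * WINDOW_S, (idx + 1) * WINDOW_S))
            start = None
    return bouts  # list of (start_seconds, end_seconds) intervals


# Usage (paths and annotations are placeholders):
#   X = audio_to_mel_windows("recording.wav")
#   model = build_cnn(X.shape[1:])
#   model.fit(X_annotated, y_annotated, epochs=10)  # after manual labelling
#   bouts = merge_into_bouts(model.predict(X).ravel())
```

The bout-merging step mirrors the idea behind the reported reduction of an 8-h recording to roughly 22 min of review: a human listens only to the returned intervals rather than the full recording.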
This study discusses the development of an automated classifier for the passive acoustic monitoring of Hainan gibbons, one of the world's rarest mammals. Applications of deep learning to ecology are increasingly popular. We believe two things make our paper different. Firstly, ours is the first study to analyze data from a large-scale acoustic monitoring project, comprising thousands of hours of recordings across multiple sites, collected as part of a large international collaboration and designed to answer a particular ecological question. Secondly, we have emphasised the processes involved in arriving at a final model, with the intention both of providing a realistic picture of what is involved and of promoting the reproducibility and usability of these methods. Our manuscript is accompanied by a dataset of 600 h of recordings, which is in itself an important new resource, both for gibbon researchers and for the further development of deep learning tools for acoustic monitoring.
Bibliography: Funding Information: Fieldwork was funded by an Arcus Foundation grant to STT and a Wildlife Acoustics grant to JVB. ID is supported in part by funding from the National Research Foundation of South Africa (Grant ID 90782, 105782). ED is supported by a postdoctoral fellowship from the African Institute for Mathematical Sciences South Africa, Stellenbosch University and the Next Einstein Initiative. This work was carried out with the aid of a grant from the International Development Research Centre, Ottawa, Canada (www.idrc.ca) and with financial support from the Government of Canada, provided through Global Affairs Canada (GAC, www.international.gc.ca).
ISSN: 2056-3485
DOI: 10.1002/rse2.201