Implementation of Constant-Q Transform (CQT) and Mel Spectrogram to converting Bird's Sound
Published in: | 2021 IEEE International Conference on Communication, Networks and Satellite (COMNETSAT), pp. 52-56 |
---|---|
Main Authors: | , |
Format: | Conference Proceeding |
Language: | English |
Published: | IEEE, 17-07-2021 |
Summary: | Bird sounds can be classified in various ways. One option is a convolutional neural network (CNN), an algorithm widely used for image classification. For a CNN to classify bird sounds, the analogue sound must first be converted into digital images objectively and accurately. This study discusses converting analogue bird sounds into spectrogram images using the Constant-Q Transform (CQT) and the mel spectrogram. Bird voices are recorded with a voice recorder, and the recording represents the audio signal digitally. The Constant-Q Transform maps the audio signal from the time domain to the frequency domain. The frequency axis is then converted to a log scale and the colour dimension (amplitude) to decibels to form a spectrogram, which is mapped onto the mel scale to form a mel spectrogram. The study thus converts analogue bird sounds into mel spectrogram images that a CNN can classify. |
---|---|
DOI: | 10.1109/COMNETSAT53002.2021.9530779 |
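
The conversion steps named in the summary (time domain to log-spaced frequency bins, amplitude to decibels, frequency to the mel scale) can be illustrated with a minimal sketch. This is not the authors' implementation, which the record does not show. It assumes a naive, direct constant-Q transform of a single frame with a Hann window, and the common `2595 * log10(1 + f / 700)` mel formula. The function names (`cqt_frame`, `hz_to_mel`, `amplitude_to_db`), the 8 kHz sample rate, and the synthetic 440 Hz test tone are all hypothetical choices made for illustration. Note that `hz_to_mel` only converts a frequency value; it does not build a full mel filterbank.

```python
import cmath
import math


def hz_to_mel(freq_hz):
    """Map a frequency in Hz onto the mel scale (2595 * log10 variant)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def amplitude_to_db(amplitude, floor=1e-10):
    """Convert a linear magnitude to decibels, clamping to avoid log(0)."""
    return 20.0 * math.log10(max(amplitude, floor))


def cqt_frame(samples, sample_rate, f_min, n_bins, bins_per_octave=12):
    """Naive constant-Q transform of one frame that starts at samples[0].

    Returns a list of (centre_frequency_hz, magnitude) pairs, one per bin.
    """
    # Q is the constant ratio of centre frequency to bandwidth.
    q = 1.0 / (2.0 ** (1.0 / bins_per_octave) - 1.0)
    result = []
    for k in range(n_bins):
        # Centre frequencies are geometrically (log) spaced.
        f_k = f_min * 2.0 ** (k / bins_per_octave)
        # Lower bins need longer windows to keep Q constant.
        n_k = math.ceil(q * sample_rate / f_k)
        if n_k > len(samples):
            raise ValueError("frame too short for the lowest bin")
        acc = 0j
        for n in range(n_k):
            window = 0.5 - 0.5 * math.cos(2.0 * math.pi * n / n_k)  # Hann
            acc += samples[n] * window * cmath.exp(-2j * math.pi * q * n / n_k)
        result.append((f_k, abs(acc) / n_k))
    return result


# Synthetic stand-in for a recording: a pure 440 Hz tone (assumed test input).
sample_rate = 8000
tone = [math.sin(2.0 * math.pi * 440.0 * n / sample_rate) for n in range(2048)]

# Three octaves starting at 110 Hz, 12 bins per octave.
frame = cqt_frame(tone, sample_rate, f_min=110.0, n_bins=36)
peak_hz, peak_mag = max(frame, key=lambda pair: pair[1])
print(f"strongest bin: {peak_hz:.1f} Hz, "
      f"{amplitude_to_db(peak_mag):.1f} dB, {hz_to_mel(peak_hz):.1f} mel")
```

For the test tone, the strongest bin is the one centred on 440 Hz. Repeating `cqt_frame` over successive frames of a recording and stacking the decibel values would give the columns of a spectrogram image. In practice, an optimized library routine such as `librosa.cqt` or `librosa.feature.melspectrogram` would likely be preferable to this direct loop.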