Voice based gender recognition using deep learning
Published in: | 2024 15th International Conference on Computing Communication and Networking Technologies (ICCCNT) pp. 1 - 8 |
---|---|
Format: | Conference Proceeding |
Language: | English |
Published: | IEEE, 24-06-2024 |
Summary: | Predicting age through audio is essential for tailored user experiences, such as content recommendations and targeted advertisements. It also enables better accessibility features for those with age- or gender-specific needs. This paper presents a comprehensive approach to gender classification using voice data, employing deep learning techniques such as Long Short-Term Memory (LSTM), Convolutional Neural Network (CNN), and Residual Network (ResNet) models. The work begins with preprocessing the audio data and extracting Mel-Frequency Cepstral Coefficients (MFCCs) as features. To address class imbalance, Random Oversampling and Random Undersampling are applied. LSTM, CNN, and ResNet models are then built and trained on the resampled data. Model evaluation uses classification reports, confusion matrices, and Receiver Operating Characteristic (ROC) curves. The outcomes show how effective the proposed models are, taking the oversampling and undersampling strategies into account. The results, based on training on the SLR45 dataset, show an accuracy of approximately 99% for gender classification by the CNN. |
---|---|
ISSN: | 2473-7674 |
DOI: | 10.1109/ICCCNT61001.2024.10725158 |
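
The summary above mentions Random Oversampling and Random Undersampling as the remedy for class imbalance. The following is a minimal sketch of that general technique using only the Python standard library. It is not the authors' implementation: the function name `random_resample`, its parameters, and the toy feature vectors standing in for MFCCs are all assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict


def random_resample(features, labels, mode="over", seed=0):
    """Balance classes by random over- or undersampling (illustrative helper).

    mode="over":  smaller classes are sampled with replacement up to the
                  size of the largest class.
    mode="under": larger classes are sampled without replacement down to
                  the size of the smallest class.
    """
    if mode not in ("over", "under"):
        raise ValueError("mode must be 'over' or 'under'")
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    sizes = [len(idxs) for idxs in by_class.values()]
    target = max(sizes) if mode == "over" else min(sizes)

    chosen = []
    for idxs in by_class.values():
        if len(idxs) >= target:
            # Class is at or above the target size: draw without replacement.
            chosen.extend(rng.sample(idxs, target))
        else:
            # Class is below the target: keep all originals, add random duplicates.
            chosen.extend(idxs)
            chosen.extend(rng.choices(idxs, k=target - len(idxs)))

    rng.shuffle(chosen)
    return [features[i] for i in chosen], [labels[i] for i in chosen]


# Toy stand-ins for per-clip MFCC vectors (hypothetical values, not from the paper).
X = [[float(i), float(i) / 2] for i in range(10)]
y = ["male"] * 7 + ["female"] * 3

X_over, y_over = random_resample(X, y, mode="over")
X_under, y_under = random_resample(X, y, mode="under")
print(Counter(y_over))
print(Counter(y_under))
```

Oversampling keeps every original sample but repeats minority-class examples, which can encourage overfitting to those duplicates. Undersampling discards majority-class examples, which trades data for balance. Resampling is normally applied to the training split only, so that evaluation data keeps its original class distribution.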