Improving CNN-based solutions for emotion recognition using evolutionary algorithms
Published in: | Results in applied mathematics Vol. 18; p. 100360 |
---|---|
Main Authors: | , , , |
Format: | Journal Article |
Language: | English |
Published: | Elsevier B.V, 01-05-2023 |
Subjects: | |
Summary: | AI-based approaches, especially deep learning, have made remarkable achievements in Speech Emotion Recognition (SER), and Convolutional Neural Networks (CNNs) have been the backbone of many of these solutions. Although the use of CNNs has resulted in high-performing models, building them requires domain knowledge and direct human intervention. The same issue arises when improving an existing model. To solve this problem, we use techniques that were first introduced in Neural Architecture Search (NAS) and apply a genetic process to search for models with improved accuracy. More specifically, we insert blocks with dynamic structures between the layers of an existing model and then use genetic operations (i.e., selection, mutation, and crossover) to find the best-performing structures. To validate our method, we use this algorithm to improve architectures by searching on the Berlin Database of Emotional Speech (EMODB). The experimental results show an improvement of at least 1.7% in accuracy on the EMODB test set. |
---|---|
ISSN: | 2590-0374 |
DOI: | 10.1016/j.rinam.2023.100360 |
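
The genetic search described in the Summary can be illustrated with a minimal sketch. This is not the authors' implementation: the search space (`BLOCK_CHOICES`), the number of insertion points, the tournament selection scheme, the elitism step, and the `toy_fitness` function are all assumptions made for illustration. In a real run, the fitness of a candidate would be the validation accuracy of the base CNN with the candidate blocks inserted between its layers.

```python
import random

# Hypothetical search space: each insertion point between existing layers
# receives one block described by (kernel_size, num_filters); None means "no block".
BLOCK_CHOICES = [None, (3, 16), (3, 32), (5, 16), (5, 32)]
NUM_INSERTION_POINTS = 4  # assumed number of gaps between base-model layers


def random_genome(rng):
    """One candidate: a block choice for every insertion point."""
    return [rng.choice(BLOCK_CHOICES) for _ in range(NUM_INSERTION_POINTS)]


def toy_fitness(genome):
    """Stand-in for validation accuracy; a real run would train and evaluate the CNN."""
    score = 0.0
    for block in genome:
        if block is not None:
            kernel, filters = block
            score += filters / 32 - 0.1 * (kernel - 3)
    return score


def select(population, fitness, rng, k=3):
    """Tournament selection: the fittest of k randomly drawn candidates."""
    return max(rng.sample(population, k), key=fitness)


def crossover(parent_a, parent_b, rng):
    """Single-point crossover: head of one parent, tail of the other."""
    point = rng.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:]


def mutate(genome, rng, rate=0.2):
    """Replace each gene with a random block choice with probability `rate`."""
    return [rng.choice(BLOCK_CHOICES) if rng.random() < rate else gene
            for gene in genome]


def evolve(fitness=toy_fitness, pop_size=10, generations=15, seed=0):
    """Run the genetic loop; returns the best genome and the best score per generation."""
    rng = random.Random(seed)
    population = [random_genome(rng) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        best = max(population, key=fitness)
        history.append(fitness(best))
        children = [best]  # elitism (an assumption): carry the best candidate over
        while len(children) < pop_size:
            child = crossover(select(population, fitness, rng),
                              select(population, fitness, rng), rng)
            children.append(mutate(child, rng))
        population = children
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history


if __name__ == "__main__":
    best_genome, best_scores = evolve()
    print(best_genome, best_scores[-1])
```

Because the best candidate is copied unchanged into each new generation, the best score never decreases from one generation to the next. Replacing `toy_fitness` with a function that builds, trains, and evaluates the modified network would turn the sketch into an actual architecture search, at a much higher cost per evaluation.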