Transfer Linear Subspace Learning for Cross-Corpus Speech Emotion Recognition
Published in: | IEEE Transactions on Affective Computing, Vol. 10, no. 2, pp. 265-275 |
---|---|
Main Author: | |
Format: | Journal Article |
Language: | English |
Published: | Piscataway: IEEE, 01-04-2019. The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
Subjects: | |
Summary: | Speech emotion recognition has received increasing interest in recent years. It is usually conducted on the assumption that the speech utterances in the training and testing datasets are obtained under the same conditions. In practice, this assumption often does not hold, because speech data are frequently collected from different devices or environments. The resulting discrepancy between the training and testing data adversely affects recognition performance. In this paper, we examine the problem of cross-corpus speech emotion recognition. To address it, we present a novel transfer linear subspace learning (TLSL) framework that learns a common feature subspace for the source and target datasets. In TLSL, a nearest neighbor graph algorithm is used to measure the similarity between different corpora, and a feature grouping strategy is introduced to divide the emotional features into two categories: a high transferable part (HTP) and a low transferable part (LTP). To apply TLSL in different scenarios, we propose two TLSL approaches, transfer unsupervised linear subspace learning (TULSL) and transfer supervised linear subspace learning (TSLSL), and provide solutions to the corresponding optimization problems. Extensive experiments on several benchmark datasets validate the effectiveness of TLSL for cross-corpus speech emotion recognition. |
---|---|
ISSN: | 1949-3045 |
DOI: | 10.1109/TAFFC.2017.2705696 |