Transfer Linear Subspace Learning for Cross-Corpus Speech Emotion Recognition

Speech emotion recognition has received an increasing interest in recent years, which is often conducted on the assumption that speech utterances in training and testing datasets are obtained under the same conditions. However, in reality, this assumption does not hold as the speech data are often c...

Full description

Saved in:

Bibliographic Details
Published in:	IEEE transactions on affective computing Vol. 10; no. 2; pp. 265 - 275
Main Author:	Song, Peng
Format:	Journal Article
Language:	English
Published:	Piscataway IEEE 01-04-2019 The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects:	Algorithm design and analysis Algorithms Datasets dimensionality reduction Emotion recognition Learning Optimization Principal component analysis Speech Speech recognition subspace learning Subspaces Testing Training transfer learning
Online Access:	Get full text
Tags:	Add Tag No Tags, Be the first to tag this record!

Description
Summary:	Speech emotion recognition has received an increasing interest in recent years, which is often conducted on the assumption that speech utterances in training and testing datasets are obtained under the same conditions. However, in reality, this assumption does not hold as the speech data are often collected from different devices or environments. Hence, there exists discrepancy between the training and testing data, which will have an adverse effect on recognition performance. In this paper, we examine the problem of cross-corpus speech emotion recognition. To address it, we present a novel transfer linear subspace learning (TLSL) framework to learn a common feature subspace for source and target datasets. In TLSL, a nearest neighbor graph algorithm is used to measure the similarity between different corpora, and a feature grouping strategy is introduced to divide the emotional features into two categories, i.e., high transferable part (HTP) versus low transferable part (LTP). To explore the proposed TLSL with different scenarios, we propose two kinds of TLSL approaches, called transfer unsupervised linear subspace learning (TULSL) and transfer supervised linear subspace learning (TSLSL), and provide the corresponding solutions for the optimization problems. Extensive experiments on several benchmark datasets validate the effectiveness of TLSL for cross-corpus speech emotion recognition.
ISSN:	1949-3045 1949-3045
DOI:	10.1109/TAFFC.2017.2705696