Artificial Neural Networks Combined with the Principal Component Analysis for Non-Fluent Speech Recognition

Bibliographic Details
Published in:	Sensors (Basel, Switzerland) Vol. 22; no. 1; p. 321
Main Authors:	Świetlicka, Izabela, Kuniszyk-Jóźkowiak, Wiesława, Świetlicki, Michał
Format:	Journal Article
Language:	English
Published:	Switzerland: MDPI AG, 01-01-2022
Subjects:	Algorithms; Analysis; Artificial neural networks; Classification; Eigenvalues; Experiments; Humans; Multilayer perceptrons; Network topologies; Neural networks; Neural Networks, Computer; Pattern recognition systems; Principal component analysis; Signal processing; Speech; Speech perception; Speech recognition; Stuttering; Variables; Voice recognition; Wavelet transforms; Poland

Description
Summary:	The presented paper introduces principal component analysis application for dimensionality reduction of variables describing speech signal and applicability of obtained results for the disturbed and fluent speech recognition process. A set of fluent speech signals and three speech disturbances (blocks before words starting with plosives, syllable repetitions, and sound-initial prolongations) was transformed using principal component analysis. The result was a model containing four principal components describing analysed utterances. Distances between standardised original variables and elements of the observation matrix in a new system of coordinates were calculated and then applied in the recognition process. As a classifying algorithm, the multilayer perceptron network was used. Achieved results were compared with outcomes from previous experiments where speech samples were parameterised with the Kohonen network application. The classifying network achieved overall accuracy at 76% (from 50% to 91%, depending on the dysfluency type).
ISSN:	1424-8220
DOI:	10.3390/s22010321
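
Illustration:	The summary describes standardising the speech variables and reducing them to four principal components before classification. The sketch below shows only that reduction step, under loud assumptions: the data are synthetic (40 observations of 6 made-up variables, not the paper's speech features), the helper names are hypothetical, and power iteration with deflation is used in place of a library eigensolver purely to keep the example dependency-free. The paper's distance calculation and its multilayer perceptron are not reproduced here.

```python
import math
import random


def standardise(rows):
    """Centre each column and scale it to unit sample standard deviation."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [
        math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
        for j in range(d)
    ]
    return [[(r[j] - means[j]) / stds[j] for j in range(d)] for r in rows]


def correlation_matrix(z):
    """Correlation matrix of already standardised observations."""
    n, d = len(z), len(z[0])
    return [
        [sum(r[i] * r[j] for r in z) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def top_components(matrix, k, iterations=500):
    """Approximate the k leading eigenpairs of a symmetric matrix.

    Uses power iteration, deflating the matrix after each component.
    Returns a list of (eigenvalue, unit eigenvector) tuples.
    """
    d = len(matrix)
    a = [row[:] for row in matrix]  # work on a copy
    rng = random.Random(0)          # fixed seed for reproducibility
    pairs = []
    for _ in range(k):
        v = [rng.uniform(-1.0, 1.0) for _ in range(d)]
        for _ in range(iterations):
            w = [sum(a[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue estimate for v.
        av = [sum(a[i][j] * v[j] for j in range(d)) for i in range(d)]
        eigenvalue = sum(v[i] * av[i] for i in range(d))
        pairs.append((eigenvalue, v))
        # Deflate: remove the component just found from the matrix.
        for i in range(d):
            for j in range(d):
                a[i][j] -= eigenvalue * v[i] * v[j]
    return pairs


def project(z, pairs):
    """Coordinates of each observation in the principal component system."""
    return [
        [sum(r[j] * v[j] for j in range(len(r))) for _, v in pairs]
        for r in z
    ]


# Synthetic stand-in data: three variables share a latent factor, three are noise.
data_rng = random.Random(42)
observations = []
for _ in range(40):
    latent = data_rng.gauss(0.0, 1.0)
    shared = [latent + data_rng.gauss(0.0, 1.0) for _ in range(3)]
    noise = [data_rng.gauss(0.0, 1.0) for _ in range(3)]
    observations.append(shared + noise)

z = standardise(observations)
pairs = top_components(correlation_matrix(z), k=4)  # four components, as in the summary
scores = project(z, pairs)
```

Each row of `scores` holds one observation's coordinates on the four retained components; in a pipeline like the one summarised above, vectors of this kind (or quantities derived from them) would be the inputs to a classifier.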