Phoneme-based or isolated-word modeling speech recognition system? An overview

Bibliographic Details
Published in:	2011 IEEE 7th International Colloquium on Signal Processing and its Applications, pp. 304-309
Main Authors:	Yusnita, M A; Paulraj, M P; Yaacob, S; Bakar, Shahriman Abu; Saidatul, A; Abdullah, Ahmad Nazri
Format:	Conference Proceeding
Language:	English
Published:	IEEE 01-03-2011
Subjects:	Artificial neural networks; Feature extraction; Hidden Markov models; Isolated word; Multilingual detection; Phoneme; Silicon; Speaker independent; Speech; Speech processing; Speech recognition; Speech recognition system

Description
Summary:	This paper surveys and discusses speech theories and methodological concerns about the feature extraction and classification techniques widely used in speech recognition systems. It addresses the shortcomings of isolated-word speech recognition compared with its phoneme-based counterpart. The paper can be regarded as a very early step towards establishing a methodology in the search for a system with better accuracy, lower complexity and more generic properties. It is hoped that such a system can classify speech regardless of variation across languages or accents. This application requires speaker-independent (SI) speech recognition, as do many other potential applications in which a telephone network (a large database consisting of many different speakers) is a primary requirement. Isolated-word ASR for fixed vocabularies has been successfully implemented using HMM, ANN and SVM, but it adapts poorly to other languages and grows in complexity as the vocabulary size increases. Conversely, phonemes, the smallest units of human speech sound, appear more feasible as the basic building blocks for cross-language mapping. In fact, phonetic transcription systems such as IPA and SAMPA are widely recognized and standardized for several languages. This paper investigates the potential of phonemes as language-independent phonetic units to overcome the lack of available training data and so achieve a more generic speech recognizer.
ISBN:	1612844146 9781612844145
DOI:	10.1109/CSPA.2011.5759892
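
Illustration:	The summary's contrast between isolated-word and phoneme-based modeling can be made concrete with a rough sketch. The toy lexicon, its SAMPA-like symbols and all function names below are assumptions introduced for illustration and are not taken from the paper. The sketch only counts how many acoustic models each approach would need; it does not perform recognition.

```python
# Hypothetical toy lexicon: word -> phoneme sequence.
# Symbols are SAMPA-like and purely illustrative (an assumption, not the paper's data).
LEXICON = {
    "cat": ["k", "{", "t"],
    "tack": ["t", "{", "k"],
    "act": ["{", "k", "t"],
    "kit": ["k", "I", "t"],
    "tick": ["t", "I", "k"],
}


def isolated_word_model_count(lexicon):
    """Isolated-word modeling: one dedicated acoustic model per vocabulary word."""
    return len(lexicon)


def phoneme_model_count(lexicon):
    """Phoneme-based modeling: one acoustic model per distinct phoneme."""
    return len({p for phones in lexicon.values() for p in phones})


def new_phoneme_models_needed(lexicon, phones):
    """Phoneme models that would have to be trained to cover a new word."""
    known = {p for seq in lexicon.values() for p in seq}
    return sorted(set(phones) - known)


if __name__ == "__main__":
    print("word models:", isolated_word_model_count(LEXICON))
    print("phoneme models:", phoneme_model_count(LEXICON))
    # A new word built from known phonemes needs no new phoneme models.
    print("new models for 'tact':", new_phoneme_models_needed(LEXICON, ["t", "{", "k", "t"]))
```

In this toy setting the word-model count grows with every vocabulary entry, while the phoneme inventory stays fixed once all the sounds are covered, which is the scaling argument the summary makes.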