Towards the Development of Automatic Speech Recognition for Bikol and Kapampangan

Bibliographic Details
Published in: 2019 IEEE 11th International Conference on Humanoid, Nanotechnology, Information Technology, Communication and Control, Environment, and Management (HNICEM), pp. 1-5
Main Authors: Liao, Edward Harold; Ganareal, Kim; Paguia, Chirstian Clarence; Agreda, Cesar; Octaviano, Manolito; Rodriguez, Ramon
Format: Conference Proceeding
Language: English
Published: IEEE 01-11-2019
Description
Summary: In this paper, we develop continuous speech recognition for the Bikol and Kapampangan languages using the CMU Sphinx Toolkit. The speech corpus collected by the researchers has a duration of 7 hours, comes from 150 native Kapampangan and Bikolano speakers, and contains commonly used travelling phrases. The paper explains how the researchers developed speech recognition for local Philippine languages using the traditional machine learning approach of Gaussian mixture models based on hidden Markov models (GMM-HMM). Experiments with different numbers of tied states (senones) are included in the study. The highest accuracy obtained was 96.41% for Bikol and 96.49% for Kapampangan, both at 600 tied states. For the combined languages, the highest accuracy was 95.39% at 1400 tied states.
DOI: 10.1109/HNICEM48295.2019.9072783