Prodorshok I: A Bengali isolated speech dataset for voice-based assistive technologies: A comparative analysis of the effects of data augmentation on HMM-GMM and DNN classifiers

Bibliographic Details
Published in: 2017 IEEE Region 10 Humanitarian Technology Conference (R10-HTC), pp. 396-399
Main Authors: Reza, Mohi; Rashid, Warida; Mostakim, Moin
Format: Conference Proceeding
Language: English
Published: IEEE 01-12-2017
Description
Summary: Prodorshok I is a Bengali isolated-word dataset tailored to the creation of speaker-independent, voice-command-driven assistive technologies that use automated speech recognition (ASR) to improve human-computer interaction (HCI). This paper presents the results of an objective analysis, undertaken using a subset of words from Prodorshok I, that assesses the dataset's reliability in ASR systems based on Hidden Markov Models (HMM) with Gaussian emissions and on Deep Neural Networks (DNN). The results show that simple data augmentation involving a small pitch shift can make surprisingly tangible improvements to speech recognition accuracy.
ISSN: 2572-7621
DOI: 10.1109/R10-HTC.2017.8288983