Performance comparison of MFCC-based Bangla ASR system in presence and absence of third differential coefficients

Bibliographic Details
Published in: 2016 3rd International Conference on Electrical Engineering and Information Communication Technology (ICEEICT), pp. 1-6
Main Authors: Debnath, Sudipto; Fatema-E-Jannat; Saha, Susmita; Aziz, Mohammad Tarik; Sajol, Rifayet Hasan; Rahimi, Md Jakaria
Format: Conference Proceeding
Language: English
Published: IEEE 01-09-2016
Abstract Present Mel Frequency Cepstral Coefficient (MFCC) based Bangla Automatic Speech Recognition (ASR) systems are mostly implemented with delta and acceleration coefficients. Combining the MFCCs and the log energy with their delta and acceleration coefficients yields a 39-dimensional feature vector per 10 ms frame. In this paper, our objective is to observe the effect of third differential coefficients on the performance of Bangla ASR, which has not yet been explored in this field. To do so, we append 13 third differential coefficients to the previous 39 coefficients, giving a vector of 52 coefficients per 10 ms frame. We compare the performance of a Bangla ASR system with and without third differential coefficients using a Hidden Markov Model (HMM) based tied-state triphone model. To build the speech corpus, 100 sentences were uttered by varying numbers of speakers in different phases, including both male and female speakers aged 22 to 24. The Hidden Markov Model Toolkit (HTK) was used for the comparative analysis, with the Sentence Correction Rate (SCR) as the performance indicator. In the experiments, the 39-dimensional (MFCC39) and 52-dimensional (MFCC52) MFCC-based systems achieved average SCRs of 98.89% and 98.94%, respectively. Our finding is therefore that a slight improvement is possible by including third differential coefficients when the sampling rate is as high as 44.1 kHz.
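The step from 39 to 52 dimensions described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a 13-dimensional static vector per frame (12 MFCCs plus log energy), a regression window of 2 frames, and edge-frame replication at the utterance boundaries. The function names `deltas` and `stack_with_derivatives` and the random input are hypothetical. Each derivative order is computed from the previous one with the standard regression formula d_t = sum_k k (c_{t+k} - c_{t-k}) / (2 sum_k k^2).

```python
import numpy as np


def deltas(feats, window=2):
    """Regression-based differential coefficients for a (T, D) feature array.

    Edge frames are replicated so the output keeps the same number of frames.
    The window size of 2 is an assumption, not a value taken from the paper.
    """
    num_frames = feats.shape[0]
    padded = np.pad(feats, ((window, window), (0, 0)), mode="edge")
    denom = 2.0 * sum(k * k for k in range(1, window + 1))
    out = np.zeros_like(feats, dtype=float)
    for k in range(1, window + 1):
        ahead = padded[window + k : window + k + num_frames]
        behind = padded[window - k : window - k + num_frames]
        out += k * (ahead - behind)
    return out / denom


def stack_with_derivatives(static, order, window=2):
    """Append `order` successive differentials to the static features."""
    blocks = [static.astype(float)]
    for _ in range(order):
        blocks.append(deltas(blocks[-1], window))
    return np.hstack(blocks)


# Hypothetical input: 100 frames of 13 static coefficients (12 MFCC + log energy).
static = np.random.default_rng(0).normal(size=(100, 13))

mfcc39 = stack_with_derivatives(static, order=2)  # static + delta + acceleration
mfcc52 = stack_with_derivatives(static, order=3)  # ... + third differential

print(mfcc39.shape, mfcc52.shape)
```

In this sketch the first 39 columns of `mfcc52` are identical to `mfcc39`, so the two front ends differ only in the 13 appended third differential coefficients. HTK itself selects such derivatives through target-kind qualifiers (`_D`, `_A`, `_T`); the exact configuration used in the paper is not stated in the abstract.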
Author Debnath, Sudipto
Sajol, Rifayet Hasan
Rahimi, Md Jakaria
Fatema-E-Jannat
Saha, Susmita
Aziz, Mohammad Tarik
Author_xml – sequence: 1
  givenname: Sudipto
  surname: Debnath
  fullname: Debnath, Sudipto
  email: s.dnath91@gmail.com
  organization: Dept. of Electr. & Electron. Eng., Ahsanullah Univ. of Sci. & Technol., Dhaka, Bangladesh
– sequence: 2
  surname: Fatema-E-Jannat
  fullname: Fatema-E-Jannat
  email: f.jannat29@gmail.com
  organization: Dept. of Electr. & Electron. Eng., Ahsanullah Univ. of Sci. & Technol., Dhaka, Bangladesh
– sequence: 3
  givenname: Susmita
  surname: Saha
  fullname: Saha, Susmita
  email: susmita.eee34@gmail.com
  organization: Dept. of Electr. & Electron. Eng., Ahsanullah Univ. of Sci. & Technol., Dhaka, Bangladesh
– sequence: 4
  givenname: Mohammad Tarik
  surname: Aziz
  fullname: Aziz, Mohammad Tarik
  email: imran1496@gmail.com
  organization: Dept. of Electr. & Electron. Eng., Ahsanullah Univ. of Sci. & Technol., Dhaka, Bangladesh
– sequence: 5
  givenname: Rifayet Hasan
  surname: Sajol
  fullname: Sajol, Rifayet Hasan
  email: rifayet.014@gmail.com
  organization: Dept. of Electr. & Electron. Eng., Ahsanullah Univ. of Sci. & Technol., Dhaka, Bangladesh
– sequence: 6
  givenname: Md Jakaria
  surname: Rahimi
  fullname: Rahimi, Md Jakaria
  email: mjrahimi@gmail.com
  organization: Dept. of Electr. & Electron. Eng., Ahsanullah Univ. of Sci. & Technol., Dhaka, Bangladesh
ContentType Conference Proceeding
DBID 6IE
6IL
CBEJK
RIE
RIL
DOI 10.1109/CEEICT.2016.7873056
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEEE Electronic Library Online
IEEE Proceedings Order Plans (POP All) 1998-Present
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library Online
  url: http://ieeexplore.ieee.org/Xplore/DynWel.jsp
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Computer Science
EISBN 9781509029068
1509029060
EndPage 6
ExternalDocumentID 7873056
Genre orig-research
ID FETCH-LOGICAL-i175t-6d2ec19148cac49277a01c98d975edbc4293e542da2608a122c89e3f4dab2d1b3
IEDL.DBID RIE
IngestDate Thu Jun 29 18:37:47 EDT 2023
IsPeerReviewed false
IsScholarly false
Language English
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-i175t-6d2ec19148cac49277a01c98d975edbc4293e542da2608a122c89e3f4dab2d1b3
PageCount 6
ParticipantIDs ieee_primary_7873056
PublicationCentury 2000
PublicationDate 2016-Sept.
PublicationDateYYYYMMDD 2016-09-01
PublicationDate_xml – month: 09
  year: 2016
  text: 2016-Sept.
PublicationDecade 2010
PublicationTitle 2016 3rd International Conference on Electrical Engineering and Information Communication Technology (ICEEICT)
PublicationTitleAbbrev CEEICT
PublicationYear 2016
Publisher IEEE
Publisher_xml – name: IEEE
SourceID ieee
SourceType Publisher
StartPage 1
SubjectTerms Acceleration
Bangla
Feature extraction
Filter banks
Hidden Markov models
HMM
Mathematical model
Mel frequency cepstral coefficient
MFCC39
MFCC52
Speech
third differential coefficients
Title Performance comparison of MFCC based bangla ASR system in presence and absence of third differential coefficients
URI https://ieeexplore.ieee.org/document/7873056