Methodology for Training Small Domain-specific Language Models and Its Application in Service Robot Speech Interface

Bibliographic Details
Published in: Journal of Electrical and Electronics Engineering Vol. 7; no. 1; pp. 107-110
Main Authors: Ondas, Stanislav, Juhar, Jozef, Holcer, Roland
Format: Journal Article
Language: English
Published: Oradea: University of Oradea, 01-05-2014
Editura Universităţii din Oradea
Description
Summary: This paper introduces a novel methodology for training small domain-specific language models from the domain vocabulary alone. The methodology is intended for situations in which no training data are available and preparing an appropriate deterministic grammar is not a trivial task. It consists of two phases. In the first phase, a "random" deterministic grammar, which can generate all possible combinations of unigrams and bigrams, is constructed from the vocabulary. This random grammar is then used to generate a training corpus. The "random" n-gram model trained on the generated corpus can then be adapted in the second phase. Evaluation of the proposed approach has shown that the methodology is usable for small domains. The assessment results favor the proposed method over constructing an appropriate deterministic grammar.
ISSN: 1844-6035
2067-2128
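
The first phase described in the summary can be illustrated with a minimal sketch. It enumerates every unigram and every ordered bigram over a vocabulary as a tiny corpus and then counts bigram statistics from it. The function names, the example vocabulary, the `<s>` and `</s>` sentence markers, and the unsmoothed maximum-likelihood estimate are all assumptions made for illustration; the record does not give the paper's actual grammar format or toolkit, and the second (adaptation) phase is not shown because it is not described here.

```python
from collections import Counter
from itertools import product


def generate_corpus(vocabulary):
    """Return every unigram and every ordered bigram as a short 'sentence'.

    This stands in for sampling from the "random" grammar (an assumption).
    """
    words = sorted(set(vocabulary))
    unigrams = [(w,) for w in words]
    bigrams = list(product(words, repeat=2))
    return unigrams + bigrams


def train_bigram_model(corpus):
    """Maximum-likelihood bigram estimates P(cur | prev), no smoothing."""
    pair_counts = Counter()
    context_counts = Counter()
    for sentence in corpus:
        # Hypothetical sentence-boundary markers, as used by common toolkits.
        tokens = ("<s>",) + tuple(sentence) + ("</s>",)
        for prev, cur in zip(tokens, tokens[1:]):
            pair_counts[(prev, cur)] += 1
            context_counts[prev] += 1
    return {
        pair: count / context_counts[pair[0]]
        for pair, count in pair_counts.items()
    }


# Hypothetical command vocabulary for a service robot.
vocabulary = ["go", "stop", "left", "right"]
corpus = generate_corpus(vocabulary)
model = train_bigram_model(corpus)
```

With this four-word vocabulary the corpus has 20 sentences, and every word is equally likely to follow any other word, which is one way to read the "random" label: the model encodes the vocabulary but no preference among word sequences until it is adapted.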