Methodology for Training Small Domain-specific Language Models and Its Application in Service Robot Speech Interface
Published in: | Journal of Electrical and Electronics Engineering Vol. 7; no. 1; pp. 107 - 110 |
---|---|
Main Authors: | , , |
Format: | Journal Article |
Language: | English |
Published: | Oradea; University of Oradea; 01-05-2014; Editura Universităţii din Oradea |
Subjects: | |
Summary: | This paper introduces a novel methodology for training small domain-specific language models from the domain vocabulary alone. The methodology is intended for situations in which no training data are available and preparing an appropriate deterministic grammar is not a trivial task. It consists of two phases. In the first phase, a "random" deterministic grammar, which can generate all possible combinations of unigrams and bigrams, is constructed from the vocabulary. This random grammar is then used to generate a training corpus, and a "random" n-gram model is trained on the generated corpus. In the second phase, this model can be adapted. Evaluation of the proposed approach has shown that the methodology is usable for small domains. The assessment results favor the proposed method over constructing an appropriate deterministic grammar. |
---|---|
ISSN: | 1844-6035, 2067-2128
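
The first phase described in the Summary can be illustrated with a minimal sketch. It rests on several assumptions that the record does not confirm: the "random" grammar is approximated by plain enumeration of every one-word and two-word sequence, the n-gram model is an unsmoothed maximum-likelihood bigram model with `<s>` and `</s>` boundary markers, and the vocabulary and the function names `generate_corpus` and `train_bigram_model` are hypothetical. The second-phase adaptation is not shown, because the record gives no detail about it.

```python
from collections import Counter
from itertools import product


def generate_corpus(vocabulary):
    """Enumerate every one-word and two-word sentence over the vocabulary.

    Assumption: this stands in for the paper's "random" grammar, whose
    actual format is not given in the record.
    """
    words = sorted(set(vocabulary))
    unigram_sentences = [[w] for w in words]
    bigram_sentences = [[a, b] for a, b in product(words, repeat=2)]
    return unigram_sentences + bigram_sentences


def train_bigram_model(corpus):
    """Estimate maximum-likelihood bigram probabilities P(cur | prev).

    Assumption: no smoothing or back-off is applied; a real toolkit
    would normally add one.
    """
    bigram_counts = Counter()
    history_counts = Counter()
    for sentence in corpus:
        tokens = ["<s>"] + sentence + ["</s>"]
        for prev, cur in zip(tokens, tokens[1:]):
            bigram_counts[(prev, cur)] += 1
            history_counts[prev] += 1
    return {
        (prev, cur): count / history_counts[prev]
        for (prev, cur), count in bigram_counts.items()
    }


# Hypothetical three-word command vocabulary for a service robot.
vocabulary = ["go", "stop", "left"]
corpus = generate_corpus(vocabulary)
model = train_bigram_model(corpus)

for (prev, cur), prob in sorted(model.items()):
    print(f"P({cur} | {prev}) = {prob:.3f}")
```

Because every word combination is enumerated exactly once, the resulting model treats all words as equally likely after any given word, which is why a model of this kind only becomes domain-specific once it is adapted.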