Direct computational method of including piriform fossae and nasal cavity in a time-domain acoustic model of the vocal tract

Bibliographic Details
Published in: The Journal of the Acoustical Society of America Vol. 120; no. 5_Supplement; p. 3372
Main Author: Mokhtari, Parham
Format: Journal Article
Language: English
Published: 01-11-2006
Description
Summary: Frequency-domain simulations of the human vocal tract (VT) have previously shown the importance of including the piriform fossae, which impart a pole and two zeros in the 4–5-kHz frequency range and thereby contribute to speaker individualities. The literature has also shown that time-domain simulation of VT acoustics can result in high-quality synthesis naturally including interactions between the time-varying glottal area and the supraglottal VT. In the present work, the time-domain model of [S. Maeda, Speech Commun. 1, 199–229 (1982)] was extended to include both left and right piriform fossae as side branches connected to the main VT, in addition to the nasal tract and sinuses. Departing from Maeda’s original implementation owing to the complexity of including more than one side branch, the variables representing acoustic pressure and volume velocity at the piriform fossae and nasal tract junctions were analytically eliminated, and the resulting large system of linear equations was solved simultaneously at each simulation sample. This direct method runs at only a few times real-time on a 1.8-GHz notebook PC, while achieving a more natural sound quality in speech synthesis and control over timbral (or voice quality) features that contribute to each speaker’s individuality. [Work supported by NiCT and SCOPE-R of Japan.]
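As a rough illustration of why a short closed side branch places a zero in the 4–5-kHz range, the sketch below treats one piriform fossa as a lossless, uniform, closed tube. This is a textbook quarter-wavelength approximation, not the time-domain method of the abstract. The branch lengths, cross-sectional area, speed of sound, and air density are illustrative assumptions that do not come from this record, and the function names are hypothetical.

```python
import math

# Assumed physical constants (not given in the record).
SPEED_OF_SOUND = 350.0  # m/s, warm moist air in the vocal tract (assumption)
AIR_DENSITY = 1.14      # kg/m^3 (assumption)


def side_branch_zero_hz(length_m, c=SPEED_OF_SOUND):
    """First zero contributed by a closed, uniform, lossless side branch.

    The branch input admittance becomes infinite when the branch is a
    quarter wavelength long, which shorts the main tract at the junction:
    f = c / (4 L). End corrections and losses are neglected.
    """
    return c / (4.0 * length_m)


def closed_branch_admittance(freq_hz, length_m, area_m2,
                             c=SPEED_OF_SOUND, rho=AIR_DENSITY):
    """Imaginary part of the input admittance of the same idealized branch.

    Y_in = j * (A / (rho * c)) * tan(k L), with k = 2 * pi * f / c.
    Only the real-valued factor multiplying j is returned.
    """
    k = 2.0 * math.pi * freq_hz / c
    return (area_m2 / (rho * c)) * math.tan(k * length_m)


if __name__ == "__main__":
    # Hypothetical fossa lengths in centimetres, chosen only for illustration.
    for length_cm in (1.75, 1.9, 2.1):
        f_zero = side_branch_zero_hz(length_cm / 100.0)
        print(f"L = {length_cm:.2f} cm -> zero near {f_zero:.0f} Hz")
```

Under these assumptions, branch lengths of roughly 1.75 to 2.2 cm put the zero between about 4 and 5 kHz. A full model would also need losses, the actual fossa geometry, and the coupling of both fossae and the nasal tract to the main VT.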
ISSN: 0001-4966, 1520-8524
DOI: 10.1121/1.4781566