Speaking to a common tune: Between-speaker convergence in voice fundamental frequency in a joint speech production task

Bibliographic Details
Published in:	PloS one Vol. 15; no. 5; p. e0232209
Main Authors:	Aubanel, Vincent, Nguyen, Noël
Format:	Journal Article
Language:	English
Published:	United States: Public Library of Science (PLoS), 04-05-2020
Subjects:	Acoustics; Amplitude (Acoustics); Biology and Life Sciences; Children's stories; Cognitive science; Common ground; Convergence; Conveyors; Engineering and Technology; Frequency; Fundamental frequency; Identity; Linguistics; Materials handling equipment; Oral reading; Phonetics; Physical Sciences; Pitch; Resonant frequencies; Self concept; Setting (Literature); Social Sciences; Speaking; Speech; Speech perception; Speech processing; Speech production; Speech sounds; Time; Tracking; Variation; Voice communication; France; Audio signal processing; Computer software; Speech signal processing; Phonology; Vowels; Verbal communication

Description
Summary:	Recent research on speech communication has revealed a tendency for speakers to imitate at least some of the characteristics of their interlocutor's speech sound shape. This phenomenon, referred to as phonetic convergence, entails a moment-to-moment adaptation of the speaker's speech targets to the perceived interlocutor's speech. It is thought to contribute to setting up a conversational common ground between speakers and to facilitate mutual understanding. However, it remains uncertain to what extent phonetic convergence occurs in voice fundamental frequency (F0), in spite of the major role played by pitch, F0's perceptual correlate, as a conveyor of both linguistic information and communicative cues associated with the speaker's social/individual identity and emotional state. In the present work, we investigated to what extent two speakers converge towards each other with respect to variations in F0 in a scripted dialogue. Pairs of speakers jointly performed a speech production task, in which they were asked to alternately read aloud a written story divided into a sequence of short reading turns. We devised an experimental set-up that allowed us to manipulate the speakers' F0 in real time across turns. We found that speakers tended to imitate each other's changes in F0 across turns that were both limited in amplitude and spread over large temporal intervals. This shows that, at the perceptual level, speakers monitor slow-varying movements in their partner's F0 with high accuracy and, at the production level, that speakers exert a very fine-tuned control on their laryngeal vibrator in order to imitate these F0 variations. Remarkably, F0 convergence across turns was found to occur in spite of the large melodic variations typically associated with reading turns. Our study sheds new light on speakers' perceptual tracking of F0 in speech processing, and the impact of this perceptual tracking on speech production.
Bibliography:	Competing Interests: The authors have declared that no competing interests exist.
ISSN:	1932-6203
DOI:	10.1371/journal.pone.0232209