Supervised Speaker Diarization Using Random Forests: A Tool for Psychotherapy Process Research

Bibliographic Details
Published in: Frontiers in Psychology, Vol. 11, p. 1726
Main Authors: Fürer, Lukas; Schenk, Nathalie; Roth, Volker; Steppan, Martin; Schmeck, Klaus; Zimmermann, Ronan
Format: Journal Article
Language: English
Published: Frontiers Media S.A., 28-07-2020
Description
Summary: Speaker diarization is the practice of determining who speaks when in audio recordings. Psychotherapy research often relies on labor-intensive manual diarization. Unsupervised methods are available but yield higher error rates. We present a method for supervised speaker diarization based on random forests. It can be considered a compromise between commonly used labor-intensive manual coding and fully automated procedures. The method is validated using the EMRAI synthetic speech corpus and is made publicly available. It yields low diarization error rates (M: 5.61%, STD: 2.19). Supervised speaker diarization is a promising method for psychotherapy research and similar fields.
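Illustration: the summary reports a mean and standard deviation of diarization error rates. The sketch below shows, under simplifying assumptions, how a frame-level error rate could be computed per recording and then aggregated. It is not the article's scoring procedure: the function name `frame_error_rate`, the label codes (T for therapist, P for patient, S for silence), and the toy sequences are all hypothetical, and the sample standard deviation is used only as one plausible choice.

```python
from statistics import mean, stdev


def frame_error_rate(reference, predicted):
    """Percentage of frames whose predicted label differs from the reference.

    Simplified assumption: one label per fixed-length frame, no overlapping
    speech, and no scoring collar around speaker changes.
    """
    if len(reference) != len(predicted):
        raise ValueError("label sequences must have equal length")
    if not reference:
        raise ValueError("label sequences must not be empty")
    errors = sum(r != p for r, p in zip(reference, predicted))
    return 100.0 * errors / len(reference)


# Toy, hand-made labels (NOT data from the article).
# Hypothetical codes: T = therapist, P = patient, S = silence.
sessions = [
    (list("TTTTPPPPSS"), list("TTTPPPPPSS")),  # 1 of 10 frames mislabeled
    (list("PPPPTTTTTT"), list("PPPPTTTTTT")),  # 0 of 10 frames mislabeled
    (list("TTSSPPPPPP"), list("TTSPPPPPTP")),  # 2 of 10 frames mislabeled
]

# One error rate per recording, then mean and sample standard deviation.
rates = [frame_error_rate(ref, pred) for ref, pred in sessions]
summary = (mean(rates), stdev(rates))
```

Because a supervised method predicts known speaker labels, this sketch compares labels directly and skips the speaker-mapping step that scoring of unsupervised systems typically requires.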
This article was submitted to Quantitative Psychology and Measurement, a section of the journal Frontiers in Psychology
Edited by: Giuseppe Sartori, University of Padua, Italy
Reviewed by: Cristina Mazza, Sapienza University of Rome, Italy; Graziella Orrù, University of Pisa, Italy
ISSN: 1664-1078
DOI: 10.3389/fpsyg.2020.01726