Methods of countering speech synthesis attacks on voice biometric systems in banking

Bibliographic Details
Published in: Nauchno-tekhnicheskiĭ vestnik informat͡s︡ionnykh tekhnologiĭ, mekhaniki i optiki, Vol. 21, no. 1, pp. 109-117
Main Authors: Kuznetsov, A.Yu, Murtazin, R.A., Garipov, I.M., Fedorov, E.A., Kholodenina, A.V., Vorobeva, A.A.
Format: Journal Article
Language: English
Published: Saint Petersburg National Research University of Information Technologies, Mechanics and Optics (ITMO University) 01-02-2021
Description
Summary: The paper considers methods of countering speech synthesis attacks on voice biometric systems in banking. Voice biometrics security is a large-scale problem that has grown significantly over the past few years. Automatic speaker verification (ASV) systems are vulnerable to various types of spoofing attacks: impersonation, replay attacks, voice conversion, and speech synthesis attacks. Speech synthesis attacks are the most dangerous because speech synthesis technologies are developing rapidly (GAN, unit selection, RNN, etc.). Anti-spoofing approaches can be based on searching for phase and tone frequency anomalies that appear during speech synthesis and on prior knowledge of the acoustic differences of specific speech synthesizers. ASV security remains an unsolved problem because there is no universal solution that does not depend on the speech synthesis methods used by the attacker. In this paper, we provide an analysis of existing speech synthesis technologies and the most promising attack detection methods for banking and financial organizations. Identification features should include the emotional state and cepstral characteristics of the voice. It is necessary to adjust the user's voiceprint regularly. The analyzed signal should not be overly smooth, and it should not contain unnatural noise or abrupt interruptions and changes in signal level. Analysis of speech intelligibility and semantics is also important. The dynamic password database should contain words that are difficult to synthesize and pronounce. The proposed approach could be used for the design and development of authentication systems for banking and financial organizations that are resistant to speech synthesis attacks.
ISSN: 2226-1494, 2500-0373
DOI: 10.17586/2226-1494-2021-21-1-109-117