Vine copula statistical disclosure control for mixed-type data

Bibliographic Details
Published in: Computational Statistics & Data Analysis Vol. 176; p. 107561
Main Authors: Chu, Amanda M.Y.; Ip, Chun Yin; Lam, Benson S.Y.; So, Mike K.P.
Format: Journal Article
Language: English
Published: Elsevier B.V., 01-12-2022
Description
Summary: In this paper, we develop a new statistical disclosure control (SDC) method for mixed-type data based on vine copulas. The use of Gaussian and skew-t copulas has been demonstrated to be capable of incorporating information from the marginal distributions of mixed-type variables, whether they are discrete or continuous. In particular, our proposed SDC method using vine copulas generalizes a data perturbation method using an extended skew-t copula. Our vine-SDC method improves the SDC method using the extended skew-t copula by allowing the bivariate copulas in the vine decomposition to take various forms, thus offering a better fit for the joint distribution of the data and more flexibility in data perturbation. An additional advantage of our vine-SDC method is the significant improvement in computational efficiency compared with that using the extended skew-t copula. We discuss some statistical properties of vine copulas and the methodology of vine-SDC. A simulation and a study of real healthcare survey data are provided to explore the performance and strength of vine-SDC and compare it with a common copula-based SDC method.
ISSN: 0167-9473, 1872-7352
DOI: 10.1016/j.csda.2022.107561