PrivateLink: Privacy-Preserving Integration and Sharing of Datasets
Published in: | IEEE transactions on information forensics and security Vol. 15; pp. 564 - 577 |
---|---|
Format: | Journal Article |
Language: | English |
Published: | IEEE, 2020 |
Summary: | In privacy-enhancing technology, it has always been challenging to strike a reasonable balance between privacy, efficiency, and usability (utility). To this end, we propose a highly practical solution for the privacy-preserving integration and sharing of datasets among a group of participants. At the heart of our solution is a new interactive protocol, PrivateLink. Through PrivateLink, each participant is able to randomize their dataset via an independent and untrusted third party, such that the resulting dataset can be merged with other randomized datasets contributed by other participants in a privacy-preserving manner. Our approach does not require key sharing among participants in order to integrate different datasets. This, in turn, leads to a user-friendly and scalable solution. Moreover, the correctness of a randomized dataset returned by the third party can be securely verified by the participant. We further demonstrate PrivateLink's general utility by using it to construct a structure-preserving data integration protocol. This is particularly useful for private, fine-grained integration of network traffic data. We state the security of our protocols under the well-established real-ideal simulation paradigm and demonstrate practicality through a prototype implementation on 1) healthcare datasets and 2) DNS and NetFlow datasets. |
---|---|
ISSN: | 1556-6013, 1556-6021 |
DOI: | 10.1109/TIFS.2019.2924201 |