Epsilon*: Privacy Metric for Machine Learning Models
Main Authors: | Negoescu, Diana M; Gonzalez, Humberto; Orjany, Saad Eddin Al; Yang, Jilei; Lut, Yuliia; Tandra, Rahul; Zhang, Xiaowen; Zheng, Xinyi; Douglas, Zach; Nolkha, Vidita; Ahammad, Parvez; Samorodnitsky, Gennady |
---|---|
Format: | Journal Article |
Language: | English |
Published: | 20-07-2023 |
Subjects: | Computer Science - Cryptography and Security; Computer Science - Data Structures and Algorithms; Computer Science - Learning |
DOI: | 10.48550/arxiv.2307.11280 |
Online Access: | https://arxiv.org/abs/2307.11280 |
Copyright: | http://creativecommons.org/licenses/by/4.0 |
Abstract | We introduce Epsilon*, a new privacy metric for measuring the privacy risk of a single model instance prior to, during, or after deployment of privacy mitigation strategies. The metric requires only black-box access to model predictions, does not require training data re-sampling or model re-training, and can be used to measure the privacy risk of models not trained with differential privacy. Epsilon* is a function of the true positive and false positive rates in a hypothesis test used by an adversary in a membership inference attack. We distinguish between quantifying the privacy loss of a trained model instance, which we refer to as empirical privacy, and quantifying the privacy loss of the training mechanism which produces this model instance. Existing approaches in the privacy auditing literature provide lower bounds for the latter, while our metric provides an empirical lower bound for the former by relying on an $(\epsilon, \delta)$-type quantification of the privacy of the trained model instance. We establish a relationship between these lower bounds and show how to implement Epsilon* so as to avoid numerical and noise amplification instability. We further show in experiments on benchmark public data sets that Epsilon* is sensitive to privacy risk mitigation by training with differential privacy (DP): the value of Epsilon* is reduced by up to 800% compared to the Epsilon* values of non-DP trained baseline models. This metric allows privacy auditors to be independent of model owners, and enables visualizing the privacy-utility landscape to make informed decisions regarding the trade-offs between model privacy and utility. |
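The abstract describes Epsilon* as a function of the true and false positive rates achieved by a membership inference adversary, yielding an empirical $(\epsilon, \delta)$-type lower bound for a single trained model. The Python sketch below is not the paper's implementation of Epsilon*; it illustrates the standard differential-privacy hypothesis-testing bound (for an $(\epsilon, \delta)$-DP mechanism, $\mathrm{TPR} \le e^{\epsilon}\,\mathrm{FPR} + \delta$ and $1 - \mathrm{FPR} \le e^{\epsilon}(1 - \mathrm{TPR}) + \delta$) from which such an empirical lower bound on $\epsilon$ can be computed. The function name, the assumption that per-example attack scores on members and non-members are available, and the threshold sweep are all illustrative assumptions.

```python
import numpy as np

def empirical_epsilon_lower_bound(member_scores, nonmember_scores, delta=1e-5):
    """Hedged sketch: an empirical (epsilon, delta)-type lower bound derived
    from a membership inference attack's ROC curve.

    Any test distinguishing members from non-members of the training set of
    an (epsilon, delta)-DP mechanism must satisfy
        TPR <= exp(eps) * FPR + delta
        1 - FPR <= exp(eps) * (1 - TPR) + delta
    so each observed (FPR, TPR) pair implies a lower bound on eps.
    Higher attack scores are assumed to indicate membership.
    """
    thresholds = np.unique(np.concatenate([member_scores, nonmember_scores]))
    best = 0.0
    for t in thresholds:
        tpr = np.mean(member_scores >= t)     # true positive rate at threshold t
        fpr = np.mean(nonmember_scores >= t)  # false positive rate at threshold t
        # Guard against log of non-positive values and division by zero,
        # one simple source of the numerical instability the abstract mentions.
        if fpr > 0 and tpr > delta:
            best = max(best, np.log((tpr - delta) / fpr))
        if (1 - tpr) > 0 and (1 - fpr) > delta:
            best = max(best, np.log((1 - fpr - delta) / (1 - tpr)))
    return best

# Toy usage with synthetic attack scores (illustrative only).
rng = np.random.default_rng(0)
members = rng.normal(1.0, 1.0, 10_000)     # attack scores on training points
nonmembers = rng.normal(0.0, 1.0, 10_000)  # attack scores on held-out points
print(empirical_epsilon_lower_bound(members, nonmembers))
```

The guards that skip degenerate (FPR, TPR) pairs are one simple way to avoid dividing by zero or taking logs of non-positive quantities at extreme thresholds; the paper's own stabilization procedure for Epsilon* may differ.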