Data sanitization for t-Closeness over Multiple Numerical Sensitive Attributes
Bagai, Rajiv ; Weber, Eric ; Gowda, Vikas Thammanna
Issue Date
2023-09
Type
Article
Keywords
Anonymity, Data privacy, Data publishing, t-Closeness
Citation
Bagai, R., Weber, E., & Gowda, V. T. (2023). Data Sanitization for t-Closeness over Multiple Numerical Sensitive Attributes. Transactions on Data Privacy, 16(3), 191-210.
Abstract
A popular technique for preserving the privacy of individuals contained in any released data is to first sanitize the data according to the t-closeness principle. This principle requires partitioning rows of the original data into equivalence classes, in a way that the distribution of sensitive values in any class is sufficiently close, within a given threshold t, to their distribution in the original data. Most existing methods for constructing t-close equivalence classes consider just one sensitive attribute in the data, which is insufficient as many real-life datasets contain multiple sensitive attributes; partitioning attempts for multiple sensitive attributes have thus far been unsatisfactory. We present a method for generating t-close equivalence classes in the presence of multiple numerical sensitive attributes, where each such attribute has its own privacy threshold. The equivalence classes are generated in a way that minimizes the information loss caused later by generalizing quasi-identifier values within each class. While finding an optimal solution for this problem is known to be NP-hard, we show that our approach results in an acceptable solution in polynomial time.
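The t-closeness condition described in the abstract can be illustrated with a minimal sketch. It only checks whether a given partition is t-close for several numerical sensitive attributes, each with its own threshold; it does not construct the partition and is not the authors' method. The abstract does not name a distance measure, so the sketch assumes the ordered-distance form of the Earth Mover's Distance that is commonly used for numerical attributes in the t-closeness literature. The names ordered_emd and is_t_close, and the row and class layout, are hypothetical.

```python
from collections import Counter
from itertools import accumulate


def ordered_emd(class_values, all_values):
    """Ordered-distance EMD between one class and the whole table.

    Assumption: this distance measure is chosen for illustration;
    the abstract does not say which measure the paper uses.
    """
    domain = sorted(set(all_values))
    m = len(domain)
    if m < 2:
        return 0.0  # a single distinct value: the distributions coincide
    class_counts = Counter(class_values)
    all_counts = Counter(all_values)
    # Per-value difference between the class and overall distributions.
    diffs = [
        class_counts[v] / len(class_values) - all_counts[v] / len(all_values)
        for v in domain
    ]
    # Sum of absolute running totals, normalised by (m - 1).
    return sum(abs(c) for c in accumulate(diffs)) / (m - 1)


def is_t_close(rows, classes, thresholds):
    """Check every class against every attribute's own threshold.

    rows: list of dicts mapping attribute name to numerical value
    classes: list of lists of row indices (hypothetical layout)
    thresholds: dict mapping sensitive attribute name to its t
    """
    for attr, t in thresholds.items():
        overall = [row[attr] for row in rows]
        for cls in classes:
            in_class = [rows[i][attr] for i in cls]
            if ordered_emd(in_class, overall) > t:
                return False
    return True
```

For example, with four rows whose salary values are 10, 20, 30 and 40, grouping the first two rows together puts that class at distance 1/3 from the overall distribution, while grouping the first and last rows together gives 1/6, so only the second partition satisfies a salary threshold of 0.2.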
Description
Copyright by the authors.
Publisher
University of Skovde
Series
Transactions on Data Privacy
v.16 no.3
ISSN
1888-5063
2013-1631
