Generating t-Closed Partitions of Datasets with Multiple Sensitive Attributes

Gowda, Vikas Thammanna
Bagai, Rajiv

Gowda, V. T. & Bagai, R. (2023). "Generating t-Closed Partitions of Datasets with Multiple Sensitive Attributes," 7th International Conference on Cryptography, Security and Privacy (CSP), Tianjin, China, 2023, pp. 107-111, doi: 10.1109/CSP58884.2023.00024.


The popular t-closeness privacy model requires that the "distance" between the distribution of sensitive attribute values in a given raw dataset and their distribution in every equivalence class created from it not exceed some privacy threshold t. While most existing methods for achieving t-closeness handle data with just a single sensitive attribute, datasets with multiple sensitive attributes are very common in the real world. Here we demonstrate a technique for creating equivalence classes from a dataset containing multiple sensitive attributes. The equivalence classes generated by our method satisfy t-closeness without taking any values as input. Although generalization of quasi-identifier attributes leads to information loss, the generated classes are nearly identical in size, differing by at most one, which results in lower information loss. While generating classes with minimum information loss for a given value of t is NP-hard, our method generates its equivalence classes in O(r log r) time.
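
To make the t-closeness condition in the abstract concrete, the following Python sketch measures how far each equivalence class's distribution of sensitive values lies from the distribution in the whole dataset, for every sensitive attribute. It is an illustration only and not the algorithm from the paper: the variational distance stands in for the unspecified "distance", the sort-then-deal partitioning is a hypothetical way to obtain classes whose sizes differ by at most one in time dominated by a sort, and all function and attribute names are assumptions introduced here.

```python
from collections import Counter


def distribution(values):
    """Relative frequency of each value in a list."""
    counts = Counter(values)
    total = len(values)
    return {v: c / total for v, c in counts.items()}


def variational_distance(p, q):
    """Half the L1 distance between two discrete distributions.

    Assumption: the paper's "distance" may be a different measure
    (for example Earth Mover's Distance); this one is used only
    because it is simple to compute.
    """
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def max_distance(records, classes, sensitive_attrs):
    """Largest distance between any class and the whole dataset,
    taken over all sensitive attributes. A partition is t-close
    (under this distance) when the result does not exceed t."""
    worst = 0.0
    for attr in sensitive_attrs:
        overall = distribution([r[attr] for r in records])
        for cls in classes:
            local = distribution([r[attr] for r in cls])
            worst = max(worst, variational_distance(overall, local))
    return worst


def round_robin_partition(records, k, sensitive_attrs):
    """Hypothetical partitioning: sort by sensitive values, then deal
    records into k classes in turn. Class sizes differ by at most one,
    and the cost is dominated by the sort."""
    ordered = sorted(records, key=lambda r: tuple(r[a] for a in sensitive_attrs))
    return [ordered[i::k] for i in range(k)]
```

For a list of records represented as dictionaries, `max_distance(records, round_robin_partition(records, k, attrs), attrs)` returns the smallest threshold t that the resulting partition satisfies under the assumed distance.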

Table of Content
Click on the DOI link to access this conference paper (may not be free).