generated from academicpages/academicpages.github.io
Showing 1 changed file with 12 additions and 0 deletions.
@@ -0,0 +1,12 @@
---
title: "An Open Source Python Library for Anonymizing Sensitive Data"
collection: publications
permalink: /publication/2024-anjana
date: 2024-11-26
venue: 'Scientific Data'
paperurl: 'https://www.nature.com/articles/s41597-024-04019-z'
citation: 'Sáinz-Pardo Díaz, J., López García, Á. An Open Source Python Library for Anonymizing Sensitive Data. Sci Data 11, 1289 (2024). https://doi.org/10.1038/s41597-024-04019-z'
---

**Abstract**
Open science is a fundamental pillar to promote scientific progress and collaboration, based on the principles of open data, open source and open access. However, the requirements for publishing and sharing open data are in many cases difficult to meet in compliance with strict data protection regulations. Consequently, researchers need to rely on proven methods that allow them to anonymize their data without sharing it with third parties. To this end, this paper presents the implementation of a Python library for the anonymization of sensitive tabular data. This framework provides users with a wide range of anonymization methods that can be applied on the given dataset, including the set of identifiers, quasi-identifiers, generalization hierarchies and allowed level of suppression, along with the sensitive attribute and the level of anonymity required. The library has been implemented following best practices for integration and continuous development, as well as the use of workflows to test code coverage based on unit and functional tests.
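
The inputs named in the abstract (identifiers, quasi-identifiers, generalization hierarchies, a suppression limit and a required level of anonymity) can be made concrete with a small k-anonymity sketch. This is not the library's API: the toy records, the two hierarchy functions and the `k_anonymize` helper are all hypothetical and use only the Python standard library. The sketch assumes the same generalization level is applied to every quasi-identifier, which is a simplification.

```python
from collections import Counter

# Hypothetical toy dataset; every value here is invented for illustration.
RECORDS = [
    {"name": "Ana", "age": 34, "zip": "39005", "diagnosis": "flu"},
    {"name": "Luis", "age": 36, "zip": "39007", "diagnosis": "asthma"},
    {"name": "Marta", "age": 31, "zip": "39012", "diagnosis": "flu"},
    {"name": "Pablo", "age": 52, "zip": "39110", "diagnosis": "diabetes"},
    {"name": "Elena", "age": 58, "zip": "39120", "diagnosis": "flu"},
    {"name": "Jorge", "age": 47, "zip": "28001", "diagnosis": "asthma"},
]

IDENTIFIERS = ["name"]             # removed outright
QUASI_IDENTIFIERS = ["age", "zip"]  # generalized step by step


def generalize_age(age, level):
    """Level 0 keeps the value; level n maps it to a 10*n-year interval."""
    if level == 0:
        return str(age)
    width = 10 * level
    low = (age // width) * width
    return f"[{low}, {low + width})"


def generalize_zip(zip_code, level):
    """Mask the last `level` characters of the postal code with '*'."""
    kept = max(len(zip_code) - level, 0)
    return zip_code[:kept] + "*" * (len(zip_code) - kept)


# Assumed hierarchy format: one function per quasi-identifier.
HIERARCHIES = {"age": generalize_age, "zip": generalize_zip}


def k_anonymize(records, identifiers, quasi_identifiers, hierarchies,
                k, supp_level, max_level=3):
    """Raise the generalization level until k-anonymity holds after
    suppressing at most `supp_level` percent of the records."""
    rows = [{key: value for key, value in record.items()
             if key not in identifiers} for record in records]
    for level in range(max_level + 1):
        generalized = [
            {**row, **{qi: hierarchies[qi](row[qi], level)
                       for qi in quasi_identifiers}}
            for row in rows
        ]
        # Size of each equivalence class over the quasi-identifiers.
        counts = Counter(tuple(row[qi] for qi in quasi_identifiers)
                         for row in generalized)
        kept = [row for row in generalized
                if counts[tuple(row[qi] for qi in quasi_identifiers)] >= k]
        suppressed_pct = 100 * (len(rows) - len(kept)) / len(rows)
        if suppressed_pct <= supp_level:
            return kept, level
    return [], None  # no level within max_level satisfies the constraints


anonymized, level = k_anonymize(
    RECORDS, IDENTIFIERS, QUASI_IDENTIFIERS, HIERARCHIES,
    k=2, supp_level=20.0,
)
```

With these toy records, levels 0 and 1 would require suppressing more than 20% of the rows, so the sketch settles on level 2 and drops the single record that still sits alone in its equivalence class. The sensitive attribute (`diagnosis`) is left untouched here; it only becomes relevant for stricter criteria beyond k-anonymity.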