ASyH release 1.0.0
The Anonymous Synthesizer for Health Data now features a full pipeline that
produces a synthetic dataset from input data, choosing the best-fitting SDV
synthesizer model according to how closely the data distributions of the
synthetic data match those of the input data.
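
The selection step described above can be illustrated with a minimal,
standard-library-only sketch. It is not ASyH's implementation and does not use
SDV: `GaussianModel`, `UniformModel`, `ks_complement` and `select_best` are
hypothetical names, the two toy models merely stand in for SDV synthesizers,
and the score (one minus the two-sample Kolmogorov-Smirnov statistic on a
single numerical column) is an assumed similarity measure.

```python
import random
import statistics
from bisect import bisect_right


class GaussianModel:
    """Toy stand-in for a synthesizer: fits a normal distribution."""

    def fit(self, values):
        self.mu = statistics.fmean(values)
        self.sigma = statistics.stdev(values)
        return self

    def sample(self, n, rng):
        return [rng.gauss(self.mu, self.sigma) for _ in range(n)]


class UniformModel:
    """Toy stand-in for a synthesizer: fits a uniform distribution."""

    def fit(self, values):
        self.low, self.high = min(values), max(values)
        return self

    def sample(self, n, rng):
        return [rng.uniform(self.low, self.high) for _ in range(n)]


def ks_complement(real, synthetic):
    """Return 1 - KS statistic; 1.0 means identical empirical CDFs."""
    a, b = sorted(real), sorted(synthetic)
    # The largest gap between two empirical CDFs occurs at a sample point.
    d = max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )
    return 1.0 - d


def select_best(real, candidates, seed=0):
    """Fit each candidate, score its synthetic sample, return (score, model)."""
    scored = []
    for model in candidates:
        rng = random.Random(seed)  # same seed so candidates are comparable
        synthetic = model.fit(real).sample(len(real), rng)
        scored.append((ks_complement(real, synthetic), model))
    return max(scored, key=lambda pair: pair[0])


# Usage: a made-up numerical column, scored against both toy models.
data_rng = random.Random(42)
column = [data_rng.gauss(120.0, 15.0) for _ in range(500)]
score, best = select_best(column, [GaussianModel(), UniformModel()])
print(type(best).__name__, round(score, 3))
```

Scoring every candidate on synthetic data of the same size as the input keeps
the comparison fair; a real pipeline would aggregate such scores over all
columns and their correlations rather than over a single column.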

The SDV synthesizers are initialised with heuristic settings to prevent over-
and underfitting.  For the GaussianCopulaSynthesizer, the best-fitting
distribution for each numerical data column is determined first and then used
to initialise the synthesizer before it is scored.

The resulting model can be saved to a .pkl file and reused to produce more
synthetic data on demand.

ASyH 1.0.0 is based on SDV-1.0.0.

Optional reporting will give an overview of data fit and correlations between
the synthetic and original input data.
TimJohann committed May 25, 2023
1 parent f58fb9b commit f394116
Showing 2 changed files with 7 additions and 2 deletions.
7 changes: 6 additions & 1 deletion README.md
@@ -1,4 +1,4 @@
-# ASyH - Anonymous Synthesizer for Health Data.
+# ASyH - Anonymous Synthesizer for Health Data (Release 1).

## Overview

@@ -96,3 +96,8 @@ To run the tests set the PYTHONPATH and execute pytest on the 'tests' folder:

export PYTHONPATH=$(pwd)
pytest tests

+## Release History
+| Release | Date |
+| ---: | ---: |
+|1.0.0| 25/05/2023|
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -3,7 +3,7 @@ requires = ['setuptools', 'wheel']

[project]
name = 'ASyH'
-version = '0.0.1'
+version = '1.0.0'
dependencies = [
'sdv == 1.0.0',
'openpyxl == 3.1.1',