Simplistic synthetic dataset generation for OMOP CDM 5.4 testing

d3london/omop_synthetic

Generating simple synthetic OMOP CDM 5.4 tables

Important

This is a simple script that generates the core OMOP tables with a minimal amount of synthetic (fake) data. It is designed to support development of query and visualisation functionality, not analysis. Only a small number of concept types (hardcoded in src/datagen.py) are populated, and these are generated with no more than a minimal level of clinical realism.

To use:

  1. Configure cohort size and number of visits in create_synthetic_omop.py

  2. Generate data: python create_synthetic_omop.py

  3. Validate OMOP constraints: duckdb -init validate_omop.sql. This requires the DuckDB CLI to be installed. If the data is OMOP compatible, you should receive row counts for each table. Note that this does not yet validate against the vocabulary table.
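
The generate-then-validate flow above can be illustrated with a minimal, standard-library-only sketch. This is not the repository's code: the names `N_PERSONS`, `VISITS_PER_PERSON`, `generate` and `check` are hypothetical, only two tables (`person` and `visit_occurrence`) are shown with a subset of their OMOP CDM 5.4 columns, and the concept IDs are assumed to be the standard OMOP ones rather than those hardcoded in src/datagen.py.

```python
import random
from datetime import date, timedelta

# Hypothetical settings; the real ones are configured in create_synthetic_omop.py.
N_PERSONS = 5
VISITS_PER_PERSON = 2


def generate(n_persons, visits_per_person, seed=0):
    """Build unrealistic but structurally valid person and visit_occurrence rows."""
    rng = random.Random(seed)  # seeded so repeated runs give the same data
    persons, visits = [], []
    for person_id in range(1, n_persons + 1):
        persons.append({
            "person_id": person_id,
            # Assumed standard concepts: 8507 = male, 8532 = female.
            "gender_concept_id": rng.choice([8507, 8532]),
            "year_of_birth": rng.randint(1930, 2010),
            "race_concept_id": 0,       # 0 = no matching concept
            "ethnicity_concept_id": 0,
        })
        for _ in range(visits_per_person):
            start = date(2020, 1, 1) + timedelta(days=rng.randint(0, 365))
            visits.append({
                "visit_occurrence_id": len(visits) + 1,
                "person_id": person_id,
                # Assumed standard concepts: 9201 = inpatient, 9202 = outpatient.
                "visit_concept_id": rng.choice([9201, 9202]),
                "visit_start_date": start,
                "visit_end_date": start + timedelta(days=rng.randint(0, 5)),
                "visit_type_concept_id": 32817,  # assumed: EHR
            })
    return persons, visits


def check(persons, visits):
    """Check a few basic constraints, then return row counts per table."""
    person_ids = {p["person_id"] for p in persons}
    assert len(person_ids) == len(persons), "person_id must be unique"
    assert all(v["person_id"] in person_ids for v in visits), "orphan visit"
    assert all(v["visit_start_date"] <= v["visit_end_date"] for v in visits)
    return {"person": len(persons), "visit_occurrence": len(visits)}


persons, visits = generate(N_PERSONS, VISITS_PER_PERSON)
print(check(persons, visits))
```

As with validate_omop.sql, a successful run ends with row counts per table. The sketch checks only primary-key uniqueness, the foreign key from visit to person, and date ordering.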
