This repository holds code used to generate simulated pedigrees for testing and measuring PRIMUS and other tools. It works with a data source, originally HapMap3, and simulates pedigrees by choosing founders from the data source and then simulating offspring.
After simulating pedigrees, the code will knock out specific members of the pedigree to allow reconstruction from partial pedigrees.
Run build_dependencies.sh from the project root directory to compile the tools this script uses. You'll need a C++ compiler.
Run the script updated_main.pl in src/original-files, for example:
    perl updated_main.pl 1 uniform3 40
The arguments are:
- 1 - simulation number
- uniform3 - type of simulation (uniform3 = every mating has 3 kids)
- 40 - size of simulation
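To run several simulations in a row, the invocation above can be wrapped in a small shell helper. This is only a sketch: the function name sim_command and its argument checks are illustrative and not part of this repository, and it assumes that the simulation number and size are non-negative integers.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repository): print the command line
# for one simulation so it can be inspected before it is run.
# Arguments: simulation number, simulation type, simulation size.
sim_command() {
    if [ "$#" -ne 3 ]; then
        echo "usage: sim_command <number> <type> <size>" >&2
        return 1
    fi
    # Assumption: number and size must consist of digits only.
    for value in "$1" "$3"; do
        case "$value" in
            ''|*[!0-9]*)
                echo "number and size must be integers" >&2
                return 1
                ;;
        esac
    done
    echo "perl updated_main.pl $1 $2 $3"
}

# Print the commands for simulations 1 to 3 of type uniform3 and size 40.
for n in 1 2 3; do
    sim_command "$n" uniform3 40
done
```

Piping the printed lines to sh from the directory that contains updated_main.pl would run the simulations one after another.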
The entry point (the main script) is written in Perl and has some external dependencies to install from CPAN:
- Math::Random::MT::Auto
- Math::Random
- Math::Combinatorics
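Before running the main script, it may help to check whether these modules can be loaded. The sketch below relies on the fact that perl -MName -e 1 exits with a non-zero status when a module cannot be found. The function name missing_perl_modules is made up for this example, and if perl itself is not installed every module is reported as missing.

```shell
#!/bin/sh
# Print the name of each given Perl module that cannot be loaded.
# `perl -MName -e 1` loads the module and then runs an empty program;
# it fails when the module (or perl itself) is unavailable.
missing_perl_modules() {
    for mod in "$@"; do
        if ! perl -M"$mod" -e 1 >/dev/null 2>&1; then
            echo "$mod"
        fi
    done
}

missing=$(missing_perl_modules Math::Random::MT::Auto Math::Random Math::Combinatorics)
if [ -n "$missing" ]; then
    echo "Missing Perl modules (install from CPAN):"
    echo "$missing"
else
    echo "All listed Perl modules are available."
fi
```

Anything reported as missing can then be installed with a CPAN client such as cpan or cpanm.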
Internal dependencies include the PRIMUS project from https://primus.gs.washington.edu/primusweb/. I'm checking in just the Perl modules from that project, into dependencies/PRIMUSv1.9.0/lib/perl_modules.
PRIMUS brings with it packages from CPAN, also checked in:
- File::
- Getopt::Long
- Statistics::Distributions
Internal dependencies also include vcf2ped.jar, checked into this project as well.
We also have the Python simulation model from IBDsims. The code as distributed from GitHub is checked into src/original-files/simulation-code/from-github. This is the code that actually produces the simulated pedigrees: the Perl main script calls this code, then knocks out some individuals to remove them from the pedigrees, and then attempts and measures reconstruction.
To run the main script, we also need some external tools, which are assumed to be on the PATH and callable from the shell:
A copy of Cranefoot is distributed with PRIMUS and included here, but it's pretty old. We have our own mirror of cranefoot sources at https://github.com/belowlab/cranefoot
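Since the main script assumes its external tools are reachable from the shell, a quick PATH check can save a failed run. The sketch below uses the POSIX command -v builtin. The function name missing_tools is hypothetical, and the tool names passed to it are placeholders: perl and java are used only because the project contains Perl code and a .jar file, so substitute the executables the main script actually calls.

```shell
#!/bin/sh
# Print each named command that cannot be found on the PATH.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
        fi
    done
}

# Placeholder list (assumption): replace with the tools the main script needs.
missing_tools perl java
```

An empty result means every listed command was found.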
src/original-files/main_v3.pl is what Grahame has been working to modify. I'm working to make it take arguments for where to write output, and to write output into the test-output directory.