
Reproduce Results

1. Prerequisites

Before starting the process, ensure the following tools and datasets are available:

Miniforge (for conda)
Required Tools:
- edlib-aligner
- PHI
- VG
- PanGenie
- seqkit
- bcftools (with plugins)
- snakemake
- vcflib
- Gurobi
- gfa2gbwt

Ensure all dependencies are installed or available via conda.
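
As a quick sanity check before continuing, the sketch below reports which tools cannot be found on the current PATH. The executable names are assumptions based on the list above and may differ on your system. PHI, vcflib, and Gurobi are left out because their executable names are not stated here. Tools installed in separate conda environments will only be found while that environment is active.

```python
import shutil

# Executable names are assumptions; adjust them to match your installation.
REQUIRED_TOOLS = [
    "edlib-aligner", "vg", "PanGenie", "seqkit",
    "bcftools", "snakemake", "gfa2gbwt",
]

def find_missing(tools):
    """Return the tools that cannot be found on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]

if __name__ == "__main__":
    missing = find_missing(REQUIRED_TOOLS)
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("All listed tools were found on PATH.")
```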

2. Create Conda Environments

To manage dependencies, create a separate conda environment for each tool. Example commands:

# Create environment for vcflib
conda create -n vcflib bioconda::vcflib -y

# Create environment for snakemake
conda create -n snakemake -c bioconda -c conda-forge snakemake=8.20.3 -y

# Create environment for the postprocessing dependencies (biopython, edlib)
conda create -n edlib -y
conda activate edlib
conda install anaconda::pip -y
pip3 install biopython
pip3 install edlib

3. Preprocessing

Download the gfa2gbwt, seqkit, and PanGenie binaries and export their paths in .bashrc

cd ..
./Installdeps
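# Add extra/bin and extra/lib to .bashrc
# (double quotes expand $(pwd) now, so the absolute path is stored rather than re-evaluated in every new shell)
echo "export PATH=\"$(pwd)/extra/bin:\$PATH\"" >> ~/.bashrc
echo "export LD_LIBRARY_PATH=\"$(pwd)/extra/lib:\$LD_LIBRARY_PATH\"" >> ~/.bashrc
# Export the bcftools plugin directory
echo "export BCFTOOLS_PLUGINS=\"$(pwd)/extra/plugins\"" >> ~/.bashrc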
source ~/.bashrc
cd data

# Install seqkit
wget https://github.com/shenwei356/seqkit/releases/download/v2.8.2/seqkit_linux_amd64.tar.gz
tar -xvf seqkit_linux_amd64.tar.gz
mv seqkit ../extra/bin

# Install PanGenie
git clone https://github.com/eblerjana/pangenie.git  
cd pangenie  
conda env create -f environment.yml  
conda activate pangenie
conda install conda-forge::cereal -y
mkdir build; cd build; cmake .. ; make -j4
cp src/PanGenie ../../../extra/bin
cp src/PanGenie-index ../../../extra/bin
cp src/libPanGenieLib.so ../../../extra/lib
cd ../..
rm -rf pangenie

Note: Remove these paths from .bashrc once you have finished reproducing the results.
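
One way to find the lines to remove is sketched below. It is a dry run only: it prints the `export` lines in `~/.bashrc` that mention `extra/bin`, `extra/lib`, or `BCFTOOLS_PLUGINS` and does not modify the file. The function name `split_exports` and the marker list are illustrative assumptions, so review the output before deleting anything by hand.

```python
from pathlib import Path

# Substrings that identify the lines appended in this section (assumed markers).
MARKERS = ("/extra/bin", "/extra/lib", "BCFTOOLS_PLUGINS")

def split_exports(text, markers=MARKERS):
    """Split text into (kept_text, removed_lines).

    A line is removed only if it is an `export` line that mentions a marker.
    """
    kept, removed = [], []
    for line in text.splitlines(keepends=True):
        is_target = line.lstrip().startswith("export ") and any(m in line for m in markers)
        (removed if is_target else kept).append(line)
    return "".join(kept), [line.rstrip("\n") for line in removed]

if __name__ == "__main__":
    bashrc = Path.home() / ".bashrc"
    if bashrc.exists():
        # Dry run: report the matching lines without writing to the file.
        _, removed = split_exports(bashrc.read_text())
        print("Lines that would be removed:")
        for line in removed:
            print("  " + line)
```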

Download the MHC haplotypes, build the graph, and subsample the reads

python3 preprocess.py -t16
cd Ground_truth
gunzip *.gz
cd ..

4. Running ILP and IQP

For ILP and IQP, execute the following scripts:

# ILP execution (-b is batch size)
python3 run_batch_5_milp.py -b 2
python3 run_batch_6_milp.py -b 2


# ILP execution (no relaxation)
python3 run_batch_5.py -b 2
python3 run_batch_6.py -b 2

# IQP execution (-b is batch size)
python3 run_batch_3_miqp.py -b 2
python3 run_batch_4_miqp.py -b 2

# IQP execution (no relaxation)
python3 run_batch_3.py -b 2
python3 run_batch_4.py -b 2
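
The eight commands above can also be driven from a single script. The following is a minimal sketch, not part of the repository: the helper names `build_commands` and `run_all` are hypothetical, and the script names and the `-b` flag are copied from the commands above. It defaults to a dry run that only prints each command.

```python
import subprocess
import sys

# Script names are taken from the commands above; the labels are descriptive only.
RUNS = {
    "ILP": ["run_batch_5_milp.py", "run_batch_6_milp.py"],
    "ILP (no relaxation)": ["run_batch_5.py", "run_batch_6.py"],
    "IQP": ["run_batch_3_miqp.py", "run_batch_4_miqp.py"],
    "IQP (no relaxation)": ["run_batch_3.py", "run_batch_4.py"],
}

def build_commands(batch_size=2):
    """Return one argument list per script, in the order listed above."""
    return [
        [sys.executable, script, "-b", str(batch_size)]
        for scripts in RUNS.values()
        for script in scripts
    ]

def run_all(batch_size=2, dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    for cmd in build_commands(batch_size):
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops at the first script that exits with an error.
            subprocess.run(cmd, check=True)

if __name__ == "__main__":
    run_all(batch_size=2, dry_run=True)
```

Running the scripts one after another keeps solver resource usage predictable. Pass `dry_run=False` from the `data` directory to actually launch them.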

5. Running Progressive Imputation

For progressive imputation, execute the following scripts:

python3 run_batch_9.py -b 2 # 13 haps
python3 run_batch_10.py -b 2 # 25 haps
python3 run_batch_11.py -b 2 # 49 haps
python3 run_batch_12.py -b 2 # 7 haps
python3 run_batch_13.py -b 2 # 3 haps
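
Because the batch numbers do not follow the haplotype counts, a small lookup can help when only some configurations are needed. The sketch below encodes the mapping from the comments above. The postprocessing script names follow the numbering used in the Postprocessing section, and the helper `scripts_for` is a hypothetical name.

```python
# Haplotype count -> batch number, as listed in the commands above.
BATCH_BY_HAPLOTYPES = {3: 13, 7: 12, 13: 9, 25: 10, 49: 11}

def scripts_for(n_haps):
    """Return the (run, postprocessing) script names for a haplotype count."""
    batch = BATCH_BY_HAPLOTYPES[n_haps]
    return f"run_batch_{batch}.py", f"postprocessing_{batch}.py"

if __name__ == "__main__":
    # List the configurations from the fewest to the most haplotypes.
    for n_haps in sorted(BATCH_BY_HAPLOTYPES):
        run_script, post_script = scripts_for(n_haps)
        print(f"{n_haps:>2} haps: {run_script} -> {post_script}")
```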

6. Running VG and PanGenie

# Running VG (-b is batch size)
python3 run_VG.py -b 2

# Running PanGenie
python3 run_PG.py -b 2

7. Postprocessing

After the runs are completed, extract relevant metrics:

# ILP 
python3 postprocessing_3_MILP.py 

# ILP (no relaxation)
python3 postprocessing_3.py 

# IQP 
python3 postprocessing_2_MIQP.py 

# IQP (no relaxation)
python3 postprocessing_2.py

# Progressive imputation
python3 postprocessing_9.py # 13 haps
python3 postprocessing_10.py # 25 haps
python3 postprocessing_11.py # 49 haps
python3 postprocessing_12.py # 7 haps
python3 postprocessing_13.py # 3 haps

# VG
python3 postprocessing_VG.py

# PanGenie
python3 postprocessing_PG.py
