Comparing changes

base repository: uab-cgds-worthey/quac
base: 1.3
head repository: uab-cgds-worthey/quac
compare: master
Showing with 12,897 additions and 1,440 deletions.
  1. +8 −2 .github/pull_request_template.md
  2. +1 −1 .github/workflows/action.yml
  3. +23 −0 .github/workflows/draft-pdf.yml
  4. +121 −0 .github/workflows/system_testing.yml
  5. +2 −0 .gitignore
  6. +1 −1 .test/README.md
  7. 0 .test/configs/include_priorQC/{ → pedigree}/project_1sample.ped
  8. 0 .test/configs/include_priorQC/{ → pedigree}/project_2samples.ped
  9. +3 −0 .test/configs/include_priorQC/sample_config/project_2samples_exome.tsv
  10. +3 −0 .test/configs/include_priorQC/sample_config/project_2samples_wgs.tsv
  11. 0 .test/configs/no_priorQC/{ → pedigree}/project_1sample.ped
  12. 0 .test/configs/no_priorQC/{ → pedigree}/project_2samples.ped
  13. +3 −0 .test/configs/no_priorQC/sample_config/project_2samples_exome.tsv
  14. +3 −0 .test/configs/no_priorQC/sample_config/project_2samples_wgs.tsv
  15. +10 −11 README.md
  16. +7 −0 configs/cli_cluster_config.json
  17. +0 −33 configs/env/bcftools.yaml
  18. +0 −27 configs/env/goleft.yaml
  19. +0 −30 configs/env/mosdepth.yaml
  20. +0 −85 configs/env/multiqc.yaml
  21. +0 −112 configs/env/picard.yaml
  22. +0 −111 configs/env/picard_smk.yaml
  23. +0 −88 configs/env/quac_watch.yaml
  24. +0 −147 configs/env/qualimap.yaml
  25. +0 −25 configs/env/samtools.yaml
  26. +0 −32 configs/env/verifyBamID.yaml
  27. +1 −1 configs/mkdocs/requirements.txt
  28. +1 −1 configs/multiqc_config_template.jinja2
  29. +8 −4 configs/{cluster_config.json → snakemake_cluster_config.json}
  30. +23 −28 configs/workflow.yaml
  31. +107 −2 docs/Changelog.md
  32. +11,202 −0 docs/example_output/multiqc_report.html
  33. +22 −0 docs/faq.md
  34. +9 −4 docs/index.md
  35. +40 −85 docs/input_output.md
  36. +0 −50 docs/installation.md
  37. +225 −0 docs/installation_configuration.md
  38. +36 −75 docs/quac_cli.md
  39. +51 −10 docs/quac_watch.md
  40. +0 −103 docs/reqts_configs.md
  41. +2 −1 docs/static_site.md
  42. +49 −50 docs/system_testing.md
  43. +10 −15 docs/visualize_pipeline.md
  44. +3 −3 mkdocs.yaml
  45. BIN paper/images/fig1_multiqc.png
  46. +114 −0 paper/paper.md
  47. +232 −0 paper/references.bib
  48. +40 −0 src/aggregate_sample_rename_configs.py
  49. +58 −0 src/read_sample_config.py
  50. +214 −109 src/run_quac.py
  51. +48 −0 src/setup_dependency_datasets.sh
  52. +32 −0 src/singularity_status.py
  53. +1 −0 src/slurm/slurm_profile/config.yaml
  54. +0 −6 workflow/Snakefile
  55. +70 −51 workflow/rules/aggregate_results.smk
  56. +56 −85 workflow/rules/common.smk
  57. +46 −40 workflow/rules/coverage_analysis.smk
  58. +2 −2 workflow/rules/relatedness_ancestry.smk
  59. +6 −6 workflow/rules/vcf_stats.smk
  60. +4 −4 workflow/rules/within_species_contamintation.smk
10 changes: 8 additions & 2 deletions .github/pull_request_template.md
Original file line number Diff line number Diff line change
@@ -1,14 +1,20 @@
# Pull request

Please fill in the checklist below and comment as needed.
**Please add a clear description of what the PR is about:**

------

**Please fill in the checklist below and comment as needed:**

- [ ] Was code modified? Briefly describe.
- [ ] Was documentation modified? Briefly describe.
- [ ] Is this a bug-fix? Briefly describe.
- [ ] Is this a feature addition? Briefly describe.
- [ ] Did you modify QuaC-Watch config file? If so, did you modify multiqc template
`configs/multiqc_config_template.jinja2` and script `src/quac_watch/create_mutliqc_configs.py` as necessary?
- [ ] Did you perform system-level testing manually as described in master readme doc? Did it pass completely? If not why?
- [ ] Did you perform system-level testing manually, using `--cli_cluster_config` and `--snakemake_cluster_config`
options, as described in the [documentation](https://quac.readthedocs.io/en/stable/system_testing/)? Did it pass
completely? If not why?
- [ ] Updated `Changelog.md` file with change logs in recommended format?


2 changes: 1 addition & 1 deletion .github/workflows/action.yml
Original file line number Diff line number Diff line change
@@ -4,7 +4,7 @@ on: push

jobs:
  markdown-link-check:
    runs-on: ubuntu-latest
    runs-on: ubuntu-22.04
    steps:
      - uses: actions/checkout@master
      - uses: gaurav-nelson/github-action-markdown-link-check@v1
23 changes: 23 additions & 0 deletions .github/workflows/draft-pdf.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@
on: [push]

jobs:
  paper:
    runs-on: ubuntu-latest
    name: Paper Draft
    steps:
      - name: Checkout
        uses: actions/checkout@v3
      - name: Build draft PDF
        uses: openjournals/openjournals-draft-action@master
        with:
          journal: joss
          # This should be the path to the paper within your repo.
          paper-path: paper/paper.md
      - name: Upload
        uses: actions/upload-artifact@v1
        with:
          name: paper
          # This is the output path where Pandoc will write the compiled
          # PDF. Note, this should be the same directory as the input
          # paper.md
          path: paper/paper.pdf
121 changes: 121 additions & 0 deletions .github/workflows/system_testing.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,121 @@
name: system_testing
on:
  # push:
  #   paths:
  #     - ".github/workflows/system_testing.yml"
  #     - ".test/**"
  #     - "configs/**"
  #     - "src/**"
  #     - "workflow/**"
  workflow_dispatch:

jobs:
  system-testing:
    name: System testing - QuaC
    runs-on: ubuntu-20.04
    defaults:
      run:
        shell: bash -l {0}

    steps:
      - name: Frees Disk Space (Ubuntu)
        # For more info about this task, see https://github.com/uab-cgds-worthey/quac/issues/78
        uses: jlumbroso/free-disk-space@v1.2.0
        with:
          # this might remove tools that are actually needed, when set to "true"
          tool-cache: true

      - name: Checkout repository
        uses: actions/checkout@v2

      - name: Create quac environment
        uses: conda-incubator/setup-miniconda@v2
        with:
          mamba-version: "*"
          channels: conda-forge,bioconda,defaults
          auto-activate-base: false
          activate-environment: quac
          environment-file: configs/env/quac.yaml

      - name: Check conda solution
        run: |
          mamba env export
      - name: Check snakemake exists in conda env
        run: |
          which snakemake
          snakemake --version
      - uses: eWaterCycle/setup-singularity@v7
        with:
          singularity-version: 3.8.3

      - name: Check singularity is working
        run: |
          singularity --version
      - name: Set up dependencies for QuaC
        run: |
          bash src/setup_dependency_datasets.sh
      - name: Run QuaC system testing - WGS mode AND no prior QC data
        run: |
          PROJECT_CONFIG="project_2samples"
          PRIOR_QC_STATUS="no_priorQC"
          USE_SLURM=""
          python src/run_quac.py \
            --project_name test_project \
            --projects_path ".test/ngs-data/" \
            --pedigree ".test/configs/${PRIOR_QC_STATUS}/${PROJECT_CONFIG}.ped" \
            --outdir "data/quac/results/test_${PROJECT_CONFIG}_wgs-${PRIOR_QC_STATUS}/analysis" \
            --quac_watch_config "configs/quac_watch/wgs_quac_watch_config.yaml" \
            --workflow_config "configs/workflow.yaml" \
            $USE_SLURM
      - name: Run QuaC system testing - Exome mode AND no prior QC data
        run: |
          PROJECT_CONFIG="project_2samples"
          PRIOR_QC_STATUS="no_priorQC"
          USE_SLURM=""
          python src/run_quac.py \
            --project_name test_project \
            --projects_path ".test/ngs-data/" \
            --pedigree ".test/configs/${PRIOR_QC_STATUS}/${PROJECT_CONFIG}.ped" \
            --outdir "data/quac/results/test_${PROJECT_CONFIG}_exome-${PRIOR_QC_STATUS}/analysis" \
            --quac_watch_config "configs/quac_watch/exome_quac_watch_config.yaml" \
            --workflow_config "configs/workflow.yaml" \
            --exome \
            $USE_SLURM
      - name: Run QuaC system testing - WGS mode AND uses prior QC data
        run: |
          PROJECT_CONFIG="project_2samples"
          PRIOR_QC_STATUS="include_priorQC"
          USE_SLURM=""
          python src/run_quac.py \
            --project_name test_project \
            --projects_path ".test/ngs-data/" \
            --pedigree ".test/configs/${PRIOR_QC_STATUS}/${PROJECT_CONFIG}.ped" \
            --outdir "data/quac/results/test_${PROJECT_CONFIG}_wgs-${PRIOR_QC_STATUS}/analysis" \
            --quac_watch_config "configs/quac_watch/wgs_quac_watch_config.yaml" \
            --include_prior_qc \
            --allow_sample_renaming \
            --workflow_config "configs/workflow.yaml" \
            $USE_SLURM
      - name: Run QuaC system testing - Exome mode AND uses prior QC data
        run: |
          PROJECT_CONFIG="project_2samples"
          PRIOR_QC_STATUS="include_priorQC"
          USE_SLURM=""
          python src/run_quac.py \
            --project_name test_project \
            --projects_path ".test/ngs-data/" \
            --pedigree ".test/configs/${PRIOR_QC_STATUS}/${PROJECT_CONFIG}.ped" \
            --outdir "data/quac/results/test_${PROJECT_CONFIG}_exome-${PRIOR_QC_STATUS}/analysis" \
            --quac_watch_config "configs/quac_watch/exome_quac_watch_config.yaml" \
            --exome \
            --include_prior_qc \
            --allow_sample_renaming \
            --workflow_config "configs/workflow.yaml" \
            $USE_SLURM
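Note: the four `run_quac.py` steps in this workflow differ only in sequencing mode (WGS or exome) and in whether prior QC data is included. The Python sketch below illustrates that pattern and is not part of the QuaC repository. The flag names and paths are copied from the workflow steps as shown; the helper `build_command` and its parameters are hypothetical, and the sketch leaves out the optional `$USE_SLURM` argument, which is empty in every step.

```python
import shlex
from itertools import product
from typing import List


def build_command(exome: bool, prior_qc: bool,
                  project_config: str = "project_2samples") -> List[str]:
    """Assemble one run_quac.py invocation (hypothetical helper, for illustration)."""
    status = "include_priorQC" if prior_qc else "no_priorQC"
    mode = "exome" if exome else "wgs"
    cmd = [
        "python", "src/run_quac.py",
        "--project_name", "test_project",
        "--projects_path", ".test/ngs-data/",
        "--pedigree", f".test/configs/{status}/{project_config}.ped",
        "--outdir", f"data/quac/results/test_{project_config}_{mode}-{status}/analysis",
        "--quac_watch_config", f"configs/quac_watch/{mode}_quac_watch_config.yaml",
        "--workflow_config", "configs/workflow.yaml",
    ]
    if exome:
        cmd.append("--exome")
    if prior_qc:
        # Both flags appear together in the prior-QC steps of the workflow.
        cmd += ["--include_prior_qc", "--allow_sample_renaming"]
    return cmd


if __name__ == "__main__":
    # Same order as the workflow: no prior QC (WGS, exome), then prior QC (WGS, exome).
    for prior_qc, exome in product([False, True], repeat=2):
        print(shlex.join(build_command(exome=exome, prior_qc=prior_qc)))
```

Generating the commands from two booleans makes it easier to see that the test matrix is complete (2 modes x 2 prior-QC settings), though the order of flags differs slightly from the hand-written steps.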
2 changes: 2 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -86,3 +86,5 @@ logs/
# .java/fonts dir get created when creating fastqc conda env
.java/

# data retrieved for sys testing
.test/dependency_datasets
2 changes: 1 addition & 1 deletion .test/README.md
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
# Testing
# System Testing

Input directory structure to QuaC is based on the output directory structure of the [Small variant caller
pipeline](https://gitlab.rc.uab.edu/center-for-computational-genomics-and-data-science/sciops/pipelines/small_variant_caller_pipeline).
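3 changes: 3 additions & 0 deletions .test/configs/include_priorQC/sample_config/project_2samples_exome.tsv
Original file line number Diff line number Diff line change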
@@ -0,0 +1,3 @@
sample_id bam vcf capture_bed fastqc_raw fastqc_trimmed fastq_screen dedup multiqc_rename_config
A .test/ngs-data/test_project/analysis/A/bam/A.bam .test/ngs-data/test_project/analysis/A/vcf/A.vcf.gz .test/ngs-data/test_project/analysis/A/configs/small_variant_caller/capture_regions.bed .test/ngs-data/test_project/analysis/A/qc/fastqc-raw/A-1-R1_fastqc.zip,.test/ngs-data/test_project/analysis/A/qc/fastqc-raw/A-1-R2_fastqc.zip,.test/ngs-data/test_project/analysis/A/qc/fastqc-raw/A-2-R1_fastqc.zip,.test/ngs-data/test_project/analysis/A/qc/fastqc-raw/A-2-R2_fastqc.zip .test/ngs-data/test_project/analysis/A/qc/fastqc-trimmed/A-1-R1_fastqc.zip,.test/ngs-data/test_project/analysis/A/qc/fastqc-trimmed/A-1-R2_fastqc.zip,.test/ngs-data/test_project/analysis/A/qc/fastqc-trimmed/A-2-R1_fastqc.zip,.test/ngs-data/test_project/analysis/A/qc/fastqc-trimmed/A-2-R2_fastqc.zip .test/ngs-data/test_project/analysis/A/qc/fastq_screen-trimmed/A-1-R1_screen.txt,.test/ngs-data/test_project/analysis/A/qc/fastq_screen-trimmed/A-1-R2_screen.txt,.test/ngs-data/test_project/analysis/A/qc/fastq_screen-trimmed/A-2-R1_screen.txt,.test/ngs-data/test_project/analysis/A/qc/fastq_screen-trimmed/A-2-R2_screen.txt .test/ngs-data/test_project/analysis/A/qc/dedup/A-1.metrics.txt,.test/ngs-data/test_project/analysis/A/qc/dedup/A-2.metrics.txt .test/ngs-data/test_project/analysis/A/qc/multiqc_initial_pass/multiqc_sample_rename_config/A_rename_config.tsv
B .test/ngs-data/test_project/analysis/B/bam/B.bam .test/ngs-data/test_project/analysis/B/vcf/B.vcf.gz .test/ngs-data/test_project/analysis/B/configs/small_variant_caller/capture_regions.bed .test/ngs-data/test_project/analysis/B/qc/fastqc-raw/B-1-R1_fastqc.zip,.test/ngs-data/test_project/analysis/B/qc/fastqc-raw/B-1-R2_fastqc.zip,.test/ngs-data/test_project/analysis/B/qc/fastqc-raw/B-2-R1_fastqc.zip,.test/ngs-data/test_project/analysis/B/qc/fastqc-raw/B-2-R2_fastqc.zip .test/ngs-data/test_project/analysis/B/qc/fastqc-trimmed/B-1-R1_fastqc.zip,.test/ngs-data/test_project/analysis/B/qc/fastqc-trimmed/B-1-R2_fastqc.zip,.test/ngs-data/test_project/analysis/B/qc/fastqc-trimmed/B-2-R1_fastqc.zip,.test/ngs-data/test_project/analysis/B/qc/fastqc-trimmed/B-2-R2_fastqc.zip .test/ngs-data/test_project/analysis/B/qc/fastq_screen-trimmed/B-1-R1_screen.txt,.test/ngs-data/test_project/analysis/B/qc/fastq_screen-trimmed/B-1-R2_screen.txt,.test/ngs-data/test_project/analysis/B/qc/fastq_screen-trimmed/B-2-R1_screen.txt,.test/ngs-data/test_project/analysis/B/qc/fastq_screen-trimmed/B-2-R2_screen.txt .test/ngs-data/test_project/analysis/B/qc/dedup/B-1.metrics.txt,.test/ngs-data/test_project/analysis/B/qc/dedup/B-2.metrics.txt .test/ngs-data/test_project/analysis/B/qc/multiqc_initial_pass/multiqc_sample_rename_config/B_rename_config.tsv
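3 changes: 3 additions & 0 deletions .test/configs/include_priorQC/sample_config/project_2samples_wgs.tsv
Original file line number Diff line number Diff line change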
@@ -0,0 +1,3 @@
sample_id bam vcf fastqc_raw fastqc_trimmed fastq_screen dedup multiqc_rename_config
A .test/ngs-data/test_project/analysis/A/bam/A.bam .test/ngs-data/test_project/analysis/A/vcf/A.vcf.gz .test/ngs-data/test_project/analysis/A/qc/fastqc-raw/A-1-R1_fastqc.zip,.test/ngs-data/test_project/analysis/A/qc/fastqc-raw/A-1-R2_fastqc.zip,.test/ngs-data/test_project/analysis/A/qc/fastqc-raw/A-2-R1_fastqc.zip,.test/ngs-data/test_project/analysis/A/qc/fastqc-raw/A-2-R2_fastqc.zip .test/ngs-data/test_project/analysis/A/qc/fastqc-trimmed/A-1-R1_fastqc.zip,.test/ngs-data/test_project/analysis/A/qc/fastqc-trimmed/A-1-R2_fastqc.zip,.test/ngs-data/test_project/analysis/A/qc/fastqc-trimmed/A-2-R1_fastqc.zip,.test/ngs-data/test_project/analysis/A/qc/fastqc-trimmed/A-2-R2_fastqc.zip .test/ngs-data/test_project/analysis/A/qc/fastq_screen-trimmed/A-1-R1_screen.txt,.test/ngs-data/test_project/analysis/A/qc/fastq_screen-trimmed/A-1-R2_screen.txt,.test/ngs-data/test_project/analysis/A/qc/fastq_screen-trimmed/A-2-R1_screen.txt,.test/ngs-data/test_project/analysis/A/qc/fastq_screen-trimmed/A-2-R2_screen.txt .test/ngs-data/test_project/analysis/A/qc/dedup/A-1.metrics.txt,.test/ngs-data/test_project/analysis/A/qc/dedup/A-2.metrics.txt .test/ngs-data/test_project/analysis/A/qc/multiqc_initial_pass/multiqc_sample_rename_config/A_rename_config.tsv
B .test/ngs-data/test_project/analysis/B/bam/B.bam .test/ngs-data/test_project/analysis/B/vcf/B.vcf.gz .test/ngs-data/test_project/analysis/B/qc/fastqc-raw/B-1-R1_fastqc.zip,.test/ngs-data/test_project/analysis/B/qc/fastqc-raw/B-1-R2_fastqc.zip,.test/ngs-data/test_project/analysis/B/qc/fastqc-raw/B-2-R1_fastqc.zip,.test/ngs-data/test_project/analysis/B/qc/fastqc-raw/B-2-R2_fastqc.zip .test/ngs-data/test_project/analysis/B/qc/fastqc-trimmed/B-1-R1_fastqc.zip,.test/ngs-data/test_project/analysis/B/qc/fastqc-trimmed/B-1-R2_fastqc.zip,.test/ngs-data/test_project/analysis/B/qc/fastqc-trimmed/B-2-R1_fastqc.zip,.test/ngs-data/test_project/analysis/B/qc/fastqc-trimmed/B-2-R2_fastqc.zip .test/ngs-data/test_project/analysis/B/qc/fastq_screen-trimmed/B-1-R1_screen.txt,.test/ngs-data/test_project/analysis/B/qc/fastq_screen-trimmed/B-1-R2_screen.txt,.test/ngs-data/test_project/analysis/B/qc/fastq_screen-trimmed/B-2-R1_screen.txt,.test/ngs-data/test_project/analysis/B/qc/fastq_screen-trimmed/B-2-R2_screen.txt .test/ngs-data/test_project/analysis/B/qc/dedup/B-1.metrics.txt,.test/ngs-data/test_project/analysis/B/qc/dedup/B-2.metrics.txt .test/ngs-data/test_project/analysis/B/qc/multiqc_initial_pass/multiqc_sample_rename_config/B_rename_config.tsv
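3 changes: 3 additions & 0 deletions .test/configs/no_priorQC/sample_config/project_2samples_exome.tsv
Original file line number Diff line number Diff line change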
@@ -0,0 +1,3 @@
sample_id bam vcf capture_bed
C .test/ngs-data/test_project/analysis/C/bam/C.bam .test/ngs-data/test_project/analysis/C/vcf/C.vcf.gz .test/ngs-data/test_project/analysis/C/configs/small_variant_caller/capture_regions.bed
D .test/ngs-data/test_project/analysis/D/bam/D.bam .test/ngs-data/test_project/analysis/D/vcf/D.vcf.gz .test/ngs-data/test_project/analysis/D/configs/small_variant_caller/capture_regions.bed
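3 changes: 3 additions & 0 deletions .test/configs/no_priorQC/sample_config/project_2samples_wgs.tsv
Original file line number Diff line number Diff line change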
@@ -0,0 +1,3 @@
sample_id bam vcf
C .test/ngs-data/test_project/analysis/C/bam/C.bam .test/ngs-data/test_project/analysis/C/vcf/C.vcf.gz
D .test/ngs-data/test_project/analysis/D/bam/D.bam .test/ngs-data/test_project/analysis/D/vcf/D.vcf.gz
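Note: the sample config files above share one layout: a header row, one row per sample, and, in the prior-QC variants, columns that hold several comma-separated paths. The sketch below shows one way such a file could be read in Python. It assumes the columns are tab-separated (suggested by the `.tsv` extension) and that only `fastqc_raw`, `fastqc_trimmed`, `fastq_screen` and `dedup` hold multiple paths. The function `parse_sample_config` is hypothetical and is not the implementation in `src/read_sample_config.py`, whose contents are not shown here.

```python
import csv
import io

# Assumption: these are the only columns that hold comma-separated path lists.
MULTI_VALUE_COLUMNS = {"fastqc_raw", "fastqc_trimmed", "fastq_screen", "dedup"}


def parse_sample_config(handle):
    """Return {sample_id: {column: value}} from a tab-separated sample config.

    Hypothetical helper for illustration; multi-value columns become lists.
    """
    samples = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        sample_id = row.pop("sample_id")
        if sample_id in samples:
            raise ValueError(f"Duplicate sample_id: {sample_id}")
        samples[sample_id] = {
            column: value.split(",") if column in MULTI_VALUE_COLUMNS else value
            for column, value in row.items()
        }
    return samples


if __name__ == "__main__":
    # Shortened, made-up paths; the real files use the full paths shown above.
    example = (
        "sample_id\tbam\tvcf\tdedup\n"
        "A\tA.bam\tA.vcf.gz\tA-1.metrics.txt,A-2.metrics.txt\n"
        "B\tB.bam\tB.vcf.gz\tB-1.metrics.txt,B-2.metrics.txt\n"
    )
    for sample, fields in parse_sample_config(io.StringIO(example)).items():
        print(sample, fields)
```

Keying the result by `sample_id` and rejecting duplicates catches a common copy-paste mistake early; whether QuaC itself performs that check is not visible in this diff.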
21 changes: 10 additions & 11 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,17 +1,16 @@
[![Snakemake](https://img.shields.io/badge/snakemake-6.0.5-brightgreen.svg?style=flat)](https://snakemake.readthedocs.io)
[![ReadTheDocs](https://readthedocs.org/projects/quac/badge/?version=latest)](https://quac.readthedocs.io/en/stable/)

[![DOI JOSS](https://joss.theoj.org/papers/10.21105/joss.05313/status.svg)](https://doi.org/10.21105/joss.05313)
[![Zenodo](https://zenodo.org/badge/DOI/10.5281/zenodo.10002036.svg)](https://doi.org/10.5281/zenodo.10002036)

# QuaC

🦆🦆 Don't duck that QC thingy 🦆🦆


> **_NOTE:_** In a past life, QuaC used a different remote Git management provider, [UAB
> Gitlab](https://gitlab.rc.uab.edu/center-for-computational-genomics-and-data-science/public/quac). It was migrated to
> Github in Jan 2023, and the Gitlab version has been archived.

## What is QuaC?

QuaC is a snakemake-based pipeline that runs several QC tools for WGS/WES samples and then summarizes their results
@@ -28,31 +27,31 @@ In summary, QuaC performs the following:
- Optionally, the above-mentioned QuaC-Watch and QC aggregation steps can accept pre-run results from a few QC tools (fastqc,
fastq-screen, picard's markduplicates) when run with flag `--include_prior_qc`.


> **_NOTE:_** QuaC is built for use with human WGS/WES data. If you would like to use it with non-human data, please
> modify the pipeline as needed -- especially the thresholds used in QuaC-Watch configs.

## Documentation

Full documentation, including installation and how to run QuaC, is available at https://quac.readthedocs.io.
Full documentation, including installation and how to run QuaC, is available at <https://quac.readthedocs.io>.

## Citing QuaC

## Repo owner
If you use QuaC, please cite:

Gajapathy et al., (2023). QuaC: A Pipeline Implementing Quality Control Best Practices for Genome Sequencing and Exome Sequencing Data. Journal of Open Source Software, 8(90), 5313, <https://doi.org/10.21105/joss.05313>

* **Mana**valan Gajapathy
## Repo owner

- **Mana**valan Gajapathy

## License

[GNU GPLv3](./LICENSE)


## Contributing

See [here](./docs/CONTRIBUTING.md) for contributing guidelines.


## Changelog

See [here](./docs/Changelog.md)
See [here](./docs/Changelog.md)
7 changes: 7 additions & 0 deletions configs/cli_cluster_config.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
{
    "partition": "express",
    "ntasks": "1",
    "time": "02:00:00",
    "cpus-per-task": "1",
    "mem-per-cpu": "8G"
}
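Note: the keys in this config match the names of Slurm `sbatch` long options (`--partition`, `--ntasks`, `--time`, `--cpus-per-task`, `--mem-per-cpu`). The sketch below assumes each key/value pair is passed through as `--key=value`; how `src/run_quac.py` actually consumes the file is not shown in this part of the diff, and the helper `sbatch_args` is hypothetical.

```python
import json
from typing import Dict, List


def sbatch_args(config: Dict[str, str]) -> List[str]:
    """Turn {"partition": "express", ...} into ["--partition=express", ...].

    Hypothetical helper: assumes every key is a valid sbatch long option.
    """
    return [f"--{key}={value}" for key, value in config.items()]


if __name__ == "__main__":
    # Same content as configs/cli_cluster_config.json above.
    config = json.loads("""
    {
        "partition": "express",
        "ntasks": "1",
        "time": "02:00:00",
        "cpus-per-task": "1",
        "mem-per-cpu": "8G"
    }
    """)
    print(" ".join(["sbatch", *sbatch_args(config)]))
```

Keeping the option names identical to Slurm's means the config needs no translation table, at the cost of tying the file format to one scheduler.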
33 changes: 0 additions & 33 deletions configs/env/bcftools.yaml

This file was deleted.

27 changes: 0 additions & 27 deletions configs/env/goleft.yaml

This file was deleted.

30 changes: 0 additions & 30 deletions configs/env/mosdepth.yaml

This file was deleted.
