🧹 cleared out a premature prod wf update and removed extra input
📝 added readme
migbro committed Oct 16, 2024
1 parent 49619a1 commit 9aae9f5
Showing 3 changed files with 33 additions and 6 deletions.
29 changes: 29 additions & 0 deletions docs/KFDRC_GATK_HC_MOD_PLOIDY_README.md
@@ -0,0 +1,29 @@
# Kids First DRC GATK HaplotypeCaller Modified Ploidy BETA Workflow
This is a research workflow for users wishing to modify the ploidy of certain
regions of their existing GVCF calls.

## Inputs

- input_cram: Input CRAM file
- input_gvcf: GVCF generated in standard workflow
- biospecimen_name: String name of biospecimen
- output_basename: String to use as the base for output filenames
- reference_fasta: FASTA file that was used during alignment. Also need
corresponding `.fai` and `.dict` files.
- region: Specific region to pull, in format 'chr21' or 'chr3:1-1000'
- dbsnp_vcf: dbSNP vcf file
- dbsnp_idx: dbSNP vcf index file
- contamination: Precalculated contamination value. Providing the value here
will skip the run of VerifyBAMID and use the provided value as ground truth.
- contamination_sites_bed: .Bed file for markers used in this
analysis, format (chr\tpos-1\tpos\trefAllele\taltAllele)
- contamination_sites_mu: .mu matrix file of genotype matrix
- contamination_sites_ud: .UD matrix file from SVD result of genotype matrix
- re_calling_interval_list: Interval list to re-call
- wgs_evaluation_interval_list: Target intervals to restrict GVCF metric
analysis (for VariantCallingMetrics)
- sample_ploidy: If sample/interval is expected to not have ploidy=2, enter expected ploidy

## Outputs

- mixed_ploidy_gvcf: Updated complete GVCF in which the desired region has had its ploidy updated
8 changes: 4 additions & 4 deletions workflows/kfdrc-gatk-haplotypecaller-ploidy-mod-wf.cwl
@@ -1,7 +1,7 @@
cwlVersion: v1.2
class: Workflow
id: kfdrc-gatk-haplotypecaller-ploidy-mod-workflow
-label: Kids First DRC GATK HaplotypeCaller Modified Ploidy Workflow
+label: Kids First DRC GATK HaplotypeCaller Modified Ploidy BETA Workflow
doc: "This workflow re-runs a subset of regions with a different expected ploidy and re-integrates those results into existing results"

requirements:
@@ -17,8 +17,6 @@ inputs:
name: Homo_sapiens_assembly38.fasta, secondaryFiles: [{class: File, path: 60639016357c3a53540ca7af, name: Homo_sapiens_assembly38.fasta.fai}, {class: File, path: 60639019357c3a53540ca7e7,
name: Homo_sapiens_assembly38.dict}]},
secondaryFiles: ['.fai', '^.dict']}
-reference_dict: {type: 'File?', "sbg:suggestedValue": {class: File, path: 60639019357c3a53540ca7e7,
-name: Homo_sapiens_assembly38.dict}}
region: { type: 'string?', doc: "Specific region to pull, in format 'chr21' or 'chr3:1-1000'" }
dbsnp_vcf: {type: 'File', doc: "dbSNP vcf file", "sbg:suggestedValue": {class: File,
path: 6063901f357c3a53540ca84b, name: Homo_sapiens_assembly38.dbsnp138.vcf}}
@@ -80,7 +78,9 @@ steps:
output_basename: output_basename
dbsnp_vcf: dbsnp_vcf
dbsnp_idx: dbsnp_idx
-reference_dict: reference_dict
+reference_dict:
+  source: reference_fasta
+  valueFrom: "$(self.secondaryFiles.filter(function(e) {return e.nameext == '.dict'})[0])"
wgs_calling_interval_list: re_calling_interval_list
wgs_evaluation_interval_list: wgs_evaluation_interval_list
conditional_run:
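The `valueFrom` expression on the `reference_dict` step input above picks the `.dict` entry out of the secondary files attached to `reference_fasta`, which is why the separate `reference_dict` workflow input could be dropped. A minimal sketch of that selection in plain JavaScript follows. It assumes the workflow enables JavaScript expressions; `referenceFasta` and `pickDict` are hypothetical names used only for illustration, and the object is a simplified stand-in for the File object CWL passes as `self`.

```javascript
// Hypothetical stand-in for the File object that CWL passes as `self`
// when `source: reference_fasta` is set; only the fields used here are shown.
const referenceFasta = {
  class: "File",
  basename: "Homo_sapiens_assembly38.fasta",
  nameext: ".fasta",
  secondaryFiles: [
    { class: "File", basename: "Homo_sapiens_assembly38.fasta.fai", nameext: ".fai" },
    { class: "File", basename: "Homo_sapiens_assembly38.dict", nameext: ".dict" },
  ],
};

// Return the first secondary file whose extension is `.dict`,
// or undefined when none is attached.
function pickDict(file) {
  return (file.secondaryFiles || []).filter(function (e) {
    return e.nameext === ".dict";
  })[0];
}

console.log(pickDict(referenceFasta).basename); // prints "Homo_sapiens_assembly38.dict"
```

If no `.dict` secondary file is attached, the selection yields `undefined`, so the downstream step would receive no dictionary. This sketch guards against a missing `secondaryFiles` list, which the one-line workflow expression does not.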
2 changes: 0 additions & 2 deletions workflows/kfdrc-gatk-haplotypecaller-wf.cwl
@@ -113,7 +113,6 @@ inputs:
class: File, path: 60639017357c3a53540ca7d3, name: wgs_evaluation_regions.hg38.interval_list}}
run_sex_metrics: {type: 'boolean?', default: false, doc: "idxstats will be collected\
\ and X/Y ratios calculated"}
-sample_ploidy: { type: 'int?', doc: "If sample/interval is expected to not have ploidy=2, enter expected ploidy" }
outputs:
gvcf: {type: File, outputSource: generate_gvcf/gvcf}
gvcf_calling_metrics: {type: 'File[]', outputSource: generate_gvcf/gvcf_calling_metrics}
@@ -159,7 +158,6 @@ steps:
valueFrom: $(1)
contamination: contamination
biospecimen_name: biospecimen_name
-sample_ploidy: sample_ploidy
out: [verifybamid_output, gvcf, gvcf_calling_metrics]
$namespaces:
sbg: https://sevenbridges.com