🧠 memory hook for dedup #144

Merged
merged 3 commits into from
Nov 5, 2024
6 changes: 6 additions & 0 deletions docs/KFDRC_SENTIEON_ALIGNMENT_GVCF_WORKFLOW_README.md
@@ -90,6 +90,12 @@ Metrics collection and contamination estimation are unchanged.
| Gather VCFs | Picard MergeVcfs | No splitting occurs in Sentieon |
| Metrics | Picard CollectVariantCallingMetrics | Picard CollectVariantCallingMetrics |

### Workflow Troubleshooting

- Sentieon tools scale up RAM usage to match allocated CPUs. If a task is
running into memory issues, that can be solved by EITHER scaling UP the
task's allocated RAM OR scaling DOWN the task's allocated CPUs.
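
A minimal sketch of how this advice maps onto the workflow inputs this PR adds (`dedup_cpu`, `dedup_ram`, `bam_to_cram_cpu`, `bam_to_cram_ram`): a job inputs fragment that halves the DeDup CPUs and doubles its RAM relative to the defaults. The values are illustrative assumptions, not tested recommendations, and the workflow's other required inputs are omitted.

```yaml
# Hypothetical inputs fragment for workflows/kfdrc_sentieon_alignment_wf.cwl.
# Values are illustrative; tune them for your data and instance type.
dedup_cpu: 16          # default 32; fewer CPUs lowers Sentieon's RAM demand
dedup_ram: 64          # RAM in GB, default 32
bam_to_cram_cpu: 16    # left at the default
bam_to_cram_ram: 16    # RAM in GB, left at the default
```

Either change alone may be enough; the fragment applies both only to show where each setting lives.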

## Basic Info
- [D3b dockerfiles](https://github.com/d3b-center/bixtools)
- Testing Tools:
10 changes: 5 additions & 5 deletions tools/sentieon_ReadWriter.cwl
@@ -12,10 +12,8 @@ requirements:
- class: ShellCommandRequirement
- class: InlineJavascriptRequirement
- class: ResourceRequirement
coresMin: |
$(inputs.cpu_per_job ? inputs.cpu_per_job : 16)
ramMin: |
$(inputs.mem_per_job ? inputs.mem_per_job : 16000)
coresMin: $(inputs.cpu_per_job)
ramMin: $(inputs.mem_per_job * 1000)
- class: DockerRequirement
dockerPull: pgc-images.sbgenomics.com/hdchen/sentieon:202112.01_hifi
- class: EnvVarRequirement
@@ -114,10 +112,12 @@ inputs:
label: CPU per job
doc: CPU per job
type: int?
default: 16
- id: mem_per_job
label: Memory per job
doc: Memory per job[MB].
doc: Memory per job[GB].
type: int?
default: 16

outputs:
- id: output_reads
30 changes: 8 additions & 22 deletions tools/sentieon_dedup.cwl
@@ -17,28 +17,8 @@ doc: |-
requirements:
- class: ShellCommandRequirement
- class: ResourceRequirement
coresMin: |-
${
if (inputs.cpu_per_job)
{
return inputs.cpu_per_job
}
else
{
return 32
}
}
ramMin: |-
${
if (inputs.mem_per_job)
{
return inputs.mem_per_job
}
else
{
return 32000
}
}
coresMin: $(inputs.cpu_per_job)
ramMin: $(inputs.mem_per_job * 1000)
- class: DockerRequirement
dockerPull: pgc-images.sbgenomics.com/hdchen/sentieon:202112.01_hifi
- class: EnvVarRequirement
@@ -108,6 +88,12 @@ inputs:
label: Basename for output files
doc: Basename for the output files that are to be written.
type: string?
- id: cpu_per_job
type: int?
default: 32
- id: mem_per_job
type: int?
default: 32
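
For reference, a sketch of what the simplified `ResourceRequirement` should resolve to when `cpu_per_job` and `mem_per_job` are left unset, assuming the runner applies the input defaults before evaluating the expressions. The literal values are worked out by hand from the diff, not captured from a run.

```yaml
# Illustrative only: the requirement with the new defaults substituted in.
requirements:
  - class: ResourceRequirement
    coresMin: 32     # $(inputs.cpu_per_job), default 32
    ramMin: 32000    # $(inputs.mem_per_job * 1000), default 32 (GB)
```

Under that assumption the default request matches the 32 cores and 32000 that the removed fallback expressions returned; the difference is that `mem_per_job` is now given in GB rather than MB.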

outputs:
- id: metrics_file
8 changes: 8 additions & 0 deletions workflows/kfdrc_sentieon_alignment_wf.cwl
@@ -176,6 +176,10 @@ inputs:
bamtofastq_ram: {type: 'int?', default: 2, doc: "RAM in GB to allocate to bamtofastq"}
bwa_cpu: {type: 'int?', default: 36, doc: "CPUs to allocate to Sentieon BWA"}
bwa_ram: {type: 'int?', default: 72, doc: "RAM in GB to allocate to Sentieon BWA"}
dedup_cpu: {type: 'int?', default: 32, doc: "CPUs to allocate to Sentieon DeDup"}
dedup_ram: {type: 'int?', default: 32, doc: "RAM in GB to allocate to Sentieon DeDup"}
bam_to_cram_cpu: {type: 'int?', default: 16, doc: "CPUs to allocate to Sentieon BAM to CRAM"}
bam_to_cram_ram: {type: 'int?', default: 16, doc: "RAM in GB to allocate to Sentieon BAM to CRAM"}
run_t1k: {type: 'boolean?', default: true, doc: "Set to false to disable T1k HLA typing"}
hla_dna_ref_seqs: {type: 'File?', doc: "FASTA file containing the HLA allele reference sequences for DNA.", "sbg:suggestedValue": {
class: File, path: 6669ac8127374715fc3ba3c4, name: hla_v3.43.0_gencode_v39_dna_seq.fa}}
@@ -343,6 +347,8 @@ steps:
sentieon_license: sentieon_license
reference: untar_reference/indexed_fasta
in_alignments: sentieon_bwa_mem_payloads/realgn_bam
cpu_per_job: dedup_cpu
mem_per_job: dedup_ram
out: [metrics_file, out_alignments]
sentieon_bqsr:
run: ../tools/sentieon_bqsr.cwl
@@ -385,6 +391,8 @@ steps:
valueFrom: $(self.nameroot).cram
rm_cram_bai:
valueFrom: $(1 == 1)
cpu_per_job: bam_to_cram_cpu
mem_per_job: bam_to_cram_ram
out: [output_reads]
sentieon_hsmetrics:
run: ../tools/sentieon_HsMetricAlgo.cwl