High temp storage usage with samples_fillout_index_batch_workflow.cwl #106

Open
stevekm opened this issue Jul 12, 2022 · 1 comment
Labels: bug (Something isn't working), high priority, revisit this later (non-critical, non-breaking change to consider)

Comments

stevekm (Member) commented Jul 12, 2022

The workflow

https://github.com/mskcc/pluto-cwl/blob/master/cwl/samples_fillout_index_batch_workflow.cwl

and, by extension, its subworkflows are showing very high temporary storage usage. My guess is that the .bam indexing step is responsible, because I believe that step copies all input .bam files to a temporary staging directory.
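One way to test the copying guess would be to look inside the work dir while the space is in use and count how many .bam files are real files versus symlinks. This is only a sketch: the function names are made up here, and the work dir path has to be supplied by whoever runs it. Hard links would show up as regular files, so a high count is a hint rather than proof of copying.

```shell
# Hypothetical helpers, not part of pluto-cwl.

# count_bam_copies DIR: number of .bam entries under DIR that are regular files
count_bam_copies() {
    find "$1" -type f -name '*.bam' | wc -l | tr -d ' '
}

# count_bam_links DIR: number of .bam entries under DIR that are symlinks
count_bam_links() {
    find "$1" -type l -name '*.bam' | wc -l | tr -d ' '
}

# Usage sketch (path is a placeholder):
# echo "regular files: $(count_bam_copies /path/to/work_dir)"
# echo "symlinks:      $(count_bam_links /path/to/work_dir)"
```

If most of the staged .bam files turn out to be regular files rather than symlinks, that would be consistent with the staging step copying its inputs.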

When running the Miller project with 288 samples, the work dir took up 1TB of space before any jobs started, then dropped to about 20GB once jobs started running.

We need to keep an eye on this and see whether it can be fixed, because it will eventually break large runs where we don't have 1TB of free tmp space to rely on.
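Until the root cause is fixed, a pre-flight check could at least fail early when the tmp filesystem is too small. The sketch below uses POSIX `df -Pk`; the function names and the threshold are assumptions for illustration and are not taken from any existing pluto script.

```shell
# Hypothetical helpers, not part of pluto-cwl.

# available_kb DIR: print free space in KiB on the filesystem holding DIR
available_kb() {
    # df -P guarantees one unwrapped line per filesystem; column 4 is "Available"
    df -Pk "$1" | awk 'NR==2 {print $4}'
}

# has_free_space DIR REQUIRED_KB: succeed only if at least REQUIRED_KB KiB are free
has_free_space() {
    local avail
    avail="$(available_kb "$1")"
    [ -n "$avail" ] && [ "$avail" -ge "$2" ]
}

# Usage sketch: require roughly 1 TiB (1024^3 KiB) before launching a large run
# has_free_space "$TMPDIR" $((1024 * 1024 * 1024)) || {
#     echo "not enough free tmp space for this run" >&2
#     exit 1
# }
```

The right threshold depends on the number and size of the input .bam files, so a fixed 1 TiB is only a placeholder based on the Miller run.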

stevekm added the bug and high priority labels Jul 12, 2022
stevekm (Member, Author) commented Aug 2, 2022

Consider updating pluto/run-toil.sh to include a background du process that tracks disk usage while the workflow is running.
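A rough sketch of what such a background monitor might look like is below. It is not based on the actual contents of pluto/run-toil.sh; the function names, the log file, and the sampling interval are all assumptions.

```shell
# Hypothetical helpers, not taken from pluto/run-toil.sh.

# log_dir_size DIR LOG_FILE: append "<timestamp><TAB><size in KiB>" to LOG_FILE
log_dir_size() {
    local dir="$1" log_file="$2" size_kb
    # du -sk prints "<KiB><TAB><path>"; errors from files vanishing mid-scan are ignored
    size_kb="$(du -sk "$dir" 2>/dev/null | cut -f1)"
    printf '%s\t%s\n' "$(date '+%Y-%m-%dT%H:%M:%S')" "${size_kb:-NA}" >> "$log_file"
}

# monitor_dir_size DIR LOG_FILE [INTERVAL_SECONDS]: sample forever until killed
monitor_dir_size() {
    local dir="$1" log_file="$2" interval="${3:-60}"
    while true; do
        log_dir_size "$dir" "$log_file"
        sleep "$interval"
    done
}

# Usage sketch inside a run script (WORK_DIR is a placeholder):
# monitor_dir_size "$WORK_DIR" disk_usage.tsv 60 &
# monitor_pid=$!
# trap 'kill "$monitor_pid" 2>/dev/null' EXIT
# ... launch the workflow here ...
```

Note that `du` on a work dir holding hundreds of large files can itself be slow, so the interval should probably be generous. Sampling from before the first job starts would capture the early peak described above.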

svural added the revisit this later label May 31, 2024