freeze at uv:search_protein_db #11

Open
i-artjom opened this issue Jan 24, 2023 · 8 comments
Labels
bug Something isn't working

Comments

@i-artjom

Hi :)

After installing the workflow and downloading the test data and databases, the run seems to freeze at uv:search_protein_db (it stalls for hours; I tried several times).

nextflow run main.nf --standalone --results results --uvdb db --genomes metadata.csv --annotate true
N E X T F L O W  ~  version 22.10.4
Launching `main.nf` [lethal_bartik] DSL2 - revision: af858ed795
executor >  local (8)
[51/3d6887] process > uv:minlen (2)                      [100%] 2 of 2 ✔
[2a/35bd49] process > uv:rename_contigs (2)              [100%] 2 of 2 ✔
[26/070222] process > uv:reading_frames (2)              [100%] 2 of 2 ✔
[5b/9faacf] process > uv:search_protein_db (2)           [  0%] 0 of 2
[-        ] process > uv:segment                         -
[-        ] process > uv:qc                              -
[-        ] process > uv:filter_qc                       -
[-        ] process > annotate:careful_frames            -
[-        ] process > annotate:search_hmms               -
[-        ] process > annotate:interpret_hmms            -
[-        ] process > annotate:collect_hmms              -
[-        ] process > annotate:search_protein_db_careful -
@phiweger
Owner

how much RAM do you have?

@phiweger
Owner

you can pass nextflow run ... --maxram 8 ... to limit memory use, see

https://github.com/phiweger/uv/blob/main/workflows/processes/uv.nf#L21

the protein search is brute force and the most computationally intensive step; on my laptop (16 GB RAM) it completes in about 20 min for the two test genomes.
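For reference, here is a minimal sketch of the full command from the original report with the `--maxram` cap appended. The helper name `uv_command` is hypothetical, and the value 8 is simply the one suggested above (its unit is presumably GB, but that is an assumption, not something confirmed in this thread). The function only prints the command, so it can be inspected before running.

```shell
# Hypothetical helper: print the workflow command with a --maxram cap.
# All other flags are copied from the original report; the default of 8
# is the value suggested in this thread (unit assumed to be GB).
uv_command() {
  maxram="${1:-8}"
  printf 'nextflow run main.nf --standalone --results results --uvdb db --genomes metadata.csv --annotate true --maxram %s\n' "$maxram"
}

# Usage: print the command, then run it once it looks right.
#   uv_command 8
#   eval "$(uv_command 8)"
```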

@i-artjom
Author

I let it run overnight and it finished, but I can't say exactly how long it took (I may be able to tell on the next run). I'm also running on 16 GB RAM.

@phiweger
Owner

doesn't nextflow give you the time it took to run?
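One way to recover timings: `nextflow log` lists previous runs along with their duration, and the `-with-trace` option writes a tab-separated per-task trace file. The sketch below is a hypothetical helper (`trace_durations` is not part of uv or Nextflow) that pulls the task name, status, and duration columns out of such a file. It assumes the trace has a header row containing `name`, `status`, and `duration`, which are among Nextflow's default trace fields.

```shell
# Hypothetical helper: print name, status, and duration for each task in a
# Nextflow trace file (tab-separated, header row). Columns are looked up by
# header name, so their order in the file does not matter.
trace_durations() {
  awk -F '\t' '
    NR == 1 {
      for (i = 1; i <= NF; i++) col[$i] = i
      if (!("name" in col) || !("status" in col) || !("duration" in col)) {
        print "trace file lacks name/status/duration columns" > "/dev/stderr"
        exit 1
      }
      next
    }
    { print $(col["name"]) "\t" $(col["status"]) "\t" $(col["duration"]) }
  ' "$1"
}

# Usage (requires Nextflow and a finished or failed run):
#   nextflow run main.nf ... -with-trace trace.txt
#   trace_durations trace.txt
#   nextflow log    # one line per past run, including its duration
```

The trace file is written per task, so it should still help when the run stops early with an error.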

@i-artjom
Author

Unfortunately I can't see it, but maybe that's because I'm still getting an error at annotate:collect_hmms (even though the annotate:search_tails process finishes):

Error executing process > 'annotate:collect_hmms (1)'

Caused by:
  Process `annotate:collect_hmms (1)` terminated with an error exit status (127)

Command executed:

  cat *.bed > all
  bedtools sort -i all > sorted
  /uv/bin/deduplicate_and_rename.py -i sorted -o annotation.bed --names 43b63e6b-323d-473a-8d19-d2d9238d965c.contig_names.txt

Command exit status:
  127

Command output:
  (empty)

Command error:
  .command.sh: line 3: bedtools: command not found

@phiweger
Owner

https://github.com/phiweger/uv/blob/main/workflows/processes/uv.nf#L243

bedtools is missing from env.yml, my bad. thanks for spotting. can you add it and rerun?
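Exit status 127 is the shell's "command not found" code, which matches the `bedtools: command not found` message above. As a sketch, a small pre-flight check can catch this before a long run; `require_tools` is a hypothetical helper, and the conda command in the usage note assumes the environment built from env.yml is already active.

```shell
# Hypothetical helper: report every tool that is not on PATH and return 127
# (the shell's "command not found" status) if at least one is missing.
require_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  [ "$missing" -eq 0 ] || return 127
}

# Usage: check before launching the workflow, install if needed.
#   require_tools nextflow bedtools \
#     || conda install -c conda-forge -c bioconda bedtools
```

Adding `bedtools` to the dependencies in env.yml is the longer-term fix, since that keeps fresh installs from hitting the same error.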

@i-artjom
Author

i-artjom commented Mar 2, 2023

runs smoothly now with the test genomes and finishes in 5 min 🤌

@phiweger phiweger closed this as completed Mar 2, 2023
@phiweger phiweger reopened this Mar 2, 2023
@phiweger
Owner

phiweger commented Mar 2, 2023

haha, now I still need to fix that

@phiweger phiweger added the bug Something isn't working label Mar 2, 2023