A process in the process pool was terminated abruptly while the future was running or pending. #72

Closed
HornDW opened this issue Aug 3, 2023 · 8 comments


@HornDW

HornDW commented Aug 3, 2023

Hi,

I'm new to the NanoComp tool, but I have successfully generated reports with it before. This run compares a rather large BAM file (~73 GB) against another BAM file (~24 GB). It runs for about an hour, and the log (attached) shows no problem other than a note that there are a lot of contigs.
NanoComp_20230803_1512.log

Is this a known issue when comparing larger nanopore BAM files? What would you recommend I do to get around it?

NanoComp --bam Promethion1.dMDA.bam Promethion2.dMDA.bam --outdir Promethion_Comp
Traceback (most recent call last):
  File "/home/dominic/.local/bin/NanoComp", line 8, in <module>
    sys.exit(main())
  File "/home/dominic/.local/lib/python3.10/site-packages/nanocomp/NanoComp.py", line 52, in main
    datadf = nanoget.get_input(
  File "/home/dominic/.local/lib/python3.10/site-packages/nanoget/nanoget.py", line 110, in get_input
    dfs=[out for out in executor.map(extraction_function, files)],
  File "/home/dominic/.local/lib/python3.10/site-packages/nanoget/nanoget.py", line 110, in <listcomp>
    dfs=[out for out in executor.map(extraction_function, files)],
  File "/usr/lib/python3.10/concurrent/futures/process.py", line 570, in _chain_from_iterable_of_lists
    for element in iterable:
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 621, in result_iterator
    yield _result_or_cancel(fs.pop())
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 319, in _result_or_cancel
    return fut.result(timeout)
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 458, in result
    return self.__get_result()
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.
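
For anyone landing here from the same error: Python raises BrokenProcessPool when a worker process dies without reporting back, for example when it is killed by the operating system, so no Python traceback from the worker ends up in the log. The sketch below only reproduces that behaviour with the standard library; it is not nanoget's code. The helper name run_in_pool is made up for illustration, os._exit stands in for a worker being killed, and the "fork" start method is assumed (available on Linux and macOS, not on Windows). Whether memory pressure from the large BAM files is what killed the worker in this issue is a guess, not something the log confirms.

```python
import multiprocessing
import os
from concurrent.futures import ProcessPoolExecutor
from concurrent.futures.process import BrokenProcessPool


def run_in_pool(func, items):
    """Map func over items in one worker process (hypothetical helper).

    Returns the list of results, or a short message if the worker died.
    """
    # Assumption: the "fork" start method, as on the Linux system above.
    ctx = multiprocessing.get_context("fork")
    try:
        with ProcessPoolExecutor(max_workers=1, mp_context=ctx) as executor:
            return list(executor.map(func, items))
    except BrokenProcessPool as exc:
        # The worker vanished, so this message is all the parent learns.
        return f"BrokenProcessPool: {exc}"


if __name__ == "__main__":
    # A healthy task: the worker returns normally.
    print(run_in_pool(len, ["small.bam"]))
    # os._exit ends the worker immediately, much like an external kill.
    print(run_in_pool(os._exit, [1]))
```

Because the parent only sees that the worker disappeared, the system log (for example the kernel log on Linux) is usually a better place to look for the cause than the tool's own log.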

@wdecoster
Owner

Hi!

Hmm, it is remarkable how little information the log file contains about this error, which is unfortunate. It doesn't give me many ideas for debugging this. One workaround could be to first extract the metrics from the BAM files into a .arrow file with this tool: https://github.com/wdecoster/make_arrow

You can download a binary from the releases. It is poorly documented, but it also doesn't have many options... The arrow files can then be used directly by NanoComp (which should be a lot quicker too). Please let me know how that goes.

Cheers,
Wouter

@HornDW
Author

HornDW commented Aug 7, 2023

Thanks for getting back to me,

I've downloaded and run the tool and generated two .arrow files for the runs I want to compare.
I then ran NanoComp with --feather.
This generated some, but not all, of the expected graphs, and the NanoStat file gave some funny-looking numbers for percentage identity. I've attached the NanoStat output and the log file. For reference, the inputs were merged BAM files; I'm not sure whether that has anything to do with it.

NanoComp_20230807_2051.log
NanoStats.txt

All the best,
Dominic

@wdecoster
Owner

Interesting, I suspect that the accuracy is off by 100-fold. Okay, that should be easy for me to fix.
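
A 100-fold discrepancy like this typically appears when one tool stores identity as a fraction (0 to 1) and another expects a percentage (0 to 100). The sketch below is only an illustration of that kind of sanity check, not the actual fix in make_arrow or NanoComp; the function name to_percent and the "all values at most 1" heuristic are assumptions.

```python
def to_percent(identities):
    """Return identity values as percentages (hypothetical helper).

    Heuristic: if no value exceeds 1.0, assume the values are fractions
    and rescale them. A dataset whose true identities are all below 1%
    would be misread, so treat this as a sanity check only.
    """
    values = list(identities)
    if values and max(values) <= 1.0:
        return [value * 100 for value in values]
    return values


if __name__ == "__main__":
    print(to_percent([0.5, 0.25]))   # fractions are rescaled
    print(to_percent([95.0, 88.5]))  # percentages are left alone
```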

Yes, the make_arrow tool does not extract all information from the BAM files, so the arrow file lacks the data needed for some plots. Which plots specifically would you like to be added?

@HornDW
Author

HornDW commented Aug 8, 2023

Thanks again for the help,

I think an overlaid histogram of read lengths and a quality score violin plot would be nice to have.

@wdecoster
Owner

Ah, I see in the log that NanoComp actually crashed before completing the plots. Okay, that seems more problematic, let me look into that :-)

@wdecoster
Owner

Do you think it would be possible to share one or both of the arrow files?

@HornDW
Author

HornDW commented Aug 11, 2023

Yes, that's fine. I just got permission to share them, and Christos Proukakis says hi 😄.

https://www.dropbox.com/scl/fi/fieru0b7mb1r19rus0q3o/Compressed-arrows.zip?rlkey=6qig9062u4o5fs74yvupgkmdh&dl=0

You should be able to download a .zip containing them from the link above.

@wdecoster
Owner

Oh hi!

I am making a new release of make_arrow in a couple of minutes that should fix this - but let me know if you still experience issues :)

Wouter
