Compare reads in two FASTQ files based on ReadID #77

Closed
LogCrab opened this issue May 7, 2024 · 2 comments

Comments

LogCrab commented May 7, 2024

Hi @wdecoster, I am interested in the pairwise identity between corresponding reads in two FASTQ files that were basecalled from the same FAST5 files but with different software, such as Guppy and Dorado. The goal is to compare the similarity between the outputs of different basecallers.
I think this is theoretically possible because the read ID is in the header of each FASTQ record. But after an extensive search, I found no software that can do the job. Would it be possible to add this function to NanoComp?
BTW, is there any plan to rewrite NanoComp in Rust?
Have a nice day!

@wdecoster
Owner

That is a highly specific task, and I am reluctant to support it in NanoComp. Would your preferred output be a table with one column for the read ID and one for the pairwise identity?
I guess you could iterate over both FASTQ files, use mappy (https://github.com/lh3/minimap2/tree/master/python) to align each read to its counterpart, and then use the NM tag to get the identity?
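
A minimal sketch of that approach, not a tested pipeline: it assumes uncompressed four-line FASTQ records, that the read ID is the first whitespace-delimited token of the header, and that one file fits in memory. The function names (`read_fastq`, `identity_from_hit`, `pairwise_identity`, `compare`) and the choice of identity as `1 - NM / blen` are illustrative assumptions; only `mappy.Aligner`, `Aligner.map`, and the hit attributes `NM`, `blen`, and `is_primary` come from mappy itself.

```python
def read_fastq(path):
    """Yield (read_id, sequence) from an uncompressed 4-line FASTQ file."""
    with open(path) as fh:
        it = iter(fh)
        for header in it:
            seq = next(it, "").strip()
            next(it, None)  # '+' separator line
            next(it, None)  # quality line
            fields = header[1:].split()
            if header.startswith("@") and fields:
                # Assumption: the read ID is the first token after '@'.
                yield fields[0], seq


def identity_from_hit(nm, blen):
    """Identity as 1 - NM / alignment block length (gaps count as edits)."""
    return 1.0 - nm / blen


def pairwise_identity(seq_a, seq_b, preset="map-ont"):
    """Align seq_b against seq_a with mappy; return identity or None."""
    import mappy as mp  # imported lazily so the parsing helpers work without it

    aligner = mp.Aligner(seq=seq_a, preset=preset)
    if not aligner:
        raise RuntimeError("failed to build minimap2 index")
    # Keep the longest primary alignment, if any.
    primary = [h for h in aligner.map(seq_b) if h.is_primary]
    if not primary:
        return None
    best = max(primary, key=lambda h: h.blen)
    return identity_from_hit(best.NM, best.blen)


def compare(fastq_a, fastq_b, identity_fn=pairwise_identity):
    """Yield (read_id, identity) for read IDs present in both files."""
    reads_a = dict(read_fastq(fastq_a))  # assumes file A fits in memory
    for read_id, seq_b in read_fastq(fastq_b):
        if read_id in reads_a:
            yield read_id, identity_fn(reads_a[read_id], seq_b)
```

Looping over `compare("guppy.fastq", "dorado.fastq")` (hypothetical file names) and printing each pair tab-separated would give the two-column table mentioned above. Building one index per read is slow for large runs, and for gzipped input `mappy.fastx_read` could replace the hand-written parser.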

Author

LogCrab commented May 7, 2024

Thank you for the suggestion.

LogCrab closed this as completed May 7, 2024