How to estimate read count #19

Open

yejunbin opened this issue Oct 12, 2024 · 1 comment

yejunbin commented Oct 12, 2024

Hi,

Is it possible for sylph to estimate the read count for each genome or taxon, the way MetaPhlAn or Kraken do?

thanks

Owner

bluenote-1577 commented Oct 12, 2024

Maybe I will add an option for estimating the read count. Sylph does not classify reads directly, so only an estimate can be provided.

For now, you can estimate the read count from sylph's output by doing the following:

1. Use the -u option. This multiplies the Sequence abundance column by the percentage of reads that were classified.
2. Multiply the Sequence abundance of each row by the number of reads in your dataset. So if your fastq file has 3M reads and a genome has a sequence abundance of 5%, then about 150k reads (3,000,000 × 0.05) should be assigned to it.
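
The arithmetic in the second step can be sketched in Python. This is only an illustration, not a sylph feature: the function name, the genome names, and the abundance values are hypothetical, and it assumes the sequence abundance is reported as a percentage from a run that used -u.

```python
def estimate_read_count(sequence_abundance_pct: float, total_reads: int) -> int:
    """Estimate the reads attributable to one genome.

    sequence_abundance_pct: Sequence abundance as a percentage (0 to 100),
        assumed to come from a sylph run that used -u.
    total_reads: number of reads in the input fastq file.
    """
    if not 0.0 <= sequence_abundance_pct <= 100.0:
        raise ValueError("sequence abundance must be a percentage between 0 and 100")
    if total_reads < 0:
        raise ValueError("total_reads must be non-negative")
    # Convert the percentage to a fraction, scale by the read total,
    # and round to the nearest whole read.
    return round(sequence_abundance_pct / 100.0 * total_reads)


# Hypothetical example values, not real sylph output.
abundances = {"genome_A.fa": 5.0, "genome_B.fa": 0.2}
total_reads = 3_000_000

estimates = {
    genome: estimate_read_count(pct, total_reads)
    for genome, pct in abundances.items()
}
print(estimates)
```

Since sylph does not classify individual reads, these numbers are only estimates and should not be read as exact per-read assignments.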

I'll probably add a feature to do this in a new update.

bluenote-1577 added the enhancement label
