Cutoff for SNV uncertainty #18

Open
liuxianghui opened this issue Sep 26, 2017 · 2 comments

Comments

@liuxianghui

Could you suggest a cutoff for the average error in the haplotype inferences? Is 10% OK?

In the example, 'Which we interpret as the best run had six haplotypes, five of which we are confident in and the average error in those inferences was 1.6%. The best haplotypes are given by the file ClusterEC_6_2/Filtered_Tau_star.csv. This is what we will use in the analysis below.'

In the paper, 'we calculated the number of haplotypes that had a mean SNV uncertainty (see above) below 10% and a mean relative abundance above 5%. We chose the optimal G to be the one that returned the most haplotypes satisfying these conditions of reproducibility and abundance.'
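
To make the selection rule quoted from the paper concrete, here is a minimal Python sketch of how I read it. This is not DESMAN code: the function names, the input layout (a dict mapping each G to a list of `(mean_snv_uncertainty, mean_relative_abundance)` pairs), the example numbers, and the tie-break toward the smaller G are all my own assumptions for illustration.

```python
from typing import Dict, List, Tuple

# Each haplotype is summarised as (mean SNV uncertainty, mean relative abundance),
# both expressed as fractions. This layout is hypothetical, not a DESMAN file format.
Haplotype = Tuple[float, float]


def count_confident(haplotypes: List[Haplotype],
                    max_err: float = 0.10,
                    min_abund: float = 0.05) -> int:
    """Count haplotypes with uncertainty below max_err and abundance above min_abund."""
    return sum(1 for err, abund in haplotypes
               if err < max_err and abund > min_abund)


def choose_optimal_g(runs: Dict[int, List[Haplotype]],
                     max_err: float = 0.10,
                     min_abund: float = 0.05) -> Tuple[int, int]:
    """Return (G, count) for the G with the most haplotypes passing both cutoffs.

    Ties go to the smallest G; that tie-break is an assumption, not from the paper.
    """
    if not runs:
        raise ValueError("runs must contain at least one value of G")
    best_g, best_count = -1, -1
    for g in sorted(runs):
        n = count_confident(runs[g], max_err, min_abund)
        if n > best_count:  # strict '>' keeps the smaller G on ties
            best_g, best_count = g, n
    return best_g, best_count


# Made-up numbers, purely illustrative.
example_runs = {
    2: [(0.01, 0.60), (0.02, 0.40)],
    3: [(0.01, 0.50), (0.03, 0.30), (0.20, 0.20)],
    4: [(0.01, 0.50), (0.03, 0.30), (0.05, 0.15), (0.40, 0.05)],
}
print(choose_optimal_g(example_runs))
```

With these made-up values, the G = 4 run has three haplotypes passing both cutoffs, so it would be selected; the fourth haplotype fails on both uncertainty and abundance.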

@chrisquince
Owner

Yes, for real data analyses I tend to use 10%. That may seem quite high, but I believe the reported value is an overestimate of the true uncertainty.

@liuxianghui
Author

So your suggestion is to discard those clusters with over 10% uncertainty?
In the latest paper I found S10, "Summary of results from applying the DESMAN pipeline to the 32 Tara MAGs with coverage > 100."
In that table, Err (the estimated percentage uncertainty in the inferred haplotypes) seems quite large, much larger than 0.10.

I am also a bit confused about ClusterEC.fa. In the E. coli example you first merged 5 clusters, but for the real dataset each cluster is treated as a MAG and DESMAN is applied to each cluster. ...

By the way, could you comment on the variant positions on core COGs? How many are needed to run DESMAN?
