Source for number of variants and samples on main page #129
Comments
1d272c3
How did you do this? I just downloaded the file, and there are 1410 unique positions that meet this criterion:
import pandas as pd
df = pd.read_csv('https://covid19.galaxyproject.org/genomics/4-Variation/variant_list.tsv', sep='\t')
intrahost = df[(df['AF'] > 0.05) & (df['AF'] < 0.95)]
len(intrahost.POS.unique())
This returns 1410, and more if you also count by variant.
Ah, I pasted the same result snippet twice above, sorry!
... and if I go with the filter which seems to be what you obtained, right?
I used plain Python for this:
with, e.g.,:
Currently, the main page states this under Results for Genomics:
This leaves two questions:
When I analyzed https://covid19.galaxyproject.org/genomics/4-Variation/variant_list.tsv today, I got this for the filter condition 0.95 >= float(af) >= 0.05:
Samples with variants: 378
Total number of variants observed: 260
Number of sites observed to carry variants: 259
For the fixed differences I tried float(af) == 1.0, giving:
Samples with variants: 55
Total number of variants observed: 27
Number of sites observed to carry variants: 27
and float(af) > 0.95, resulting in:
Samples with variants: 378
Total number of variants observed: 260
Number of sites observed to carry variants: 259
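For anyone who wants to reproduce counts like the three above, here is a minimal plain-Python sketch. It is not the script used for the numbers in this issue. Only the AF and POS columns are confirmed in this thread; the Sample, REF and ALT column names, the summarize helper and the demo table are assumptions made up for illustration, so adjust them to the actual header of variant_list.tsv.

```python
import csv
import io


def summarize(rows, keep):
    """Count distinct samples, variants and sites among rows whose AF passes `keep`."""
    samples, variants, sites = set(), set(), set()
    for row in rows:
        if not keep(float(row["AF"])):
            continue
        samples.add(row["Sample"])  # "Sample" column name is an assumption
        # A variant is taken to be position + change; REF/ALT names are assumptions
        variants.add((row["POS"], row["REF"], row["ALT"]))
        sites.add(row["POS"])
    return {"samples": len(samples), "variants": len(variants), "sites": len(sites)}


# Tiny made-up table for illustration only, not data from variant_list.tsv
DEMO_TSV = (
    "Sample\tPOS\tREF\tALT\tAF\n"
    "s1\t100\tA\tG\t0.50\n"
    "s2\t100\tA\tT\t0.10\n"
    "s2\t200\tC\tT\t1.00\n"
    "s3\t300\tG\tA\t0.95\n"
)


def demo_rows():
    # A fresh reader per call, because a DictReader can only be consumed once
    return csv.DictReader(io.StringIO(DEMO_TSV), delimiter="\t")


intrahost = summarize(demo_rows(), lambda af: 0.05 <= af <= 0.95)
strict = summarize(demo_rows(), lambda af: 0.05 < af < 0.95)
fixed = summarize(demo_rows(), lambda af: af > 0.95)
print(intrahost)
print(strict)
print(fixed)
```

Two choices change the totals: whether the AF bounds are inclusive (>=) or strict (>), and whether a variant is counted per position or per position plus change. The first would explain a gap between two filters only if some rows sit exactly on a bound, and the second is why the number of variants can exceed the number of sites.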