get_genome_build: Can't find SNP, CHR, BP in PGC VCFs #91
Comments
Seems to be an issue reading the header of this VCF (probably to do with the format?) with VariantAnnotation:
Other than that, it will run once I add
for VCFs if this fails. Let me know if there are any downstream issues from not using the VariantAnnotation approach. I'll add these fixes to the solution I have for indels, which will be pushed to the master branch (not current), so use the GitHub version for this fix (until late April, when it's released).
Excellent, I think this sounds like a solid solution to me. Thanks, Alan! I'll keep you posted about any downstream issues. Currently dealing with one from my more manual solution above, with some weird errors about the rows being out of order (when they don't seem to be) during tabix indexing:
Added to master branch.
1. Bug description
There seem to be some issues when trying to munge the Psychiatric Genomics Consortium (PGC) sumstats format, which is a bit different from the OpenGWAS format.
First of all, the file names end in ".vcf.tsv.gz", which is confusing and might be tripping up our code that infers file type from the extension. I wonder whether this is due to a slight discrepancy between how read_sumstats and read_header infer the file type, because format_sumstats does manage to get partway through before hitting an error (it even correctly counts the number of rows!).
Here's part of the header from one of these files:
Console output
Error from the first reprex below.
Also including the full message output.
MungeSumstats_log_msg.txt
Expected behaviour
format_sumstats is able to run the full pipeline and produce a munged tsv.
2. Reproducible example
Code
This produces an error
But this works ok
I tried reformatting the file manually so it was undoubtedly a regular tsv file.
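The snippet used for that manual reformatting is not preserved here. As a rough illustration only, a minimal base-R sketch of one way to do it could look like the following. It assumes the file is a tab-separated table preceded by VCF-style "##" meta lines and a "#"-prefixed column header; the helper name vcf_like_to_tsv and the example columns are hypothetical and are not part of MungeSumstats or the PGC files.

```r
# Hypothetical helper (not part of MungeSumstats): drop VCF-style "##" meta
# lines and write what remains as a plain tab-separated file.
vcf_like_to_tsv <- function(path_in, path_out) {
  # gzfile() reads both gzip-compressed and uncompressed text files
  con <- gzfile(path_in, "rt")
  on.exit(close(con))
  lines <- readLines(con)
  # Keep everything except the "##" meta-information lines
  body <- lines[!startsWith(lines, "##")]
  # Strip a leading "#" from the column header line, if there is one
  body[1] <- sub("^#", "", body[1])
  writeLines(body, path_out)
  invisible(path_out)
}

# Made-up two-line example standing in for a real file
tmp_in <- tempfile(fileext = ".vcf.tsv")
tmp_out <- tempfile(fileext = ".tsv")
writeLines(c(
  "##fileformat=VCFv4.2",
  "##INFO=<ID=EXAMPLE,Number=1,Type=Float,Description=\"made up\">",
  "#CHROM\tPOS\tID\tA1\tA2\tBETA\tSE\tPVAL",
  "1\t1000\trs1\tA\tG\t0.01\t0.02\t0.5"
), tmp_in)

vcf_like_to_tsv(tmp_in, tmp_out)
dat <- read.delim(tmp_out, stringsAsFactors = FALSE)
```

The resulting file has an unambiguous ".tsv" extension and no VCF header, so it should not depend on how the file type is inferred.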
Data
The data can be downloaded here.
Session info