seqminer is missing some SNPs in range compared to PLINK #17
Since the issue seems to occur from a file size of approx. 2 GB, I'm wondering if there is some 32-bit component that limits memory. Does readVCFToListByRange try to read the entire file into memory before filtering by range?
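To show why a threshold near 2 GB points toward a 32-bit limit, here is a minimal sketch of what happens to a byte offset if it is stored in a signed 32-bit integer. This is only an illustration of the hypothesis; I have not confirmed that seqminer stores offsets this way, and `as_int32` is a hypothetical helper, not part of any library.

```python
# Hypothetical illustration: a byte offset stored in a signed 32-bit integer.
# Nothing here is taken from the seqminer source; it only shows the arithmetic.

INT32_MAX = 2**31 - 1  # 2147483647 bytes, just under 2 GiB


def as_int32(offset: int) -> int:
    """Return the value `offset` would have after wrapping to a signed 32-bit integer."""
    return ((offset + 2**31) % 2**32) - 2**31


last_safe = INT32_MAX        # largest offset that survives unchanged
first_bad = INT32_MAX + 1    # first offset past the 2 GiB boundary

print(as_int32(last_safe))   # 2147483647
print(as_int32(first_bad))   # -2147483648 (wraps to a negative offset)
```

If something like this happens inside the reader, any variant stored beyond the 2 GiB mark in the file could be skipped or mislocated, which would match SNPs silently going missing only for large files.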
I compared the PLINK2 glm result for a given range with the dosage matrix read by seqminer for the same range. The seqminer result is missing some SNPs that appear in the PLINK result.
PLINK
Prune the .bgen file with PLINK and do glm.
Inspect PLINK glm result and see which positions are used.
seqminer
Load the .bgen file as a dosage matrix with seqminer and inspect the data size. 5766 is way smaller than 11984. So about half the SNPs are missed by seqminer.
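To narrow this down, it may help to check which positions are missing and whether they all lie past a certain point in the range. A minimal sketch of that comparison follows; the position lists are made-up toy values, not my actual data, and `missing_positions` is a hypothetical helper.

```python
# Sketch: find positions reported by PLINK but absent from the seqminer matrix.
# The lists below are made-up toy values, NOT the real positions from this issue.

def missing_positions(plink_positions, seqminer_positions):
    """Return sorted positions present in the PLINK result but not in seqminer's."""
    return sorted(set(plink_positions) - set(seqminer_positions))


plink_pos = [100, 200, 300, 400, 500, 600]   # toy stand-in for the PLINK glm positions
seqminer_pos = [100, 200, 300]               # toy stand-in for the seqminer positions

missing = missing_positions(plink_pos, seqminer_pos)
print(missing)                               # [400, 500, 600]

# True if every missing SNP lies after the last SNP seqminer managed to read,
# which would be consistent with reading stopping partway through the file.
print(min(missing) > max(seqminer_pos))      # True
```

If the missing SNPs are instead scattered throughout the range, a truncated read is less likely and the cause is probably somewhere else.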