I'm trying to use `sumstats.py lift` to lift hg19 SNPs in 5 GWAS sumstats files over to hg38. I have already run `sumstats.py csv` to standardise these files, which look like this:
SNP CHR BP PVAL A1 A2 N Z OR BETA SE
rs11579922 1 1036860 .1662 A C 50914 -1.3868004 .97278 -.02759733 .0199
rs11579015 1 1036959 .1067 T C 49514 -1.6133769 .96435 -.03630098 .0225
rs11260592 1 1037303 .1716 T C 50914 -1.3683987 .97287 -.02750481 .0201
rs11260593 1 1037313 .169 A G 50914 -1.3730014 .97278 -.02759733 .0201
rs66622470 1 1038088 .1659 C G 50914 1.3867192 1.02798 .02759571 .0199
However, I'm getting the following error for 2 of the 5 files so far (the others are still running):
Traceback (most recent call last):
File "python_convert/sumstats.py", line 2212, in <module>
args.func(args, log)
File "python_convert/sumstats.py", line 1375, in make_lift
df.loc[index, cols.CHR] = int(lifted[0][0][3:])
ValueError: invalid literal for int() with base 10: '2_KI270773v1_alt'
Analysis finished at Tue May 18 18:02:06 2021
Total time elapsed: 2.0h:7.0m:48.60999999999967s
This appears to relate to entries in the `hg19ToHg38.over.chain.gz` file, as there are no alt_chrs in the original GWAS sumstats files. There are 114 alt_chrs in total.
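To illustrate the failure: the expression in the traceback strips the `chr` prefix and casts the remainder to `int`, which cannot work for an alt contig name. Below is a minimal sketch that reproduces the error on the value from the traceback and shows one possible guard. `parse_autosome` is a hypothetical helper for illustration only and is not part of `sumstats.py`.

```python
import re

# Hypothetical helper, NOT part of sumstats.py: accept only names like "chr2".
_AUTOSOME = re.compile(r"^chr(\d+)$")


def parse_autosome(chrom_name):
    """Return the chromosome number for names like 'chr2', else None."""
    match = _AUTOSOME.match(chrom_name)
    return int(match.group(1)) if match else None


# The same slicing-and-casting as in the traceback, on the same value.
name = "chr2_KI270773v1_alt"
try:
    int(name[3:])
except ValueError as err:
    print("fails as in the traceback:", err)

print(parse_autosome("chr2"))  # numeric chromosome is parsed
print(parse_autosome(name))    # alt contig is rejected instead of raising
```

With a guard like this, a SNP that lifts to an alt contig could be treated as unlifted rather than crashing the run. Note that this sketch also rejects `chrX` and `chrY`, since the cast to `int` in the traceback would not handle those either.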
I'm wondering if there is a way around this, i.e. can I add a parameter to ignore or otherwise deal with these loci? What exactly does `--keep-bad-snps` do? I'm reluctant to use it without knowing fully what it does.
Interestingly, this error does not arise when I use the standard liftover tool, but using that means I need to generate bed files first. sumstats.py would be the neatest option for me.
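For reference, the BED conversion needed for the standard liftover tool could look roughly like the sketch below. It assumes the standardised file is whitespace-delimited with `SNP`, `CHR` and `BP` columns as in the sample above, and that `BP` is a 1-based position (BED uses a 0-based start and an exclusive end). `sumstats_to_bed` is a hypothetical name used for illustration.

```python
def sumstats_to_bed(lines):
    """Yield BED lines (chrom, start, end, name) from sumstats text lines.

    Assumes a header row containing SNP, CHR and BP, and 1-based BP values.
    """
    rows = iter(lines)
    header = next(rows).split()
    snp_i, chr_i, bp_i = (header.index(col) for col in ("SNP", "CHR", "BP"))
    for line in rows:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        bp = int(fields[bp_i])
        # BED: 0-based start, end-exclusive, so a single base is [bp - 1, bp).
        yield f"chr{fields[chr_i]}\t{bp - 1}\t{bp}\t{fields[snp_i]}"


sample = [
    "SNP CHR BP PVAL A1 A2 N Z OR BETA SE",
    "rs11579922 1 1036860 .1662 A C 50914 -1.3868004 .97278 -.02759733 .0199",
]
for bed_line in sumstats_to_bed(sample):
    print(bed_line)
```

Keeping the rsID in the fourth BED column makes it possible to join the lifted positions back onto the original sumstats rows afterwards.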
I could also remove these entries from the chain file, but I thought I'd ask if there is a way to deal with them before proceeding.
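Removing those entries could be sketched as follows. This assumes the UCSC chain format, where each chain starts with a header line of the form `chain score tName tSize tStrand tStart tEnd qName qSize qStrand qStart qEnd id` followed by its alignment lines. The function names are hypothetical, and I have not verified how `sumstats.py` treats SNPs that no longer map after filtering.

```python
import gzip
import re

# Primary chromosomes only; anything else (e.g. *_alt, *_random) is dropped.
_PRIMARY = re.compile(r"^chr(\d+|X|Y|M)$")


def drop_alt_chains(lines):
    """Yield only the chains whose target and query are primary chromosomes."""
    keep = True
    for line in lines:
        if line.startswith("chain"):
            fields = line.split()
            # fields[2] is tName, fields[7] is qName in the chain header.
            keep = bool(_PRIMARY.match(fields[2]) and _PRIMARY.match(fields[7]))
        if keep:
            yield line


def filter_chain_file(src_path, dst_path):
    """Write a filtered copy of a gzipped chain file (paths are placeholders)."""
    with gzip.open(src_path, "rt") as src, gzip.open(dst_path, "wt") as dst:
        dst.writelines(drop_alt_chains(src))
```

The original chain file is left untouched, so results from the filtered and unfiltered files can still be compared.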
Many Thanks.