Unaccounted Hybridisation states #8
Comments
Thank you for bringing this problem to our attention. This usually happens because OpenBabel returns an SP atom when it infers bonds, which we wouldn't expect for amino acids. You could try minimizing the structures in your dataset or checking for missing atoms.
That is interesting. From what I see, it happens multiple times in my structure dataset, which comprises 2K human protein structures from PDBe. Perhaps the vector representation of hybridisation could be extended from a 2-element to a 3-element vector? But I guess it would then be a different model, with different features. I have noticed another
Are these modified amino acids? Our policy is that we would rather have GrASP crash when we see something non-standard or low-resolution (when OB fails bond perception), so that we aren't silently making predictions on features it has never seen.
Both of these examples, and a few other atoms that crash due to unaccounted hybridisation states, are all Perhaps a step to deal with altlocs might solve this.
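As a rough illustration of what such an altloc step could start from, the sketch below scans PDB-format lines and lists atoms that carry an alternate-location flag (column 17 of ATOM/HETATM records in the PDB format). The function name is hypothetical and this is not part of GrASP or of any script mentioned in this thread; it only reports the flagged atoms and does not choose between conformers.

```python
def find_altloc_atoms(pdb_lines):
    """List atoms whose ATOM/HETATM record has a non-blank altLoc field.

    Returns tuples of (serial, atom name, altloc, residue name, chain,
    residue number), all as strings read from the fixed PDB columns.
    """
    flagged = []
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        altloc = line[16:17]  # column 17 (1-indexed) holds the altLoc flag
        if altloc.strip():
            flagged.append((
                line[6:11].strip(),   # atom serial number
                line[12:16].strip(),  # atom name
                altloc,
                line[17:20].strip(),  # residue name
                line[21:22],          # chain identifier
                line[22:26].strip(),  # residue sequence number
            ))
    return flagged
```

Running it over `open(path)` for each structure before featurisation would show which residues have alternate conformers that bond perception may be reading together.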
Okay, that makes sense. OB is probably parsing both
I will add a check/warning that detects
So, there is this script: https://github.com/harryjubb/pdbtools/blob/master/clean_pdb.py from Harry Jubb's group. It was written to pre-process structures before running an older version of Arpeggio (https://github.com/harryjubb/arpeggio). It takes PDB format as input and deals with altLocs, chain breaks, etc. I will try running it and then run
I recommend printing something when there are
Many of the proteins in my dataset are crashing in featurize_protein.py with a KeyError. The hybridisation state of the atom is SP, but only SP2 and SP3 are accounted for in the dictionary. How could this be fixed? Thanks!