fix(datasets): check_missing_confs failed + added missing platinum predictions #147

aflow models will only have one PDB file each, and that file should be tied to the pid. This is a problem for some proteins: for non-pdbbind proteins we grab those files with `f.startswith(pid)`, which matches the wrong file when one pid is a prefix of another (e.g. PIK3CA.pdb and PIK3CA(Q546K).pdb).
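The prefix collision described above can be reproduced in a few lines. This is a minimal sketch, not the code from this commit: the helper names `find_by_prefix` and `find_by_exact_name` and the `.pdb` suffix handling are assumptions made for illustration, and the actual fix in `src/data_prep/datasets` may look different.

```python
def find_by_prefix(files, pid):
    """Prefix matching as described in the commit message.

    Returns every file whose name starts with `pid`, so a pid that is a
    prefix of another pid picks up the other protein's file as well.
    """
    return [f for f in files if f.startswith(pid)]


def find_by_exact_name(files, pid, ext=".pdb"):
    """Hypothetical alternative: require the whole file name to be `pid + ext`."""
    return [f for f in files if f == f"{pid}{ext}"]


files = ["PIK3CA.pdb", "PIK3CA(Q546K).pdb"]

# Prefix matching is ambiguous for the wild-type pid.
ambiguous = find_by_prefix(files, "PIK3CA")

# Exact matching ties each pid to a single file.
wild_type = find_by_exact_name(files, "PIK3CA")
mutant = find_by_exact_name(files, "PIK3CA(Q546K)")
```

With the two example files, prefix matching on `PIK3CA` returns both entries, while exact matching returns one file per pid, which is what a one-PDB-per-pid model layout needs.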

jyaacoub committed Jan 14, 2025
1 parent d3997b7 commit 460fcfd
Showing 7 changed files with 9,568 additions and 9,202 deletions.
20 changes: 20 additions & 0 deletions playground.py
@@ -1,3 +1,23 @@
#%%
from src.data_prep.datasets import PlatinumDataset
from src import cfg
import logging
logging.getLogger().setLevel(logging.DEBUG)

#%%
dataset = PlatinumDataset(
    save_root=f'/home/jean/projects/data/PlatinumDataset/',
    data_root=f'/home/jean/projects/data/PlatinumDataset/raw',
    af_conf_dir=f'/home/jean/projects/data/PlatinumDataset/raw/alphaflow_io/out_pdb_MD-distilled/',
    aln_dir=None,
    cmap_threshold=8.0,

    feature_opt=cfg.PRO_FEAT_OPT.nomsa,
    edge_opt=cfg.PRO_EDGE_OPT.aflow,
    ligand_feature=cfg.LIG_FEAT_OPT.gvp,
    ligand_edge=cfg.LIG_EDGE_OPT.binary,
    subset=None)

#%%
import logging
from matplotlib import pyplot as plt