Commit

lazy fix for parsing gffs
jonas-fuchs committed Nov 13, 2024
1 parent 19054db commit 500c1a5
Showing 1 changed file with 7 additions and 1 deletion.
8 changes: 7 additions & 1 deletion virheat/scripts/data_prep.py
@@ -294,7 +294,13 @@ def parse_gff3(file, reference):
 # ignore comments and last line
 if not line.startswith(reference):
     continue
-gff_values = line.split("\t")
+gff_values = line.strip().split("\t")
+# sanity check that the line has a unique ID for the dict key
+# this is a lazy fix as it will exclude e.g. exons without ID and
+# only a parent -> fixing this might require more complex parsing
+# and data structure
+if not gff_values[8].startswith("ID="):
+    continue
 # create keys
 if gff_values[2] not in gff3_dict:
     gff3_dict[gff_values[2]] = {}
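
Note on the check above: `gff_values[8].startswith("ID=")` only accepts lines whose attribute column begins with `ID=`, and indexing `[8]` assumes every matching line has nine tab-separated columns. GFF3 does not require `ID` to be the first attribute, so a line such as `Name=abc;ID=cds1` would also be skipped. The sketch below shows one possible way to test for an `ID` anywhere in the attribute column. It is not the code from this commit, and the helper names `parse_attributes` and `has_feature_id` are hypothetical and do not exist in virheat.

```python
def parse_attributes(column):
    """Split a GFF3 attribute column ('ID=x;Name=y') into a dict."""
    attributes = {}
    for pair in column.strip().split(";"):
        # ignore empty or malformed fragments that carry no key=value pair
        if "=" in pair:
            key, value = pair.split("=", 1)
            attributes[key.strip()] = value.strip()
    return attributes


def has_feature_id(line):
    """Return True if a tab-separated GFF3 line carries an ID attribute."""
    gff_values = line.strip().split("\t")
    # a GFF3 feature line has nine columns; treat shorter lines as unusable
    if len(gff_values) < 9:
        return False
    return "ID" in parse_attributes(gff_values[8])


# made-up example lines (hypothetical data, not taken from a real GFF3 file)
gene = "ref1\tsrc\tgene\t1\t90\t.\t+\t.\tID=gene1;Name=abc\n"
exon = "ref1\tsrc\texon\t1\t40\t.\t+\t.\tParent=gene1\n"
late_id = "ref1\tsrc\tCDS\t1\t40\t.\t+\t0\tName=abc;ID=cds1\n"

for example in (gene, exon, late_id):
    print(example.split("\t")[2], has_feature_id(example))
```

As in the commit, features that carry only a `Parent` attribute are still excluded here. Keeping them would need a different data structure, as the code comment in the diff already points out.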