The format of the GFF3 output file from GALES is incorrect.
For reference here is the link to "Annotating genomes with gff3" from NCBI.
The "polypeptide" feature is not a valid feature type, so lines with that feature can be removed.
The "product_name=" attribute should be replaced with "product=".
The protein product name should be added to the CDS feature line. If the product name is not specified on the mRNA feature line, tbl2asn takes it from the CDS feature. If the CDS feature line has no product name either, the protein is automatically called a hypothetical protein. If the product name is assigned only to the mRNA feature line, it is added to the gbf file (another output of the GFF3-to-ASN conversion) as a Note.
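As a rough illustration of the first two changes, here is a minimal Python sketch that post-processes GFF3 lines. The function name `to_ncbi_gff3` is hypothetical and is not part of GALES. It assumes standard 9-column, tab-separated feature lines with `;`-separated attributes. It does not copy product names onto CDS lines, since that depends on which feature GALES writes them on, so dropping "polypeptide" lines could discard product names that appear only there.

```python
def to_ncbi_gff3(lines):
    """Yield GFF3 lines with 'polypeptide' features dropped and
    'product_name=' attributes renamed to 'product='.

    Hypothetical helper for illustration; not part of GALES.
    """
    for line in lines:
        line = line.rstrip("\n")
        # Pass blank lines, comments, and directives through untouched.
        if not line or line.startswith("#"):
            yield line
            continue
        cols = line.split("\t")
        # Anything that is not a 9-column feature line (e.g. FASTA) is kept as-is.
        if len(cols) != 9:
            yield line
            continue
        # Column 3 holds the feature type; skip polypeptide features.
        if cols[2] == "polypeptide":
            continue
        # Rename the attribute key while leaving its value unchanged.
        attrs = [
            "product=" + a[len("product_name="):]
            if a.startswith("product_name=") else a
            for a in cols[8].split(";")
        ]
        cols[8] = ";".join(attrs)
        yield "\t".join(cols)
```

It could be applied to a file with something like `for out in to_ncbi_gff3(open(path)): print(out)`, where `path` is the GALES GFF3 output.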
Thanks,
Suvvi
The problem here is that this is the GFF3 output we started using before NCBI published support for their specific flavor of GFF3. Features like 'polypeptide' are valid in GFF3, as is any Sequence Ontology term, but NCBI just doesn't support them.
But when a group as large as NCBI publishes an internal standard we have little choice but to support it directly. Now, the question is whether we should change the default output of GALES to match this or add an option to specify NCBI-specific GFF3.