Incorrect gff3 output file format #24

Open
nsuvarnaiari opened this issue Sep 16, 2019 · 2 comments

Comments

@nsuvarnaiari

Hi Josh,

The format of the GFF3 output file from GALES is incorrect.
For reference, here is the link to "Annotating genomes with gff3" from NCBI.

  1. "polypeptide" is not a valid feature type, so any line with that feature can be removed.
  2. Replace the "product_name=" attribute with "product=".
  3. The protein product name should be added to the CDS feature line. If no product name is specified on the mRNA feature line, tbl2asn takes the product name from the CDS feature. But if no product name is added to the CDS feature line, the protein is automatically called a hypothetical protein. If the product name is assigned only to the mRNA feature line, it is added to the gbf file (another output of the GFF3-to-ASN conversion) as a Note.
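
The three changes above could be sketched as a small post-processing step. This is only an illustration, not GALES code: the function names (`parse_attrs`, `format_attrs`, `to_ncbi_gff3`) are hypothetical, and it assumes the product name sits either on the mRNA line or on a polypeptide line whose Parent is the mRNA, and that each CDS has that mRNA as its Parent.

```python
def parse_attrs(field):
    """Parse a GFF3 column-9 string into a dict that keeps attribute order."""
    attrs = {}
    for part in field.split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def format_attrs(attrs):
    """Serialize an attribute dict back into a GFF3 column-9 string."""
    return ";".join(f"{key}={value}" for key, value in attrs.items())


def to_ncbi_gff3(lines):
    """Return GFF3 lines adjusted per the three points above (hypothetical helper)."""
    rows = []      # (columns, attrs) for feature rows, (None, raw text) otherwise
    products = {}  # mRNA ID -> product name

    # Pass 1: rename product_name to product and remember names per mRNA.
    for raw in lines:
        raw = raw.rstrip("\n")
        cols = raw.split("\t")
        if raw.startswith("#") or len(cols) != 9:
            rows.append((None, raw))  # directives, comments, blank lines
            continue
        attrs = {("product" if key == "product_name" else key): value
                 for key, value in parse_attrs(cols[8]).items()}
        product = attrs.get("product")
        if product and cols[2] == "mRNA" and "ID" in attrs:
            products.setdefault(attrs["ID"], product)
        elif product and cols[2] == "polypeptide":
            # Assumption: the polypeptide's Parent is the mRNA it belongs to.
            for parent in attrs.get("Parent", "").split(","):
                if parent:
                    products.setdefault(parent, product)
        rows.append((cols, attrs))

    # Pass 2: drop polypeptide rows and copy the product name onto CDS rows.
    out = []
    for cols, payload in rows:
        if cols is None:
            out.append(payload)
            continue
        if cols[2] == "polypeptide":
            continue
        if cols[2] == "CDS" and "product" not in payload:
            for parent in payload.get("Parent", "").split(","):
                if parent in products:
                    payload["product"] = products[parent]
                    break
        out.append("\t".join(cols[:8] + [format_attrs(payload)]))
    return out
```

A CDS whose parent mRNA has no recorded product name is left without a `product` attribute, so it would still come out as a hypothetical protein, as described in point 3.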

Thanks,
Suvvi

@jorvis
Owner

jorvis commented Sep 16, 2019

The problem here is that this is the GFF3 output we started using before NCBI published support for their specific flavor of GFF3. Features like 'polypeptide' are valid in GFF3, as is any Sequence Ontology term, but NCBI just doesn't support them.

But when a group as large as NCBI publishes an internal standard, we have little choice but to support it directly. Now, the question is whether we should change the default output of GALES to match this or add an option to specify NCBI-specific GFF3.

@nsuvarnaiari
Author

I think adding an option to output NCBI-specific GFF3 is a good idea.
