Do not use full header in the output #33

Open
apcamargo opened this issue Jul 1, 2024 · 1 comment

Comments

@apcamargo

Description:

skani currently outputs the full sequence header in its results. When the header is long and contains spaces, the output becomes difficult to read and parse. Displaying only the sequence identifier (the portion of the header before the first whitespace character) would align skani's output with the conventions of most other tools.
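To make the distinction concrete, here is a minimal Python sketch of how a sequence identifier can be derived from a full FASTA header. It is not skani's code; the function name `sequence_id` and the example header are made up for illustration.

```python
def sequence_id(header: str) -> str:
    """Return the part of a FASTA header before the first whitespace.

    Hypothetical helper for illustration; not part of skani.
    """
    # Drop the leading '>' if the raw header line was passed in.
    if header.startswith(">"):
        header = header[1:]
    # split(None, 1) splits on any run of whitespace (spaces or tabs).
    tokens = header.strip().split(None, 1)
    return tokens[0] if tokens else ""


# Made-up header: identifier followed by a free-text description.
example = ">contig_0001 Example organism strain X chromosome, complete genome"
print(sequence_id(example))
```

With a header like the one above, only `contig_0001` would be shown, while the organism description would be dropped.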

Proposed Solution:

Introduce an optional parameter to enable this behavior, allowing users to toggle between displaying the full header and only the sequence identifier.

@bluenote-1577
Owner

Hi Antonio,

Thanks for raising this. I agree that an option to output only the first token of the header is a good idea.

I personally like having the whole header for readability: for many annotated genomes, the organism name is informative and does not appear in the sequence identifier, which is why the full header is the default. This helps my sleuthing a lot. But it can get unruly and long. And given that there should be no tabs anywhere in the header, I think parsing is mostly OK.
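Until such an option exists, the trimming can be done after the fact. The sketch below assumes the results are tab-separated, as the remark about tabs suggests, and that the caller knows which columns hold sequence headers. The function name `trim_header_columns`, the column positions, and the sample line are all hypothetical and do not reflect skani's documented output layout.

```python
from typing import Sequence


def trim_header_columns(line: str, columns: Sequence[int]) -> str:
    """Reduce the given tab-separated columns to their first token.

    Hypothetical post-processing helper; column indices are 0-based
    and must be supplied by the caller.
    """
    fields = line.rstrip("\n").split("\t")
    for i in columns:
        if 0 <= i < len(fields):
            tokens = fields[i].split(None, 1)
            # Leave empty fields untouched.
            if tokens:
                fields[i] = tokens[0]
    return "\t".join(fields)


# Made-up result line with full headers in the last two columns.
line = "ref.fa\tquery.fa\t99.1\tcontig_1 Example organism\tcontig_2 Another organism"
print(trim_header_columns(line, columns=(3, 4)))
```

Because headers contain no tabs, splitting on tabs keeps each full header in a single field, so the trimming only touches the columns that were asked for.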

I'll mull over how to include this option. Maybe a --seq-id-only option.

Thanks!
