Do not use full header in the output #33

apcamargo · 2024-07-01T21:12:48Z

Description:

skani currently outputs the full sequence header in its results. When the header is long and contains spaces, the output becomes difficult to read and parse. Displaying only the sequence identifier (the portion before the first whitespace character) would align skani's output with the conventions of most other tools.

Proposed Solution:

Introduce an optional parameter to enable this behavior, allowing users to toggle between displaying the full header and only the sequence identifier.
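
To make the request concrete, here is a minimal Python sketch of the behavior being asked for. It is not skani's code (skani is written in Rust), and the names `sequence_id`, `format_name`, and `seq_id_only` are hypothetical, chosen only for illustration.

```python
def sequence_id(header: str) -> str:
    """Return the part of a FASTA header before the first whitespace."""
    # Drop the leading '>' in case the raw header line is passed in.
    header = header.lstrip(">")
    # split() with no separator splits on any run of whitespace
    # (spaces or tabs) and ignores leading whitespace.
    parts = header.split(maxsplit=1)
    return parts[0] if parts else ""


def format_name(header: str, seq_id_only: bool = False) -> str:
    """Hypothetical toggle: full header by default, identifier on request."""
    if seq_id_only:
        return sequence_id(header)
    return header.lstrip(">").strip()


if __name__ == "__main__":
    example = ">NC_000913.3 Escherichia coli str. K-12 substr. MG1655, complete genome"
    print(format_name(example))                    # full header
    print(format_name(example, seq_id_only=True))  # identifier only
```

Keeping the full header as the default would leave existing output unchanged for anyone who does not pass the option.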


bluenote-1577 · 2024-07-02T17:17:38Z

Hi Antonio,

Thanks for raising this. I agree that an option to output only the first token of the header is a good idea.

I personally like having the whole header for readability: for many annotated genomes, the organism name is informative and is not part of the sequence identifier, hence the default behavior. This helps my sleuthing a lot, but the output can get long and unruly. Given that the header should not contain any tabs, I think parsing is mostly OK.

I'll mull over how to include this option. Maybe a --seq-id-only flag.

Thanks!
