Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(web): add mutation badges #1446

Closed
wants to merge 4 commits into from
Closed

Conversation

ivan-aksamentov
Copy link

@ivan-aksamentov ivan-aksamentov commented Mar 22, 2024

preview URL: https://feat-mutation-badges.loculus.org/seq/LOC_000001Y.1

Summary

Add mutation badge React components for nuc and aa mutations.

This is a simplified version of what's there in Nextclade or Covariants. I had to port it from styled-components to tailwind.

Badge segments are colored dynamically, depending on a character or gene name. I use style attr to handle that.

Additionally, the text is being rendered of light or dark color depending on the contrast ratio relative to background color. I had to add polished NPM package to calculate contrast ratio.

Notes:

  • There is no fixed list of genes/CDSes avaialble, so it's not immediately possible to render stable (recognizable) gene/CDS colors, which may increase readablity of aa badges significantly. In Nextclade the list is provided from "backend" and then gene index in the list is used to select color from a fixed list of colors (with index wrap on overflow), and in Covariants the list is static, as it only deals with SC2. In both cases, color palette is taken from nextstrain.org "Nucleotide diversity of genome" panel, as something we are the most accustomed to in Nextstrain organization. In Loclulus I decided to use color hash algorithm to generate colors from gene name strings for now. If backend could provide this data, then UX could be further improved.

  • It seems that DataTable.astro is only able to render values that are strings or numbers. I don't immediately see how the badges can be incorporated there without hacks. I need a bit more details of what's planned for this component. So for now I just temporarily added some hardcoded badges above the table as a showcase.

  • From a quick look, I found that backend currently provides mutations in string format, e.g. A123G. This means that these strings need to be parsed in order to extract gene, position, and ref and query characters. In Covariants this is done using regex like that. It's a bit of a hairy adventure though. So if backend could precompute and supply structured data that'd be great.

  • I use exact values in tailwind classes, because these badge components don't fit well into the tailwind paradigm. These components should probably be extracted into some resusable NPM package with some configurable styling. And we'd share it across projects. Though the gene colors is an issue in this case.

Screenshot

01 png

With Dark Reader:

02 png

PR Checklist

  • All necessary documentation has been adapted.
  • The implemented feature is covered by an appropriate test.

Add mutation badge React components for nuc and aa mutations.

This is a simplified version of what's there in [Nextclade](https://github.com/nextstrain/nextclade/blob/c9d43865f7df492a3c2a6b2e1a9287c670155fd9/packages/nextclade-web/src/components/Common/MutationBadge.tsx) or [Covariants](https://github.com/hodcroftlab/covariants/blob/17884e1c0bf55602246f77aa31a70ee6a1c8d659/web/src/components/Common/MutationBadge.tsx). I had to port it from styled-components to tailwind.

Badge segments are colored dynamically, depending on a character or gene name. I use `style` attr to handle that.

Additionally, the text is being rendered of light or dark color depending on the contrast ratio relative to background color. I had to add `polished` NPM package to calculate contrast ratio.

Notes:

- There is no fixed list of genes/CDSes avaialble, so it's not immediately possible to render stable (recognizable) gene/CDS colors, which may increase readablity of aa badges significantly. In Nextclade the list is provided from "backend", and in Covariants the list is static, as it only deals with SC2. In Loclulus I decided to use color hash algorithm to generate colors from gene name strings for now. If backend could provide this data, then UX could be further improved.

- It seems that `DataTable.astro` is only able to render values that are strings or numbers. I don't immediately see how the badges can be incorporated there without hacks. I need a bit more details of what's planned for this component. So for now I just temporarily added some hardcoded badges above the table as a showcase.

- From a quick look, I found that backend currently provides mutations in string format, e.g. `A123G`. This means that these strings need to be parsed in order to extract gene, position, and ref and query characters. In Covariants this is done using regex [like that](https://github.com/hodcroftlab/covariants/blob/17884e1c0bf55602246f77aa31a70ee6a1c8d659/web/src/components/Common/parseAminoacidMutation.ts). It's a bit of a hairy adventure though. So if backend could precompute and supply structured data that'd be great.

- I use exact values in tailwind classes, because these badge components don't fit well into the tailwind paradigm. These components should probably be extracted into some resusable NPM package with some configurable styling. And we'd share it across projects. Though the gene colors is an issue in this case.
@theosanderson theosanderson added the preview Triggers a deployment to argocd label Mar 22, 2024
@corneliusroemer
Copy link
Contributor

The badges look nice! Here in higher resolution:
image

  • Using hashes for gene colors is OK for now, I think that's totally fine.
  • I think you're the first to try the website in dark mode, there might be all sorts of issues elsewhere, or we're lucky and it works out of the box
  • What do you see as the main benefit of parsing in the backend rather than in astro/js code?
  • Making an npm package sounds nice, that way the badges could also be shared with Genspectrum (in case @chaoran-chen is interested). Not sure how much work this would be for you. Could the gene colors be set in some config file? With the option to use your hash color as default?
  • It's not a problem to use exact values in tailwind, is it? How do you feel it doesn't fit into the paradigm?

@theosanderson
Copy link
Member

theosanderson commented Mar 22, 2024

  • Cool! Thanks!
  • I had thought we didn't have a dark mode! What did you do to get it?
  • This would fit nicely into our new paradigm for customDisplay fields: Add display name to metadata fields #1442
  • I guess for some species where every sequence has a ton of mutations relative to ref we may need to truncate or avoid mutation displays (true whether or not we have the badges)

@theosanderson
Copy link
Member

Re: tailwind. It's totally fine to mix tailwind classes and style = {{background:'xxx'}} as here. I don't think we need to be dogmatic about just using tailwind, I've already mixed in styles somewhere.

@corneliusroemer
Copy link
Contributor

corneliusroemer commented Mar 22, 2024

It should be possible to configure a data type so that we can render mutation badges, similar to how we show the "DataUseTermsHistoryModal" which is a react component

{customDisplay !== undefined && customDisplay.type === 'dataUseTerms' && (
<>
{value}{' '}
<DataUseTermsHistoryModal
dataUseTermsHistory={dataUseTermsHistory}
client:only='react'
/>
</>
)}
</div>

If you rebase on main, you'll get this version.

It seems that DataTable.astro is only able to render values that are strings or numbers.

We'd set a customDisplay.type === nucleotideMutations, that type can be passed in: #1446

tableData.map(({ label, value, customDisplay }) => (
<tr>

@theosanderson I think Ivan uses a browser extension, maybe this one even:
image

I now also have darkmode loculus 😄
image

@ivan-aksamentov ivan-aksamentov marked this pull request as ready for review March 22, 2024 19:27
@ivan-aksamentov
Copy link
Author

What do you see as the main benefit of parsing in the backend rather than in astro/js code?

The way I see it, conceptually, client has nothing to do with digesting mutations. This should be figured out not just in backend server, but very early during preprocessing stage.

  • If backend takes these from Nextclade JSON, then they are already in structured format.
  • Applying a few dew dozens of regexes to the immutable data on every render for every client opening this page every time is unnecessary. Regexes are slow, especially in these quantities.
  • This parsing is quite error prone - string is an unstructured, unconstrained format and can contain anything.
  • If a wrong string is passed or even is 1 nucleotide letter is wrong, the client would not know what to do with it (render an error? not render anything? render string as is?).

Disadvantages:

  • Structured data will increase payload sent over network. Though, if a page is server-rendered this does not matter.

@ivan-aksamentov ivan-aksamentov marked this pull request as draft March 22, 2024 19:27
@ivan-aksamentov
Copy link
Author

@theosanderson

I had thought we didn't have a dark mode! What did you do to get it?

Yep, that's Dark Reader :)

@theosanderson
Copy link
Member

theosanderson commented Mar 22, 2024

Our backend in this context is LAPIS, so is quite constrained. If we do the regex-y stuff in the file that Cornelius points to, that will happen server-side within Astro (which is probably the only realistic option here in the short-term, other than client-side).

Edit: or I may be wrong, LAPIS may already support this - who knows

@corneliusroemer
Copy link
Contributor

corneliusroemer commented Mar 22, 2024

The string formatting happens here:

function substitutionsToCommaSeparatedString(mutationData: MutationProportionCount[]) {
return mutationData
.map((m) => m.mutation)
.filter((m) => !m.endsWith('-'))
.join(', ');
}

The input data comes from LAPIS: see OpenAPI docs

Unfortunately it seems that LAPIS treats mutations as strings, not as an object. @chaoran-chen

image

It doesn't look like LAPIS supports structured mutations, at least as far as I can tell from the docs.
image

@chaoran-chen
Copy link
Member

LAPIS indeed only returns mutations as strings.

@Taepper
Copy link
Contributor

Taepper commented Mar 23, 2024

GenSpectrum/LAPIS-SILO#357 would be ready. Depending on the urgency you want to clean this up with
(This does require propagation to LAPIS, docs, cov-spectrum, loculus)

@chaoran-chen
Copy link
Member

Thanks Alex! That was fast!

In LAPIS, I think that the default should remain as is but we can offer a parameter for requesting the mutations in the structured format. Let's discuss it next week.

@fengelniederhammer
Copy link
Contributor

  • So if backend could precompute and supply structured data that'd be great.

The PR to implement that in LAPIS is open: GenSpectrum/LAPIS#723

@anna-parker
Copy link
Contributor

As GenSpectrum/LAPIS#723 is now merged I will modify loculus to use the processed mutation information instead of the mutation strings - afterwards we should be able to move forward with this feature :-)

@corneliusroemer
Copy link
Contributor

@anna-parker is it possible that this PR is now no longer relevant as you've already implemented mutation badges?

@anna-parker
Copy link
Contributor

@corneliusroemer true - I will close this then! Thanks @ivan-aksamentov for your work on this - I used it a lot in #1666 which has now been merged.

@anna-parker anna-parker closed this May 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants