Allow very long search queries #967

Merged
merged 17 commits into from
Feb 12, 2024

Conversation

theosanderson
Member

@theosanderson theosanderson commented Feb 9, 2024

resolves #966

For large search queries, this keeps search state in POST submissions instead of in the URL. I have tested that it works for pagination, ordering, etc. The easiest way to test is to set the threshold for turning the query into a POST request much lower. I have a preview of that here: https://searchpostpreview.loculus.org/, where all searches should use this method. In general, though, normal searches will stay in the URL, as in the preview for this branch: https://searchpost.loculus.org/
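A minimal sketch of the threshold idea described above, not the code in this PR: build the search URL, and fall back to a POST submission only when the URL would be too long. The names `planSearchSubmission`, `SearchParams`, `Submission` and `MAX_URL_LENGTH` are hypothetical; the 2048 value is an assumption taken from the discussion further down.

```typescript
// Hypothetical types and names, for illustration only.
type SearchParams = Record<string, string>;

type Submission =
    | { method: 'GET'; url: string }
    | { method: 'POST'; action: string; fields: SearchParams };

// Assumed threshold; lowering it forces every search through POST,
// which is how the preview deployment is described as being configured.
const MAX_URL_LENGTH = 2048;

function planSearchSubmission(
    path: string,
    params: SearchParams,
    maxUrlLength: number = MAX_URL_LENGTH,
): Submission {
    // URLSearchParams percent-encodes keys and values.
    const query = new URLSearchParams(params).toString();
    const url = query.length > 0 ? `${path}?${query}` : path;

    // Note: this measures the path and query only, not the origin.
    if (url.length <= maxUrlLength) {
        return { method: 'GET', url };
    }
    // Too long for a URL: the caller would submit these fields in a form POST.
    return { method: 'POST', action: path, fields: params };
}
```

With this shape, a short query such as `{ country: 'Switzerland', page: '1' }` stays a shareable GET URL, while a field holding thousands of characters is routed to POST.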

I've also added a dev feature that extends "Load example data" to load any number of randomly generated entries, which is important for testing pagination and should also help with testing submission features.
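As a rough illustration of what generating an arbitrary number of random entries could look like (this is a sketch, not the implementation in this PR): the function name `makeExampleData`, the column names `submissionId` and `country`, the country list, and the 30-base sequence length are all assumptions made for the example.

```typescript
// Hypothetical shape: a metadata TSV plus a matching FASTA string.
type ExampleData = { metadataTsv: string; fasta: string };

function makeExampleData(count: number, random: () => number = Math.random): ExampleData {
    const bases = 'ACGT';
    const countries = ['Switzerland', 'Germany', 'France']; // assumed values
    const metadataLines: string[] = ['submissionId\tcountry']; // assumed columns
    const fastaLines: string[] = [];

    for (let i = 0; i < count; i++) {
        const id = `custom${i}`;
        const country = countries[Math.floor(random() * countries.length)];

        // Build a short random nucleotide sequence.
        let sequence = '';
        for (let j = 0; j < 30; j++) {
            sequence += bases.charAt(Math.floor(random() * bases.length));
        }

        metadataLines.push(`${id}\t${country}`);
        fastaLines.push(`>${id}`, sequence);
    }

    return { metadataTsv: metadataLines.join('\n'), fasta: fastaLines.join('\n') };
}
```

Passing a custom `random` function makes the output reproducible in tests; calling `makeExampleData(3000)` yields a header row plus 3000 metadata rows and 3000 FASTA records.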

PR Checklist

  • All necessary documentation has been adapted.
  • The implemented feature is covered by an appropriate test.

@theosanderson added the "preview" label (Triggers a deployment to argocd) Feb 9, 2024
@theosanderson changed the title from "wip: Allow very long search queries" to "Allow very long search queries" Feb 9, 2024
@theosanderson
Member Author

I'm uncertain how to test this. My attempt to do so by entering really long text in the field timed out, and I don't have experience with these testing libraries.

@theosanderson theosanderson added this to the MVP milestone Feb 9, 2024
@corneliusroemer
Contributor

Cool that we can handle large queries via POST, that's neat.

I'm not sure I understand the new dev feature, can you explain it differently?

@theosanderson
Member Author

theosanderson commented Feb 9, 2024

Sorry for the poor explanation, but probably the easiest way for you to understand it is to go to https://searchpost.loculus.org/dummy-organism/submit and type in 3000 in the little box next to Load example data and then click load example data :) (and proceed through the process).

@theosanderson theosanderson mentioned this pull request Feb 9, 2024
@corneliusroemer
Contributor

Oh now seeing it I know what you mean.

Rather than loading only 5 sequences with "load example data" one can configure how many to submit in a text field. Not sure why I didn't get the initial explanation.

@JonasKellerer
Contributor

I have a few questions about this:

  • what is the limiting factor in the url?
  • can one still share the link to a search?

@theosanderson
Member Author

theosanderson commented Feb 10, 2024

(Partly answering the earlier version of Jonas's Qs :) )

So I really like the current behaviour of the search form, which is that the query is kept in the URL, which means that links can be shared and bookmarked. For any of the current searches, the URL length limit (2048 characters officially, longer in many browsers, but worth sticking with 2048, which is the limit in Edge) will be entirely sufficient. I agree that users being able to edit the URL isn't super important, but I still think the way we do it here is elegant. Actually - I use the base64 approach on my own website:

https://taxonium.org/?backend=https%3A%2F%2Fapi.cov2tree.org&srch=%5B%7B%22key%22%3A%22aa1%22%2C%22type%22%3A%22name%22%2C%22method%22%3A%22text_match%22%2C%22text%22%3A%22asd%22%2C%22gene%22%3A%22S%22%2C%22position%22%3A484%2C%22new_residue%22%3A%22any%22%2C%22min_tips%22%3A0%7D%5D

and I think https://main.loculus.org/dummy-organism/search?country=Switzerland&division=Bern&page=1 is much more satisfying (and shorter to express the same thing)

So that's all great (from my POV).

The issue is that another important feature (#99) is to be able to paste, say, 1000 sequence IDs into a text area and retrieve them all. This is a well-used feature. There's no way we can keep those 1000 sequence IDs in the URL, of course. I think the solution implemented here is pretty nice because it's agnostic to what the search queries are: it just says that once queries get longer than URLs can support, we no longer put them in the URL. Quite a bit of the refactoring here actually unifies how the system works, i.e. clicking on pagination now uses the same routing function as everything else to get to the new page, rather than just patching the existing URL.
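To make the "same routing function" point concrete, here is a hedged sketch in which pagination rebuilds the full search state and passes it through one router, rather than editing the current URL. `SearchState`, `Route`, `routeToSearch` and `goToPage` are hypothetical names, and the 2048 default is an assumption; the actual code in `website/src/routes.ts` may look quite different.

```typescript
// Hypothetical state and route shapes, for illustration only.
type SearchState = { filters: Record<string, string>; page: number; orderBy?: string };
type Route = { method: 'GET' | 'POST'; target: string; body?: Record<string, string> };

// Single entry point: every navigation (filters, ordering, pagination) goes through here.
function routeToSearch(path: string, state: SearchState, maxUrlLength: number = 2048): Route {
    const fields: Record<string, string> = { ...state.filters, page: String(state.page) };
    if (state.orderBy !== undefined) {
        fields.orderBy = state.orderBy;
    }
    const url = `${path}?${new URLSearchParams(fields).toString()}`;
    return url.length <= maxUrlLength
        ? { method: 'GET', target: url }
        : { method: 'POST', target: path, body: fields };
}

// Pagination does not patch the existing URL; it re-routes the whole state with a new page.
function goToPage(path: string, state: SearchState, page: number): Route {
    return routeToSearch(path, { ...state, page });
}
```

Because pagination goes through the same function, a search that was too long for the URL on page 1 stays a POST on page 2 without any special casing.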

"can one still share the link to a search?"

No, the only way we could achieve that would be with a database to store search queries, but I don't think that's something we want to start doing. It's not important to be able to share a link to a search where we've entered 1000 sequence IDs.

@corneliusroemer
Contributor

Thank you for this great dev feature of allowing submission of e.g. 10,000 test sequences

image

Really useful to populate the test organism with 10k records and see how well our backend and frontend handle the load, plus use pagination etc.

@corneliusroemer
Contributor

Works, though to allow batch search we need query support for it. Right now one can only search for a single accession, AFAICT.

image

@theosanderson
Member Author

If I understood correctly, I think the feature exists in LAPIS but probably not in the website yet.

Contributor

@fengelniederhammer fengelniederhammer left a comment

LGTM - I currently also don't have a good idea how to test it. I don't know how to test that the hidden form sends the correct request, but maybe msw offers something?

We could also add an e2e test that covers it, checking the URL and that the search fields are prefilled correctly.

Review threads (outdated, resolved):

  • .github/scripts/setup_codespace_env.sh
  • website/src/components/DataUploadForm.tsx (two threads)
  • website/src/routes.ts
  • website/src/pages/[organism]/search/index.astro
Labels
preview Triggers a deployment to argocd
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

Support very large search queries
5 participants