Improve book library page query performance on title, titleIgnorePrefix, and addedAt sort orders. #3952
+407
−9
Brief summary
Significantly improves book library Sequelize page queries for the title, titleIgnorePrefix, and addedAt sort orders.
Which issue is fixed?
This partially fixes #2073 (resolving the book library load times, but not the podcast library load times)
In-depth Description
After digging further into the issues people reported in #2073 and doing additional performance analysis of the Sequelize query that fetches the library page, I made the following observations:
Observation 1 is by far the most serious problem, and it also causes query performance to degrade significantly as the offset grows.
When the main query sorts by title, it's evident in the query plan that the query engine cannot use the existing book.title index and needs to build a temporary tree for sorting.
Even when you remove the feeds table join from the query, the query plan still doesn't make use of the book.title index.
A significant boost in performance is only possible if the title column is put in the libraryItems table and an index on (libraryId, mediaType, title) is built. This way, filtering and sorting happen at the same time, and the index can be traversed very quickly to reach the required offset without needing to look at the tables themselves.
So with a query like this:
We get the following query plan:
Which is optimal! (or, to be more precise, optimal given the current architecture)
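As a rough illustration of the effect described above, here is a minimal Python sketch using the standard sqlite3 module. It assumes SQLite as the engine and uses a stripped-down, hypothetical libraryItems table (the real schema has more columns); the index name library_items_title_idx is made up for the example. It compares the query plan of a filtered, title-sorted page query before and after the (libraryId, mediaType, title) index exists.

```python
import sqlite3

# Simplified page query: filter by library and media type, sort by title.
QUERY = """
    SELECT id FROM libraryItems
    WHERE libraryId = ? AND mediaType = ?
    ORDER BY title
    LIMIT ? OFFSET ?
"""


def query_plan(conn, sql, params):
    """Return the 'detail' column of EXPLAIN QUERY PLAN for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]


def plans_before_and_after_index():
    """Collect the query plan without and with the composite index."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical, minimal schema; only the columns the query touches.
    conn.execute(
        "CREATE TABLE libraryItems ("
        " id TEXT PRIMARY KEY, libraryId TEXT, mediaType TEXT, title TEXT)"
    )
    params = ("lib1", "book", 35, 700)  # placeholder values for illustration

    before = query_plan(conn, QUERY, params)
    # Equality columns first, sort column last, so one index walk
    # can both filter and return rows already ordered by title.
    conn.execute(
        "CREATE INDEX library_items_title_idx"
        " ON libraryItems (libraryId, mediaType, title)"
    )
    after = query_plan(conn, QUERY, params)
    conn.close()
    return before, after


if __name__ == "__main__":
    before, after = plans_before_and_after_index()
    print("before:", before)
    print("after: ", after)
```

Without the index, the plan should include a temporary B-tree step for the ORDER BY; with it, the plan should be a single index search with no separate sort step. The exact wording of the plan lines varies between SQLite versions.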
Resolution
The following changes were made:
- title and titleIgnorePrefix columns were added to libraryItems
- […] books change.
- […] libraryItems:
  - (libraryId, mediaType, title)
  - (libraryId, mediaType, titleIgnorePrefix)
  - (libraryId, mediaType, createdAt)
- […] findAll is called instead of findAndCountAll
How have you tested this?
I tested by loading 72 consecutive pages of 35 books each, for each of the above sort orders, on an ABS docker container running on a Synology 920+ NAS (I wanted to test on hardware that was much weaker than my dev machine).
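The measurement procedure above could be scripted roughly as follows. This is a sketch, not the harness actually used for this PR: fetch_page is a hypothetical callable taking (limit, offset) that stands in for whatever issues the page request, and the helper names are made up for illustration.

```python
import statistics
import time


def time_pages(fetch_page, pages=72, page_size=35):
    """Time consecutive page fetches; return per-page durations in ms."""
    timings_ms = []
    for page in range(pages):
        start = time.perf_counter()
        fetch_page(page_size, page * page_size)  # (limit, offset)
        timings_ms.append((time.perf_counter() - start) * 1000.0)
    return timings_ms


def summarize(timings_ms):
    """Mean, median, and sample standard deviation of the timings."""
    return {
        "mean": statistics.mean(timings_ms),
        "median": statistics.median(timings_ms),
        "stdev": statistics.stdev(timings_ms),
    }


def percent_drop(before, after):
    """Relative reduction from `before` to `after`, as a percentage."""
    return (before - after) / before * 100.0
```

Running time_pages once per sort order, before and after the change, and passing the two means (or medians) to percent_drop gives the kind of comparison summarized in the results.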
Results
All measurements are in ms.
Summary
Overall, we see a 94-95% drop (!) in mean and median query time.
Standard deviation also drops drastically, from ~400 to ~10.
Note how the steady rise in query time as the requested offset grows, which is quite visible before, is not noticeable after.
Sorting by title - before:
Sorting by title - after:
Sorting by titleIgnorePrefix - before:
Sorting by titleIgnorePrefix - after:
Sorting by addedAt - before:
Sorting by addedAt - after: