Operations are slower with fts5 database #451
Comments
@whmountains, I thought we could track the performance issues here.
@whmountains it would be neat if we had these profiling calls built into the library. For now I'm adding them by hand, but as long as the profiled functions are called rarely, the overhead of the profiling calls (a time.time() difference here) should be very low.
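For reference, the time.time() difference approach mentioned above could look roughly like the following. This is only a minimal sketch: `timed` and its `sink` parameter are hypothetical names used for illustration and are not part of the plugin.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, sink=print):
    """Report the wall-clock time spent inside the enclosed block.

    `label` and `sink` are illustrative names; `sink` is any callable
    that accepts the formatted message (print by default).
    """
    start = time.time()
    try:
        yield
    finally:
        # Two time.time() calls per block, so the overhead is negligible
        # when the wrapped code is called rarely.
        elapsed = time.time() - start
        sink(f"{label}: {elapsed:.3f}s")


# Example: wrap any slow call to see how long it takes.
with timed("demo sleep"):
    time.sleep(0.01)
```

Wrapping the slow call in `with timed("search"):` keeps the timing code out of the function body, so it is easy to remove again later.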
@whmountains, I'm seeing things take much less time here, M3 MacBook:
Opening 'select part' is ~1 second. Sorting takes a handful of seconds.
Thanks for checking into that, @chmorgan! I wonder what could be the source of the speed difference? The only thing I can think of is that your M3 MacBook is definitely faster than my 2020 MBP. Still, a factor of 10 seems like a lot. Are we talking about the same function? The slowness I am observing is with
Looks like we are benchmarking the same thing with the same technique. Funny thing happened while testing today. While testing #452 I noticed it was a lot faster and added the timing code. Sure enough So I went back to the latest from the official repo, and am seeing over 14s and a spinning beachball again:
There must be some hidden variable causing the huge speed difference.
For consistency's sake, let's define a standard way of testing. I propose the following:
After following those steps many times trying out different configurations, I have some observations:
I also sporadically observed:
I could not reproduce either of these reliably. But I can say anecdotally that the 13-15s performance band is what I was stuck in on my last real-world project using this tool. Was quite annoying for sure! In summary I think we can say the following:
I currently suspect the disk cache. It makes sense that the DB would be cached after the first query, and a 10x speedup in table scan speed would not be unexpected if the database file happened to be cached in RAM. I have 32 GB of RAM on my MBP, so a lot of caching is certainly happening.
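One rough way to check the cache theory is to run the same query twice and compare the timings. The sketch below assumes a plain SQLite file; `time_query` is a hypothetical helper, and the path and query in the usage note are placeholders, not the plugin's real ones. Each run opens a fresh connection so SQLite's own per-connection page cache is not reused, which leaves the OS file cache as the most likely source of any speedup on the second run.

```python
import sqlite3
import time


def time_query(db_path, sql, params=(), runs=2):
    """Run the same query `runs` times and return each duration in seconds.

    Hypothetical helper for illustration. A new connection is opened per
    run so that SQLite's in-process page cache starts empty every time.
    """
    durations = []
    for _ in range(runs):
        con = sqlite3.connect(db_path)
        try:
            start = time.perf_counter()
            # fetchall() forces the whole result set to be read.
            con.execute(sql, params).fetchall()
            durations.append(time.perf_counter() - start)
        finally:
            con.close()
    return durations
```

Calling something like `time_query("parts.db", "SELECT * FROM parts WHERE description LIKE ?", ("%resistor%",))` (placeholder names) right after a reboot, and again a little later, should show whether the first run is the slow one. This only measures; it cannot control when the OS decides to cache the file.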
Update: I am now quite certain that this is related to the database file getting cached in RAM. Steps to reproduce on macOS:
If disk cache is indeed the issue, it will be hard to reproduce this consistently because it is up to the OS to decide when to cache and when not to cache. On my system, and probably yours as well @chmorgan, the OS has figured out that the parts DB is a hot file and aggressively caches it. On the system of a new user, or on any system with a lot of other apps open, it would be much less likely to get cached. Something in |
Is this also fixed in #503?