Bad API Query Performance (1000ms for a basic bounding box request) #170
Comments
Thanks for investigating! We use the MongoDB SDK (possibly an outdated version) to construct our queries, so I'd need to investigate more as to how to optimize that:
It's also worth noting that on my machine the MongoDB part of the query takes an average of 42ms for a bounding box query with 88 results, so I think the real bottleneck is probably elsewhere, when we marshal the output into the format required for the API. For instance, choosing 'compact=true&verbose=false' requires postprocessing of the db results so we can strip out unwanted parts of the object model, so it transfers less data but takes a little longer (not much though). Our mirrors are the primary servers for API reads (a couple of fairly low spec Linux servers proxied via Cloudflare Workers) and they are susceptible to being flooded by requests, which is one of the reasons why we'd rather people run their own mirrors if they need specific query performance. Any optimisations are appreciated though.
.. and you did this test against the full POI database (117148 records)? Here in docker the MongoDB query time is by far the major bottleneck. With MongoDB query logging enabled I could see that the query took roughly 350ms (on my MacBook, in docker). Without the boundingbox param MongoDB was able to use the index and the query ran in about 10ms(!), so the response time is way better. I was also playing with the compact and verbose params but could not get a significant difference.
Cool, I'm basing my comments on a small benchmark that we have where we start a .NET Stopwatch, perform the query, then stop the stopwatch and record the time. How do you monitor MongoDB query performance? I'm far from being an expert in MongoDB.
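For anyone who wants to reproduce this kind of measurement outside .NET, here is a minimal sketch of the same start/stop stopwatch idea in Python. `time_call` and `fake_query` are hypothetical names, and the stand-in query only sleeps; this is not the OCM benchmark itself.

```python
import statistics
import time


def time_call(func, repeats=5):
    """Run func() `repeats` times and return the elapsed times in milliseconds."""
    timings_ms = []
    for _ in range(repeats):
        start = time.perf_counter()            # start the stopwatch
        func()                                 # perform the query
        elapsed = time.perf_counter() - start  # stop the stopwatch
        timings_ms.append(elapsed * 1000.0)
    return timings_ms


def fake_query():
    """Hypothetical stand-in for a real database call."""
    time.sleep(0.01)


if __name__ == "__main__":
    samples = time_call(fake_query)
    print(f"mean: {statistics.mean(samples):.1f} ms, max: {max(samples):.1f} ms")
```

Recording every sample rather than a single run makes it easier to tell a cold first query from warmed-up ones.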
In case it's relevant, our mirrors run with 2 GB of RAM + swap space; with less than that we quickly run into query performance issues.
And the benchmark is using the full database with 117148 POIs? I think the root of the problem is that the query is not using the index at all, and doing a full collection scan leads to a major performance issue. With more RAM, MongoDB can keep more data cached in memory, so running the same query over and over again should be fast. But in practice the boundingbox param changes all the time, e.g. as the user pans the map, so this will not help much. I suggest using a 2d index (instead of 2dsphere):
"SpatialPosition.coordinates": { "$geoWithin": { "$box": [ [ SW_LON, SW_LAT ] , [NE_LON, NE_LAT ] ] } },
(..) where SW_LON/SW_LAT and NE_LON/NE_LAT are the south-west and north-east corners of the bounding box. To be safe, the query could also use a hint to force the index: https://docs.mongodb.com/manual/reference/operator/meta/hint/ Unfortunately I do not have a .NET environment here nor any .NET knowledge, so I cannot help out with a pull request, otherwise I would love to.
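As an illustration of the filter shape suggested above, here is a small Python sketch that builds the `$geoWithin` / `$box` document. The helper name `bounding_box_filter` is hypothetical, and the corner check is a simplification that does not handle boxes crossing the antimeridian.

```python
def bounding_box_filter(sw_lon, sw_lat, ne_lon, ne_lat,
                        field="SpatialPosition.coordinates"):
    """Build a MongoDB filter document using $geoWithin with a $box shape.

    $box takes [bottom-left, top-right] corners, each as [longitude, latitude].
    """
    # Simplified sanity check (assumption: the box does not cross 180 degrees).
    if sw_lon > ne_lon or sw_lat > ne_lat:
        raise ValueError("south-west corner must not exceed north-east corner")
    return {field: {"$geoWithin": {"$box": [[sw_lon, sw_lat], [ne_lon, ne_lat]]}}}


if __name__ == "__main__":
    print(bounding_box_filter(8.0, 47.0, 9.0, 48.0))
```

With PyMongo, a document like this would typically be passed to `collection.find(...)`, and the cursor's `hint()` method could name the 2d index; that part is not shown because it needs a running database.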
Thanks, yes, the tests are with a recent snapshot of the database and with randomised bounding box coordinates to avoid caching. You can build from source under Linux or macOS using
@webprofusion-chrisc I will go with MS Visual Studio, which is also available for Mac, to play around with the current codebase and test things out.
@webprofusion-chrisc OK, I was able to get along with the lightweight VS Code. About the .NET SDK: for version 3.1, the latest LTS release is 3.1.405 but the project configuration uses 3.1.100. Any reason not to use the latest 3.1 LTS release? I did change the version in Anyway,
@webprofusion-chrisc an update on that: I was able to run the project locally (on the Mac, MongoDB in docker) and I have also played with the MongoDB query logic. Unfortunately the project is using the old "mongocsharpdriver" driver, which basically puts a legacy 1.x compatibility layer on top of https://mongodb.github.io/mongo-csharp-driver/2.11/getting_started/installation/ The legacy 1.x API does not support GeoWithin and related operators AFAICS :( Do you see any chance to update the project to use the current
Thanks, I've actually attempted the upgrade before but had to back it out for reasons that I can't remember. I think it turned into a lot of extra work and testing. We will upgrade eventually though, it's just about devoting free time to do it. I haven't been putting a lot of time into OCM over the last few years in the hope that others would step up a little :)
@webprofusion-chrisc I understand.. the pain of legacy code.. Nevertheless I was able to fix the issue, see #171. I have already tested it locally and on my own OCM mirror and it works nicely! I was also able to lower the RAM limit for the MongoDB docker container to 512M to save some resources, and this works without any performance tradeoffs. BTW, I also updated https://github.com/ev-freaks/ocm-mirror to use an HAProxy container to add gzip compression. This is also a nice add-on, especially for mobile clients, as the gzip compression ratio for the compact JSON payload is about 10:1 ;)
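To get a feel for how well compact JSON compresses, here is a rough Python sketch using only the standard library. The records are synthetic and the field names are illustrative assumptions, so the ratio it prints says nothing about the real OCM payload or the 10:1 figure above.

```python
import gzip
import json


def gzip_ratio(payload: bytes) -> float:
    """Return uncompressed size divided by gzip-compressed size."""
    compressed = gzip.compress(payload)
    return len(payload) / len(compressed)


# Synthetic, made-up records; not real OCM data.
records = [
    {
        "ID": i,
        "AddressInfo": {
            "Title": f"Station {i}",
            "Latitude": 47.0 + i * 0.001,
            "Longitude": 8.0 + i * 0.001,
        },
        "StatusTypeID": 50,
    }
    for i in range(500)
]

# Compact separators mimic a minified JSON response body.
payload = json.dumps(records, separators=(",", ":")).encode("utf-8")

if __name__ == "__main__":
    print(f"{len(payload)} bytes raw, ratio {gzip_ratio(payload):.1f}:1")
```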
Excellent, thanks, I've merged that now. Regarding the gzip compression, we automatically serve using Brotli or Gzip etc. so you shouldn't have to do that. Note that if testing with curl you'll get the uncompressed response unless you specify
So it turns out we do need the original index for geoNear queries, as otherwise the distance-based queries throw an exception. I've patched that now. Your change (for bounding box) is in production now on our API servers. Thanks again.
Oh, my bad, I was just focusing on the boundingbox queries and did not test anything else :( I was just looking at 818a211, looks good. MongoDB should support both 2d and 2dsphere indexes simultaneously.
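A minimal sketch of keeping both index types side by side, written against the PyMongo-style `create_index` call. Putting the 2dsphere index on a `SpatialPosition` GeoJSON field is an assumption (only `SpatialPosition.coordinates` appears in this thread), and `FakeCollection` is a hypothetical stand-in so the example runs without a database.

```python
def ensure_geo_indexes(collection, field="SpatialPosition"):
    """Create a 2dsphere index on the GeoJSON field and a 2d index on its
    coordinate pair, so both kinds of query have an index available."""
    names = []
    # 2dsphere: kept for the distance-based (geoNear) queries.
    names.append(collection.create_index([(field, "2dsphere")]))
    # 2d: flat index on the [longitude, latitude] array, for $box queries.
    names.append(collection.create_index([(field + ".coordinates", "2d")]))
    return names


class FakeCollection:
    """Hypothetical stand-in that records calls instead of talking to MongoDB."""

    def __init__(self):
        self.created = []

    def create_index(self, keys):
        self.created.append(keys)
        # Mimics the "<field>_<type>" naming used for generated index names.
        return "_".join(f"{key}_{kind}" for key, kind in keys)


if __name__ == "__main__":
    print(ensure_geo_indexes(FakeCollection()))
```

With a real PyMongo collection the same function should work unchanged, since `create_index` accepts a list of `(field, type)` pairs.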
This fixes a regression introduced by bounding box queries. See openchargemap#170 and related
This fixes a regression introduced by bc61dca. See openchargemap#170 and related
I have noticed a quite severe performance degradation of the current OCM public API. For example:
This very basic request, just using a bounding box for filtering and limiting the result set to one single record takes about one second to complete.
I also set up an OCM mirror as described here to see if the issue persists and to be able to debug it. And, yes, it does:
I am getting better results using my own mirror (480ms vs. 1094ms) but this is still quite bad..
I was also looking at the MongoDB requests in docker and I saw queries like this:
IMO $within will not use the 2dsphere index. To leverage the 2dsphere index, the $geoWithin operator has to be used, see also https://docs.mongodb.com/manual/reference/operator/query/geoWithin/
For my projects I was able to get much better query performance using the "2d" index (instead of the 2dsphere), btw.
Unfortunately I am a Swift Developer and I do not have any .NET expertise :(