Performance issues with server-side tile composition #183

Open
drnextgis opened this issue Aug 23, 2024 · 5 comments

@drnextgis
Contributor

drnextgis commented Aug 23, 2024

We’re trying to replace our current tile generation method (client-side composition using a /cog endpoint) with a server-side approach using the /searches endpoint. However, this new method is up to five times slower in some tests. Is comparable performance simply not achievable with titiler-pgstac server-side tile composition on the same resources?

Here’s a snippet demonstrating the current approach. The issue is that it results in a high number of requests to the web server. We considered switching to /searches as a potential improvement, but so far, we haven’t achieved comparable performance.

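from concurrent.futures import ThreadPoolExecutor
from io import BytesIO

import requests
from PIL import Image

# Host of the titiler-pgstac deployment; placeholder, not given in the original post
titiler_url = "titiler.example.com"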


def stack_images(images):
    width, height = images[0].size
    new_image = Image.new("RGBA", (width, height))
    for img in images:
        new_image.paste(img, (0, 0), img)
    return new_image


def get_urls(search_id, tile, aname="analytic"):
    # List the items intersecting the tile and collect one asset href per item
    z, x, y = tile
    assets_url = f"https://{titiler_url}/searches/{search_id}/tiles/WebMercatorQuad/{z}/{x}/{y}/assets"

    response = requests.get(assets_url)
    response.raise_for_status()  # fail early instead of parsing an error body
    urls = [item["assets"][aname]["href"] for item in response.json()]

    print(f"Number of assets: {len(urls)}")

    return urls


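def get_tile(url, tile):
    z, x, y = tile
    tile_url = f"https://{titiler_url}/cog/tiles/{z}/{x}/{y}"
    # Pass the query via `params` so the asset URL gets percent-encoded
    params = {"bidx": [1, 2, 3], "format": "png", "scale": 2, "tileMatrixSetId": "WebMercatorQuad", "url": url}

    response = requests.get(tile_url, params=params)
    response.raise_for_status()
    image = Image.open(BytesIO(response.content))

    return image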


if __name__ == "__main__":
    tile = (10, 173, 407)
    search_id = "b9440824baca3a312082e3814a0f5c1b"
    urls = get_urls(search_id, tile)
    with ThreadPoolExecutor() as executor:
        images = list(executor.map(get_tile, urls, [tile] * len(urls)))
        img = stack_images(images)
        img.save("tile01.png")
$ time python local.py
Number of assets: 75
python local.py  5,18s user 0,12s system 77% cpu 6,815 total
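
For comparison, the server-side request could be timed in a similar way. The following is only a sketch under stated assumptions, not the exact request used in these tests: it assumes the mosaic tile route is the `/assets` URL above without the `/assets` suffix and that an `assets` query parameter selects the asset; `titiler.example.com`, `mosaic_tile_url` and `time_request` are hypothetical names.

```python
import time
from urllib.parse import urlencode
from urllib.request import urlopen

TITILER_URL = "titiler.example.com"  # placeholder host (assumption)


def mosaic_tile_url(search_id, tile, aname="analytic", **extra):
    """Build the (assumed) server-side mosaic tile URL for one (z, x, y) tile."""
    z, x, y = tile
    # Extra keyword arguments are appended as additional query parameters.
    query = urlencode({"assets": aname, **extra}, doseq=True)
    return (
        f"https://{TITILER_URL}/searches/{search_id}"
        f"/tiles/WebMercatorQuad/{z}/{x}/{y}?{query}"
    )


def time_request(url):
    """Return (elapsed seconds, body size in bytes) for a single GET."""
    start = time.perf_counter()
    with urlopen(url, timeout=120) as response:
        body = response.read()
    return time.perf_counter() - start, len(body)
```

Calling `time_request(mosaic_tile_url(search_id, (10, 173, 407)))` against a real deployment would give a wall-clock figure for a single server-side tile, which is roughly comparable to the `total` value printed by `time` above (minus interpreter start-up).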
@drnextgis
Contributor Author

drnextgis commented Aug 26, 2024

Here is the configuration we are using:

# GDAL Config
CPL_TMPDIR=/tmp
GDAL_CACHEMAX=75%
GDAL_INGESTED_BYTES_AT_OPEN=32768
GDAL_DISABLE_READDIR_ON_OPEN=EMPTY_DIR
GDAL_HTTP_MERGE_CONSECUTIVE_RANGES=YES
GDAL_HTTP_MULTIPLEX=YES
GDAL_HTTP_VERSION=2
VSI_CACHE=TRUE
VSI_CACHE_SIZE=536870912
MOSAIC_CONCURRENCY=1
AWS_ACCESS_KEY_ID=
AWS_SECRET_ACCESS_KEY=
AWS_SESSION_TOKEN=

Based on my tests, it’s clear that MOSAIC_CONCURRENCY and GDAL_CACHEMAX have the most significant impact:

| GDAL_CACHEMAX             | MOSAIC_CONCURRENCY | Server-side mosaic | Client-side mosaic |
|---------------------------|--------------------|--------------------|--------------------|
| 75% (total memory: 16 GB) | 1                  | 65 s               | 6.989 s            |
| 75% (total memory: 16 GB) | 8 (8 CPUs)         | 11 s               | 6.692 s            |
| 200                       | 8 (8 CPUs)         | 40 s               | 32 s               |

However, even with the same resources, I could not get server-side rendering to match the performance of client-side rendering. Perhaps I need to try with more resources?

All tests were conducted on the same single tile.

When I set MOSAIC_CONCURRENCY=20, the server process gets killed.
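
To put the table in relative terms, the slowdown of the server-side mosaic against the client-side one can be computed from the reported timings. The snippet below only restates the numbers from the table (reading the client-side values in the first two rows as decimal seconds); the variable names are illustrative.

```python
# (GDAL_CACHEMAX, MOSAIC_CONCURRENCY) -> (server-side seconds, client-side seconds)
timings = {
    ("75%", 1): (65.0, 6.989),
    ("75%", 8): (11.0, 6.692),
    ("200", 8): (40.0, 32.0),
}

# Ratio > 1 means the server-side mosaic is slower than the client-side one.
slowdown = {
    config: round(server / client, 2)
    for config, (server, client) in timings.items()
}

for (cachemax, concurrency), ratio in slowdown.items():
    print(f"GDAL_CACHEMAX={cachemax}, MOSAIC_CONCURRENCY={concurrency}: {ratio}x slower")
```

With these figures the gap narrows from about 9.3x at `MOSAIC_CONCURRENCY=1` to about 1.64x at 8, and to 1.25x when both approaches are constrained by `GDAL_CACHEMAX=200`.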

@drnextgis
Contributor Author

drnextgis commented Aug 26, 2024

From what I understand, titiler-pgstac uses mosaic_reader, so I attempted to rewrite the code from my initial message using it (local-rio-tiler.py):

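from rio_tiler.io import Reader
from rio_tiler.mosaic import mosaic_reader

# `urls` is the asset list returned by get_urls() in the first script
z, x, y = 10, 173, 407  # same tile as in local.py


def reader(asset: str, *args, **kwargs):
    # Open one COG and read the requested tile from it
    with Reader(asset) as src:
        return src.tile(*args, **kwargs)


img, assets = mosaic_reader(urls, reader, x, y, z, indexes=[1, 2, 3], tilesize=512, threads=8)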

However, it is much slower (30 s vs 7 s). With the same environment variables as for titiler-pgstac, it performs about the same as the /searches endpoint (as expected), but it is still twice as slow as the approach from the initial message of this thread.

$ time python local.py
python local.py  10,11s user 0,10s system 196% cpu 5,201 total

$ time python local-rio-tiler.py 
python local-rio-tiler.py  35,21s user 2,09s system 310% cpu 11,996 total

@drnextgis
Contributor Author

I have a hypothesis that might explain the observed behavior. In the first case, we use ThreadPoolExecutor solely for I/O-bound tasks (retrieving PNG tiles). mosaic_reader, in contrast, uses ThreadPoolExecutor internally not only to download data but also to reproject it to the tile's CRS and to mosaic the assets, which are CPU-bound tasks. @vincentsarago what do you think?
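
One way to start checking this hypothesis would be to time each per-asset call and compare the summed per-asset time with the overall wall time. The sketch below is an illustration only: `timed` and `fake_reader` are hypothetical helpers, and `fake_reader` sleeps instead of reading a COG. In a real test the wrapper could be applied to the `reader` function passed to `mosaic_reader`. It does not separate download from reprojection; it only shows how long each thread spends per asset.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from threading import Lock


def timed(func, durations, lock):
    """Wrap `func` so each call appends (asset, elapsed seconds) to `durations`."""
    def wrapper(asset, *args, **kwargs):
        start = time.perf_counter()
        try:
            return func(asset, *args, **kwargs)
        finally:
            with lock:
                durations.append((asset, time.perf_counter() - start))
    return wrapper


def fake_reader(asset, *args, **kwargs):
    # Stand-in for the per-asset reader; sleeps instead of reading a COG.
    time.sleep(0.05)
    return asset


durations = []
reader_timed = timed(fake_reader, durations, Lock())
asset_names = [f"asset-{i}" for i in range(8)]

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=8) as executor:
    results = list(executor.map(reader_timed, asset_names))
wall = time.perf_counter() - start

busy = sum(elapsed for _, elapsed in durations)
print(f"wall: {wall:.2f}s, summed per-asset time: {busy:.2f}s")
```

If the per-asset times grow as the thread count increases while the wall time stays roughly flat, that would suggest the threads are competing for CPU rather than waiting on the network.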

@drnextgis
Contributor Author

If there's anything I can do to help move this issue forward, please let me know. At this point, though, I'm leaning towards believing it's a design problem, and without refactoring titiler-pgstac/rio-tiler there may not be much we can do.

@vincentsarago
Member

I think most of the issue is that you're dealing with a large number of assets (75).

As you mentioned, MosaicBackend/rio-tiler is designed to use threads to distribute the asset reading. As described in https://cogeotiff.github.io/rio-tiler/mosaic/#smart-multi-threading, we try to take a smart approach, but sadly sometimes we can't outsmart the task!

If your tile needs to be composed of more than a couple of assets, there is no magic!

That said, I'm always interested to see if we can make rio-tiler/titiler better.
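
The idea behind the linked "smart multi-threading" section, as far as it can be summarised here, is to read assets in chunks and stop as soon as the tile is completely filled. The sketch below is a simplified pure-Python illustration of that idea and not rio-tiler's implementation; `read_mask`, `chunked_mosaic` and the asset dictionaries are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def read_mask(asset):
    # Hypothetical stand-in for a per-asset read: one boolean per pixel,
    # True where the asset has valid data for the tile.
    return asset["mask"]


def chunked_mosaic(assets, threads=2):
    """Read assets chunk by chunk and stop once every pixel is filled."""
    filled = [False] * len(assets[0]["mask"])
    used = []
    with ThreadPoolExecutor(max_workers=threads) as executor:
        for i in range(0, len(assets), threads):
            chunk = assets[i:i + threads]
            for asset, mask in zip(chunk, executor.map(read_mask, chunk)):
                # Mark pixels covered by this asset as filled.
                filled = [f or m for f, m in zip(filled, mask)]
                used.append(asset["name"])
            if all(filled):
                break  # the remaining assets are never read
    return filled, used


assets = [
    {"name": "a", "mask": [True, True, False, False]},
    {"name": "b", "mask": [False, False, True, True]},
    {"name": "c", "mask": [True, True, True, True]},
    {"name": "d", "mask": [True, False, True, False]},
]
filled, used = chunked_mosaic(assets, threads=2)
print(used)  # ['a', 'b']
```

The early exit only helps when a few assets cover the whole tile. If all 75 assets are needed before the tile is filled, every one of them still has to be read, which matches the "no magic" remark above.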


2 participants