Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

k8s-apiserver requests' responses status code like 429 should trigger request retries #436

Open
consideRatio opened this issue Sep 16, 2020 · 2 comments
Labels

Comments

@consideRatio
Copy link
Member

consideRatio commented Sep 16, 2020

Bug description

We have try/except logic that catches various errors from the api-server when making requests. We should try be a bit more smart about them though as often these errors mean we should retry the attempt rather than raising an exception that leads to a 500 error raised by JupyterHub for a user starting a pod for example.

except ApiException as e:
if e.status != 409:
# We only want to handle 409 conflict errors
self.log.exception("Failed for %s", pod.to_str())
raise

Response status codes to consider

Here are k8s api-server response status codes that is recommended to be followed by a retry of the request according to the Kubernetes community documentation.

image

Related

This issue is the outcome of reviewing another pr, see: #433 (review)

@minrk
Copy link
Member

minrk commented Sep 16, 2020

Makes sense, I think a general

async def retrying_api_request():
    try:
        return (await real_api_request()) or True # ensure truthy, or is the response always truthy?
    except ApiException as e:
        if e.status in RETRY_ERROR_CODES:
            # any special-handling of e.g. 429 with may have a specified retry-after header?
            log.warning(f"Error {e.status} in API request: {e}, retrying...")
            return False
        # let any other error raise
        raise

result = await exponential_backoff(retrying_api_request, ...)

wrapper to use for all API requests seems appropriate. This can probably be folded into (or replace) our current asynchronize utility.

@yuvipanda
Copy link
Collaborator

Replacing asynchronize with this sounds like a great idea!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants