Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle empty event response #70

Merged
merged 4 commits into from
Jan 31, 2024
Merged

Conversation

sarita-maersk
Copy link
Contributor

Currently when there are no events scheduled for a node, the endpoint
curl -H Metadata:true http://169.254.169.254/metadata/scheduledevents?api-version=2020-07-01

  • either returns a valid object with empty events array as shown in Scenario-1
  • or returns empty response as shown in Scenario-2

Scenario-1
Screenshot 2024-01-12 at 5 04 12 PM

Scenarioa-2
Screenshot 2024-01-12 at 4 45 52 PM

In our Azure nodes, the termination handler has been exclusively logging json unmarshal errors as the node's scheduled events response is empty.

{
  "error": "error in json.Unmarshal: unexpected end of JSON input",
  "level": "error",
  "msg": "",
  "time": "2024-01-30T15:27:06Z"
}

Receiving empty response from the scheduled events endpoint is a valid scenario, we should not treat this as an error. In this PR we are logging the required message at info level and returns no error if the response body is empty.

@maksim-paskal
Copy link
Owner

@sarita-maersk thanks for your contribution, I never see that behavior in our clusters, in documentation that also never been mention. But if you see that errors in real life - agree, seems that we needed that changed.

How frequently you see that error in your clusters? What about raise log level to WARN?

There are some linter issue, please fix that.

@maksim-paskal maksim-paskal added the enhancement New feature or request label Jan 31, 2024
@maksim-paskal maksim-paskal self-assigned this Jan 31, 2024
@sarita-maersk
Copy link
Contributor Author

Thank you @maksim-paskal for the quick review. Please find my response inline.

How frequently you see that error in your clusters?

We run ~350 spot nodes in our telemetry cluster and the frequency of this error ranges from 1000 to 10,000 in a span of 24hrs.

image

What about raise log level to WARN?

Since Azure document doesn't mention anything about empty response, I am in favour of making this log level to WARN.

There are some linter issue, please fix that.

Fixed.

@codecov-commenter
Copy link

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (2a91db7) 98.53% compared to head (ac09ac8) 98.53%.

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@           Coverage Diff           @@
##             main      #70   +/-   ##
=======================================
  Coverage   98.53%   98.53%           
=======================================
  Files           6        6           
  Lines         205      205           
=======================================
  Hits          202      202           
  Misses          2        2           
  Partials        1        1           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@maksim-paskal maksim-paskal merged commit ed879b5 into maksim-paskal:main Jan 31, 2024
7 checks passed
@maksim-paskal
Copy link
Owner

@sarita-maersk, your changes were released. I also added debug logs for requests in the new release. This can help us debug other responses that aren't well-documented in the future.

Thank for your contribution!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants