Catch panics in `collector` #1800

Kobzol · 2024-01-12T12:15:11Z

The way the site endpoint is currently set up, it expects that each collector `bench_next` invocation that starts a collection will eventually call the `/perf/onpush` endpoint. However, if the collector crashes (which happened today because of a DB timeout), it never calls the endpoint. The next successful collection then marks both collections as completed at once when it calls `/perf/onpush`, which is unintuitive and causes weird situations with GH comments (two comments being posted at once). We should rather eagerly notify the site about the error.

We could perhaps store a special error in the DB to let the site know that a panic has happened, but since the panic can itself be caused by the DB not behaving reliably, it's hard to say whether that would work.

Mark-Simulacrum · 2024-01-12T13:40:32Z

I'm fine with this but I wonder if we should just poke on push within the wrapper script? Or in addition? That seems more reliable to me than catching panics, and should be equally effective?

Kobzol · 2024-01-12T14:54:22Z

We would need to somehow communicate to the bash script that a benchmark took place though (otherwise it shouldn't call onpush). Or make bench_next blocking, but that would not keep the collector in sync with git pushes.

Kobzol · 2024-01-12T15:02:19Z

onpush can't be triggered more than once per minute, so it would probably be harmless to call it a second time from the bash script, although it's quite hacky. If the second call didn't land within the 60-second interval, it could in theory finish an ongoing perf run, but there should not be any ongoing perf run, since the script won't start a new one until onpush is called.

This means that it should be ok to just call the endpoint always, from the script. We will just spam the website with a request every two minutes, but it should work.
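To illustrate why a duplicate call inside the 60-second window would be ignored, here is a minimal sketch of a once-per-interval throttle. It is not the site's actual code: `Throttle` and `try_trigger` are hypothetical names, and the caller passes the current time in explicitly so the behavior is easy to check.

```rust
use std::time::{Duration, Instant};

/// Hypothetical throttle: accepts at most one trigger per `min_interval`.
/// This only illustrates the once-per-minute rule discussed above.
struct Throttle {
    min_interval: Duration,
    last_accepted: Option<Instant>,
}

impl Throttle {
    fn new(min_interval: Duration) -> Self {
        Self { min_interval, last_accepted: None }
    }

    /// Returns `true` if the trigger at `now` is accepted,
    /// `false` if it falls inside the interval and is ignored.
    fn try_trigger(&mut self, now: Instant) -> bool {
        match self.last_accepted {
            // A previous trigger was accepted too recently: ignore this one.
            Some(prev) if now.duration_since(prev) < self.min_interval => false,
            // First trigger, or the interval has elapsed: accept and record it.
            _ => {
                self.last_accepted = Some(now);
                true
            }
        }
    }
}
```

With a 60-second interval, a second trigger 30 seconds after an accepted one is dropped, while one arriving after the interval has elapsed is accepted again.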

Mark-Simulacrum

Happy to merge as-is or with the every-2-minutes-when-idle hit.

Kobzol · 2024-01-15T21:07:19Z

I thought about it a bit more and I don't like the idle call, because calling onpush will basically clear the site's cache (both the DB index and the landing page). This would mean that if there's no collection going on, the site would be needlessly refreshing its cache all the time. That doesn't seem good.

So let's try the panic catching approach.
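A minimal sketch of the general idea, assuming the collector is built with the default `panic = "unwind"` strategy: wrap the benchmarking step in `std::panic::catch_unwind` and notify the site whether or not it panicked. The names `CollectionOutcome`, `run_catching_panics`, and `collect_and_notify` are hypothetical and are not taken from the actual commits; `notify` stands in for whatever calls `/perf/onpush`.

```rust
use std::panic::{self, AssertUnwindSafe};

/// Outcome of one collection attempt (hypothetical type for illustration).
#[derive(Debug, PartialEq)]
enum CollectionOutcome {
    Completed,
    Panicked(String),
}

/// Runs `bench` and turns a panic into a value instead of unwinding further.
fn run_catching_panics<F: FnOnce()>(bench: F) -> CollectionOutcome {
    // `AssertUnwindSafe` states that we accept observing possibly
    // inconsistent state after a panic; here we only report the failure.
    match panic::catch_unwind(AssertUnwindSafe(bench)) {
        Ok(()) => CollectionOutcome::Completed,
        Err(payload) => {
            // Panic payloads are usually a `&str` or a `String`.
            let message = if let Some(s) = payload.downcast_ref::<&str>() {
                s.to_string()
            } else if let Some(s) = payload.downcast_ref::<String>() {
                s.clone()
            } else {
                "unknown panic payload".to_string()
            };
            CollectionOutcome::Panicked(message)
        }
    }
}

/// Runs one collection and always calls `notify` afterwards,
/// so the site hears about the collection even if it panicked.
fn collect_and_notify<F, N>(bench: F, notify: N) -> CollectionOutcome
where
    F: FnOnce(),
    N: FnOnce(&CollectionOutcome),
{
    let outcome = run_catching_panics(bench);
    notify(&outcome);
    outcome
}
```

Note that `catch_unwind` only helps with unwinding panics; it does nothing if the binary is compiled with `panic = "abort"` or if the process is killed from outside.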

Kobzol added 3 commits January 12, 2024 13:04

Refactor collector result handling

65c438f

Extract async runtime creation into a separate function

c4897ba

Handle panics in bench_next

dca8f85

Kobzol requested a review from Mark-Simulacrum January 12, 2024 12:15

Mark-Simulacrum approved these changes Jan 15, 2024

Kobzol merged commit e982ff3 into rust-lang:master Jan 15, 2024
11 checks passed

Kobzol deleted the collector-catch-panic branch January 15, 2024 21:09
