WIP: Add signals to expose internal Waitress events #388
In order to expose how loaded waitress is, this commit adds a set of signals. They act as an API allowing external code to react to events like the creation of a channel or the start of execution of a task. This allows third-parties to develop extensions to track the load of Waitress via their favorite monitoring solutions. Support for signals depends on the availability of the "blinker" library. When it is not available, signals act as a no-op and cannot be subscribed to.
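The optional-dependency behaviour described above could look roughly like the following. This is a minimal sketch, not the code from this PR: the signal names, the `HAVE_BLINKER` flag and the `_NoopSignal` fallback are hypothetical, and raising on `connect` is only one possible reading of "cannot be subscribed to".

```python
try:
    # Real blinker API: a Namespace hands out named signals.
    from blinker import Namespace

    _signals = Namespace()
    HAVE_BLINKER = True
except ImportError:
    HAVE_BLINKER = False

    class _NoopSignal:
        """Hypothetical stand-in used when blinker is not installed."""

        def __init__(self, name):
            self.name = name

        def send(self, sender=None, **kwargs):
            # Nothing can be subscribed, so nothing is ever called.
            return []

        def connect(self, receiver, *args, **kwargs):
            raise RuntimeError("blinker is required to subscribe to signals")

    class _NoopNamespace:
        def signal(self, name):
            return _NoopSignal(name)

    _signals = _NoopNamespace()

# Hypothetical signal names, chosen for illustration only.
task_started = _signals.signal("task-started")
task_finished = _signals.signal("task-finished")

# Emitting is always safe, whether or not blinker is installed.
task_started.send("server-1", task_id=1)
```

The emitting side never needs to check whether blinker is present; only code that wants to subscribe does.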
This is a proposed solution for #182 using signals. I included an example that uses the exposed signals to show the number of worker threads busy handling requests. Executing it and sending a few requests leads to:
Instead of just printing those, users could feed these numbers to Prometheus and start monitoring the load of their clusters. |
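For readers without the PR checked out, counting busy workers from "task started" and "task finished" events can be sketched as below. This is not the example shipped with the PR: `BusyWorkerCounter` and its method names are hypothetical, and the events are fired by hand instead of coming from Waitress.

```python
import threading


class BusyWorkerCounter:
    """Counts tasks that have started but not yet finished."""

    def __init__(self):
        self._lock = threading.Lock()  # events arrive from worker threads
        self._busy = 0

    def task_started(self, sender=None, **kwargs):
        with self._lock:
            self._busy += 1

    def task_finished(self, sender=None, **kwargs):
        with self._lock:
            self._busy -= 1

    @property
    def busy(self):
        with self._lock:
            return self._busy


counter = BusyWorkerCounter()
counter.task_started("server-1")
counter.task_started("server-1")
counter.task_finished("server-1")
print(f"busy workers: {counter.busy}")  # prints "busy workers: 1"
```

The `busy` value is what a monitoring integration would export as a gauge.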
Using the
We want the listener to be able to know which server (which interface/port/socket) is emitting the events. The BaseWSGIServer has no public API and even if it did, it does not expose any useful information that relates back to either the specific socket it's serving, or the invocation of

An alternative that is more explicit (and requires no dependencies) is:

    class EventListener:
        def on_task_started(self, task):
            pass

        def on_task_finished(self, task):
            pass

        def on_channel_added(self, channel):
            pass

        def on_channel_deleted(self, channel):
            pass

    class MyEventListener(EventListener):
        def on_task_started(self, task):
            print(f'task started: {id(task)}')

    serve(app, listen='*:8080', listener=MyEventListener())

The other piece of feedback I have right now is that this strategy exposes a lot of objects that are not currently public APIs. For example, BaseWSGIServer, Channel, Task. I hope that we can avoid this and only expose enough info to correlate the data, without allowing interacting with the objects. Thoughts? |
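One way to expose only enough information to correlate events, without handing out the internal objects, is to pass small immutable records to the listener. The sketch below is an assumption about what that could look like: `TaskEvent`, its fields, `RecordingListener` and `run_task` are all hypothetical and are not part of waitress.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskEvent:
    """Plain identifiers only; no reference to server, channel or task objects."""
    server_id: str   # e.g. the listen address
    channel_id: int
    task_id: int


class EventListener:
    def on_task_started(self, event):
        pass

    def on_task_finished(self, event):
        pass


class RecordingListener(EventListener):
    def __init__(self):
        self.events = []

    def on_task_started(self, event):
        self.events.append(("started", event))

    def on_task_finished(self, event):
        self.events.append(("finished", event))


def run_task(listener, server_id, channel_id, task_id, work):
    """Stand-in for a server's task loop (not waitress code)."""
    event = TaskEvent(server_id, channel_id, task_id)
    listener.on_task_started(event)
    try:
        return work()
    finally:
        # Always report completion, even if the task raised.
        listener.on_task_finished(event)


listener = RecordingListener()
result = run_task(listener, "127.0.0.1:8080", 1, 1, lambda: "ok")
```

Because the record is frozen and holds only identifiers, a listener can correlate start and finish events but cannot reach into the server.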
Definitely, I used that as a placeholder for now but a string identifying the server would be better. I was not sure if the
The example API you propose would work for this specific case but it has some limitations, off the top of my head:
I get the benefit of not introducing a dependency but I have the feeling that if we roll our own we will just end up with a worse reimplementation of blinker (which is a tiny lib with a permissive license).
That is a good point I had not considered; we should define what data each event receives. |
Basically the only one that isn’t an implementation detail of the user-defined EventListener in your examples is the serve issue. My question is: when do you not control the serve call but you’re comfortable starting a separate server on your own to expose metrics? Shouldn’t those lifecycles basically match or the metrics server should exist first in the process? |
True, but I guess I am questioning a bit the point of providing an API so low-level that users have to implement their own solutions for things like using two integrations simultaneously. Given the fact that a more feature-rich and battle-tested API already exists and is ready to be used, I am struggling to understand the holdup. Is adding an entirely optional dependency to waitress such a big deal?
I agree that this particular concern may be far-fetched, but as a reminder, the Prometheus approach of exposing metrics via an HTTP endpoint is far from the norm. Sentry, Datadog, statsd... all work in the opposite direction and do not require a separate server. |
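To illustrate the push direction mentioned above, here is a minimal sketch of a listener that emits statsd gauge lines (`name:value|g`) whenever the busy count changes. The `StatsdListener` class, the metric name and the `udp_sender` helper are hypothetical; the counter is not thread-safe and would need a lock in a threaded server.

```python
import socket


def statsd_gauge(name, value):
    """Format a statsd gauge line, e.g. b"waitress.busy_workers:3|g"."""
    return f"{name}:{value}|g".encode("ascii")


class StatsdListener:
    def __init__(self, send, prefix="waitress"):
        self._send = send      # callable taking the encoded payload
        self._prefix = prefix
        self._busy = 0         # not thread-safe; illustration only

    def on_task_started(self, task):
        self._busy += 1
        self._send(statsd_gauge(f"{self._prefix}.busy_workers", self._busy))

    def on_task_finished(self, task):
        self._busy -= 1
        self._send(statsd_gauge(f"{self._prefix}.busy_workers", self._busy))


def udp_sender(host="127.0.0.1", port=8125):
    """Return a callable that pushes payloads to a statsd daemon over UDP."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    return lambda payload: sock.sendto(payload, (host, port))


# Collect payloads in a list instead of sending them, to show the output.
sent = []
listener = StatsdListener(send=sent.append)
listener.on_task_started(object())
listener.on_task_finished(object())
```

In a real process, `StatsdListener(send=udp_sender())` would push each update to the daemon as it happens, with no metrics endpoint to serve.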
Using blinker or another listener API doesn't really change this. Waitress should just be emitting events when things happen and allow the listener to aggregate data just like in your examples. The fact that blinker allows multiple listeners is nice but not a dealbreaker.
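If only a single listener were accepted, supporting several integrations at once could still be done outside of waitress with a small fan-out wrapper. A minimal sketch, assuming the four-method listener interface proposed earlier in this thread; `FanOutListener` and `Recorder` are hypothetical names.

```python
class EventListener:
    def on_task_started(self, task):
        pass

    def on_task_finished(self, task):
        pass

    def on_channel_added(self, channel):
        pass

    def on_channel_deleted(self, channel):
        pass


class FanOutListener(EventListener):
    """Forwards every event to each wrapped listener, in order."""

    def __init__(self, *listeners):
        self._listeners = listeners

    def _dispatch(self, method, arg):
        for listener in self._listeners:
            getattr(listener, method)(arg)

    def on_task_started(self, task):
        self._dispatch("on_task_started", task)

    def on_task_finished(self, task):
        self._dispatch("on_task_finished", task)

    def on_channel_added(self, channel):
        self._dispatch("on_channel_added", channel)

    def on_channel_deleted(self, channel):
        self._dispatch("on_channel_deleted", channel)


class Recorder(EventListener):
    def __init__(self):
        self.seen = []

    def on_task_started(self, task):
        self.seen.append(task)


first, second = Recorder(), Recorder()
fan_out = FanOutListener(first, second)
fan_out.on_task_started("task-1")
fan_out.on_channel_added("channel-1")  # Recorder inherits the no-op handler
```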
I'm mainly hung up on the API and how to expose the information. Using Server:
Channel:
Task:
|
This is something that can be distributed outside of waitress, and means that simple integrations where there is no need to dynamically allow adding/removing of listeners and other complexities are handled by someone implementing them in their own
I really don't like the idea of the code doing different things depending on whether or not some dependency is installed into the local Python (conditional imports have caused me pain more than once), and I also don't like the idea of the additional complexity when it is not necessary for 99% of the use cases (simple emitter sending stats to statsd for instance, or JSON lines to vector). Even then there is nothing stopping you from creating an
We already pass things down from serve through the adjustments internals and there is no reason why that couldn't be used for this use case either. Unfortunately stuff is already fairly intertwined/tangled together as is, so getting that stuff down as necessary is not an issue. It also allows you to trivially launch two different waitress servers in a single process without having to manually keep track of |
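As a rough illustration of the simple-emitter case, and of giving each server its own listener instance, here is a sketch that writes one JSON line per event. `JsonLinesListener`, the record fields and the server names are hypothetical; how a listener would actually be handed to each server is not settled in this thread.

```python
import io
import json
import sys
import time


class JsonLinesListener:
    """Writes one JSON object per line for every event it receives."""

    def __init__(self, server_name, stream=None):
        self.server_name = server_name
        self.stream = stream if stream is not None else sys.stdout

    def _emit(self, event, obj):
        record = {
            "server": self.server_name,  # which server emitted the event
            "event": event,
            "id": id(obj),               # correlation id only
            "ts": time.time(),
        }
        self.stream.write(json.dumps(record) + "\n")

    def on_task_started(self, task):
        self._emit("task_started", task)

    def on_task_finished(self, task):
        self._emit("task_finished", task)


# Two listeners, one per server, sharing an output stream.
out = io.StringIO()
public = JsonLinesListener("public", out)
admin = JsonLinesListener("admin", out)

task = object()
public.on_task_started(task)
admin.on_task_started(task)

records = [json.loads(line) for line in out.getvalue().splitlines()]
```

Since each instance carries its own server name, events from two servers in one process stay distinguishable without any global bookkeeping.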
Got it, too bad I could not convince you of the awesomeness of signals. |