The Prometheus middleware exports a gauge metric, dramatiq_delayed_messages_inprogress, that is documented to track "The number of delayed messages in memory." Something appears to be wrong with this metric: in my system it never goes down.
My setup:
dramatiq: 1.15.0
broker: redis
workers: 4
processes per worker: 2
threads per process: 1
Looking over the code, I can see that the middleware listens to the before_delay_message and before_process_message hooks and tracks message IDs to record when a message is first delayed and when that delay completes. Looking at the actual worker implementation, however, it looks like messages get re-queued in the broker when their eta arrives instead of being processed directly on the same worker process. (I am also unsure whether a message retains the same message ID when it is requeued.)
All this seems to create a situation where the process that first marked a message as in progress is unlikely to be the process that observes its completion, so the gauge is not decremented correctly. Additionally, this appears to leak memory (albeit slowly and not much), because the set of delayed message IDs grows over time.
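To make the suspected failure mode concrete, here is a minimal, dependency-free sketch. It is not dramatiq's actual code: `ProcessLocalTracker` and `simulate` are hypothetical names, the plain integer stands in for the Prometheus gauge, and the model assumes the message keeps its ID when it is requeued (which I have not confirmed). It only illustrates what happens when one process records the delay and a different process handles the requeued message.

```python
class ProcessLocalTracker:
    """Hypothetical stand-in for the per-process state the middleware keeps."""

    def __init__(self):
        self.delayed_ids = set()  # IDs this process believes are still delayed
        self.gauge = 0            # stand-in for the in-progress gauge

    def before_delay_message(self, message_id):
        # Called in the process that first receives the delayed message.
        self.delayed_ids.add(message_id)
        self.gauge += 1

    def before_process_message(self, message_id):
        # Called in whichever process picks up the requeued message.
        # The gauge only drops if this same process saw the delay.
        if message_id in self.delayed_ids:
            self.delayed_ids.remove(message_id)
            self.gauge -= 1


def simulate(num_messages):
    """Delay every message in process A, then process it in process B."""
    process_a = ProcessLocalTracker()
    process_b = ProcessLocalTracker()
    for i in range(num_messages):
        message_id = f"msg-{i}"  # assumed unchanged across the requeue
        process_a.before_delay_message(message_id)
        process_b.before_process_message(message_id)
    # Summed gauge across processes, and the number of IDs never cleaned up.
    return process_a.gauge + process_b.gauge, len(process_a.delayed_ids)


if __name__ == "__main__":
    total_gauge, leaked_ids = simulate(100)
    print(total_gauge, leaked_ids)
```

In this model the summed gauge only ever increases and the ID set in the delaying process is never emptied, which matches both symptoms described above. If the same process happens to handle the requeued message, the decrement does occur, so with 8 worker processes the metric would drift upward rather than being wrong on every message.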
Would be happy to contribute/discuss a patch to fix this metric if desired!
msg555 changed the title from "Prometheus middleware dramatiq_delayed_messages_inprogress metric does not work" to "Prometheus middleware dramatiq_delayed_messages_inprogress not tracking completed messages correctly" on Jan 3, 2025