Support Webhooks for Kamaji Events #489
Replies: 2 comments 3 replies
-
Webhook integrations are not very common in Kubernetes controllers. I think the main reason is that Kubernetes already has good alternatives.
-
@JonnyBDev Thank you for giving Kamaji a try. Happy to see you're finding it valuable for your project. As highlighted by @prometherion, implementing webhooks in Kamaji for status notifications is out of scope and, frankly, not the best technical choice. However, we totally understand your business need for status notifications and, since a separate tool makes much more sense, we're open to helping you develop one. Feel free to reach out to us in private so we can continue the discussion. Thank you.
-
We would love to see support for webhooks in Kamaji for specific or all events from the operator. As of now, the operator performs a lot of actions, but from an external perspective you can't really tell the status of these actions. Some examples of these actions are:
TenantControlPlane
TenantControlPlane
TenantControlPlane
Unfortunately, we only know the status of the current action by polling the TenantControlPlane. And even with polling, we cannot be 100% sure that the action has been executed, due to potential delays in the operator. The only option we have right now is to delete the TenantControlPlane. After the TenantControlPlane is gone, we should consider it deleted and not available anymore. But that's just wrong. Deleting a TenantControlPlane triggers a reconciliation cycle which deletes all resources in the given namespace that are connected to the TenantControlPlane. If something goes wrong here, or the operator is still working on other actions, we cannot safely terminate the namespace due to finalizers. Not being able to delete resources because of running finalizers is a poor fit for automated workflows such as the deployment of a Kubernetes cluster. The risk of leftover artifacts in a cluster with a high number of TCPs is pretty high in that scenario.

One example would be the benchmark with 100 TCPs. You stated that with 100 TCPs it would take around 7.5 minutes for ONE reconciliation cycle (4.5s / TCP). Let us draw an example use case around this.
Imagine we would like to upgrade all our 100 TCPs to the next Kubernetes version. We would fire 100 API calls to Kubernetes at once to modify the TenantControlPlane with the latest version number. We would then have to permanently poll all TCPs to get the current status. As you mentioned in your benchmark, next up would be a test with 1000 TCPs which would, if we assume a linear increase, put the reconciliation cycle at 75 minutes. This would make mass-updating clusters very complex and inefficient.

We would like to see something implemented that notifies us instead of us polling all TCPs. As described above, every action the operator executes should end in a configurable notification webhook. This way MSPs could just start, say, 1000 TCP upgrades and get notified with the latest status after each upgrade has happened.

Another open question is the current state of the cluster. Is it "Ready" or "NotReady"? Is it experiencing problems? We would have to poll each TCP for its current status basically every 5 seconds instead of getting notified in real time right after an event occurred.

Webhooks are the state-of-the-art technique for communicating events. You may say that this would be out of scope for Kamaji because you are just providing the tooling and not the platform around it. I think that's the wrong approach for software like this. You are obviously providing a great tool, but if you want Kamaji to be used by big MSPs, you should give them the tools to build an automation platform around it, with features like webhooks.
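To put rough numbers on the scenario above, here is a small Go sketch. It assumes the linear scaling of 4.5s per TCP from the benchmark holds, that every TCP is polled once every 5 seconds for the length of one reconciliation cycle, and that a webhook-based setup would emit one START and one END notification per TCP. The function names are made up for this illustration and are not part of Kamaji.

```go
package main

import (
	"fmt"
	"time"
)

// perTCP is the per-TCP reconciliation time quoted from the benchmark (4.5s / TCP).
const perTCP = 4500 * time.Millisecond

// pollInterval is the 5-second polling interval mentioned in the post.
const pollInterval = 5 * time.Second

// cycleDuration estimates one full reconciliation cycle, assuming linear scaling.
func cycleDuration(tcps int) time.Duration {
	return time.Duration(tcps) * perTCP
}

// pollRequests counts the status requests issued when every TCP is polled
// once per pollInterval for the length of one reconciliation cycle.
func pollRequests(tcps int) int {
	pollsPerTCP := int(cycleDuration(tcps) / pollInterval)
	return tcps * pollsPerTCP
}

// webhookCalls counts notifications, assuming each TCP emits one START
// and one END event (an assumption made for this comparison).
func webhookCalls(tcps int) int {
	return 2 * tcps
}

func main() {
	for _, n := range []int{100, 1000} {
		fmt.Printf("%4d TCPs: cycle %v, %d poll requests, %d webhook calls\n",
			n, cycleDuration(n), pollRequests(n), webhookCalls(n))
	}
}
```

Under these assumptions, 100 TCPs mean a 7m30s cycle and 9,000 poll requests versus 200 notifications; 1000 TCPs mean a 1h15m cycle and 900,000 poll requests versus 2,000 notifications.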
There are a lot of advantages in Webhooks vs polling. Just to name a few:
One example for an integration in tenantcontrolplane_controller.go would be here. Besides logging the request in the container, just add an HTTP call to a configurable external webhook URL. My approach would be to send a notification for each event the operator receives. Just send a "Starting to delete TCP" (START) and a "TCP started successfully" (END) webhook and everyone who uses and automates your software will be forever grateful.

Thanks for reading this :)
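The HTTP call suggested above could look roughly like the following Go sketch. It is only an illustration using the standard library: the Event type, its fields, the notify function and the demo receiver are all hypothetical and do not exist in Kamaji, and the throwaway local server merely stands in for whatever endpoint an MSP would configure.

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// Event is a hypothetical notification payload; Kamaji defines no such type.
type Event struct {
	TenantControlPlane string `json:"tenantControlPlane"`
	Namespace          string `json:"namespace"`
	Action             string `json:"action"` // e.g. "delete" or "upgrade"
	Phase              string `json:"phase"`  // "START" or "END"
}

// notify POSTs the event as JSON to url and treats any non-2xx reply as an error.
func notify(ctx context.Context, client *http.Client, url string, ev Event) error {
	body, err := json.Marshal(ev)
	if err != nil {
		return fmt.Errorf("encode event: %w", err)
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return fmt.Errorf("build request: %w", err)
	}
	req.Header.Set("Content-Type", "application/json")
	resp, err := client.Do(req)
	if err != nil {
		return fmt.Errorf("send webhook: %w", err)
	}
	defer resp.Body.Close()
	if resp.StatusCode < 200 || resp.StatusCode > 299 {
		return fmt.Errorf("webhook returned %s", resp.Status)
	}
	return nil
}

// demo starts a throwaway local receiver, sends one START event to it,
// and returns the event as decoded by the receiver.
func demo() (Event, error) {
	received := make(chan Event, 1)
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		var ev Event
		if err := json.NewDecoder(r.Body).Decode(&ev); err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		received <- ev
		w.WriteHeader(http.StatusNoContent)
	}))
	defer srv.Close()

	// Bound the call so a slow receiver cannot block the caller indefinitely.
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	sent := Event{TenantControlPlane: "tenant-00", Namespace: "default", Action: "delete", Phase: "START"}
	if err := notify(ctx, srv.Client(), srv.URL, sent); err != nil {
		return Event{}, err
	}
	return <-received, nil
}

func main() {
	ev, err := demo()
	if err != nil {
		fmt.Println("notification failed:", err)
		return
	}
	fmt.Printf("receiver got %s/%s: %s %s\n", ev.Namespace, ev.TenantControlPlane, ev.Action, ev.Phase)
}
```

The request carries a context with a timeout so that an unreachable endpoint only costs a bounded delay. In a real controller one would probably also want retries and to make sure a failed notification never fails the reconciliation itself.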