As a korifi operator I want to be able to use my own log cache endpoint #3668

Open
georgethebeatle opened this issue Dec 17, 2024 · 1 comment

Comments

@georgethebeatle
Member

georgethebeatle commented Dec 17, 2024

Dev Notes

  • Currently Korifi implements the api/v1/read Log Cache endpoint in order to get logs for an app
  • Currently Korifi queries the metrics service CRDs for pod stats in order to satisfy the processes/stats endpoint
  • In CF for VMs, process stats are also fetched from the Log Cache API.
  • We should implement the api/v1/query endpoint in the log cache handler and make the process stats repository loop back to this endpoint
  • Furthermore, we should make the Log Cache API endpoint configurable in the helm chart so that operators can override it
  • If no value is provided, we will keep looping back to the naive implementation by default
  • Log cache api spec
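
The override-with-fallback behaviour described in the last two bullets could look roughly like the sketch below. It is illustrative only: the function name `resolve_log_cache_url` and both parameters are hypothetical, and it says nothing about how the value would actually be named in the helm chart or wired into Korifi.

```python
from typing import Optional
from urllib.parse import urlparse


def resolve_log_cache_url(override: Optional[str], korifi_api_url: str) -> str:
    """Pick the Log Cache base URL.

    `override` stands for a (hypothetical) operator-supplied value.
    When it is empty, loop back to Korifi's own API, which serves the
    naive built-in implementation.
    """
    if override is None or not override.strip():
        return korifi_api_url.rstrip("/")

    candidate = override.strip()
    parsed = urlparse(candidate)
    # Reject values that are not absolute http(s) URLs.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"invalid log cache URL: {override!r}")
    return candidate.rstrip("/")
```

With no override the function returns the Korifi API URL unchanged (minus a trailing slash), so existing installations would keep their current behaviour.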
@chombium

chombium commented Dec 18, 2024

Hi,

CF for VMs supports various log types, which are practically the app logs themselves combined with logs that are written by the platform components and are related to an app. Examples include access logs, staging logs, app restarts, rescheduling and similar. All these logs are stored in Log Cache and are available through the Log Cache API. It would be nice to have all these app and app lifecycle logs available, but we should find out how and where to get them from.

I've tested and compared what kind of logs are printed by cf logs in CF-for-VMs and Korifi during and after different cf lifecycle events. Note: cf logs practically calls the Log Cache v1/read API for the given app with the limit=200 parameter.
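
For reference, a minimal sketch of the request that such a call boils down to. The helper name `build_read_url` and the example host are made up for illustration; the path layout (`/api/v1/read/<source-id>`) follows the Log Cache API, and using the app GUID as the source ID is an assumption here.

```python
from urllib.parse import quote, urlencode


def build_read_url(base_url: str, app_guid: str, limit: int = 200) -> str:
    """Build a Log Cache v1/read URL for one app.

    The default limit of 200 mirrors what is described above for cf logs.
    """
    # Percent-encode the source ID so it is safe as a single path segment.
    path = f"/api/v1/read/{quote(app_guid, safe='')}"
    return f"{base_url.rstrip('/')}{path}?{urlencode({'limit': limit})}"


print(build_read_url("https://log-cache.example", "my-app-guid"))
```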

So far in my tests, the cf push output in Korifi is pretty similar, if not identical, to the one on CF-for-VMs. For other commands like cf restage and cf restart, Korifi only outputs the app logs, and the other platform related logs have to be fetched via the pod events: kubectl get events <pod>. In Korifi we practically have to combine data from various places to get a "cf logs" output as close as possible to what we have on CF-for-VMs.

The other aspect to think about is that CF for VMs follows the Loggregator API format and observability metadata. We need to decide whether we want to use the same format and, if so, where and how we get the data from.

Regarding the api/v1/query endpoint: it is a Prometheus-compatible API, so if we can make the metrics available via Prometheus, we only need to map the API endpoint somewhere.
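
To make the shape of such a request concrete, here is a small sketch that builds an instant-query URL. The helper name `build_query_url` and the host are hypothetical, and the metric name `cpu` and the `source_id` label are assumptions about how the metrics would be exposed, not something Korifi provides today.

```python
from urllib.parse import urlencode


def build_query_url(base_url: str, metric: str, source_id: str) -> str:
    """Build a Prometheus-style instant query URL for one app metric."""
    # PromQL selector, e.g. cpu{source_id="my-app-guid"} (assumed naming).
    promql = f'{metric}{{source_id="{source_id}"}}'
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


print(build_query_url("https://log-cache.example", "cpu", "my-app-guid"))
```

If the metrics already live in Prometheus, the same query string could be forwarded to it mostly unchanged, which is what makes a plain endpoint mapping attractive.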

I guess the /api/v1/meta endpoint would be out of scope if we simply read the logs from the k8s API server. We could add something afterwards if needed.
Update: The meta endpoint shows the state of Log Cache and how many data entries (logs and metrics) for apps and platform components are stored inside it. We could add something similar once we decide what to use as a caching layer, if a cache is needed at all. If we don't need to cache anything, it is safe to leave out this endpoint.

One important thing to think about is how to collect and merge the logs and metrics into a single output (API) when an app has multiple instances (a workload with multiple pods).
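
One possible way to do the merge, sketched under the assumption that each pod's log stream is already sorted by timestamp: a k-way merge on the timestamp. The `LogLine` type and its fields are hypothetical and only serve the example.

```python
import heapq
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class LogLine:
    timestamp_ns: int  # time the line was emitted, in nanoseconds
    instance: str      # pod / app instance the line came from
    message: str


def merge_instance_logs(streams: Iterable[Iterable[LogLine]]) -> Iterator[LogLine]:
    """Lazily merge per-instance streams into one stream ordered by time.

    Each input stream must already be sorted by timestamp_ns.
    """
    return heapq.merge(*streams, key=lambda line: line.timestamp_ns)


pod_0 = [LogLine(1, "0", "starting"), LogLine(4, "0", "ready")]
pod_1 = [LogLine(2, "1", "starting"), LogLine(3, "1", "ready")]
for line in merge_instance_logs([pod_0, pod_1]):
    print(line.timestamp_ns, f"[{line.instance}]", line.message)
```

The merge is lazy, so it never needs to hold all lines of all instances in memory at once.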

For the implementation of the API we could take a look at the log-cache-release and the log-cache-cf-cli plugin. It will be interesting to check and decide whether a simple API facade on top of the k8s API would be enough, or whether we need a central cache component which collects everything and serves the data through the Log Cache API.

Update: Log Cache uses Syslog to ingest logs and metrics, but in k8s Syslog is barely used, so if we need short term storage to serve the Log Cache API, we could take one of the off-the-shelf observability backends and put an API facade in front of it. It is suggested that Log Cache stores the data for at least 15 minutes.
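
To illustrate what "short term storage" with a 15 minute window means in practice, here is a toy in-memory sketch. It is not how Log Cache is implemented and not a proposal for the actual backend; the class name `RetentionBuffer` is hypothetical, and the code assumes entries arrive in timestamp order.

```python
from collections import deque
from typing import Deque, List, Tuple

RETENTION_SECONDS = 15 * 60  # the suggested minimum retention


class RetentionBuffer:
    """Keep only entries that are newer than the retention window."""

    def __init__(self, retention_seconds: float = RETENTION_SECONDS) -> None:
        self._retention = retention_seconds
        self._entries: Deque[Tuple[float, str]] = deque()

    def add(self, timestamp: float, message: str) -> None:
        # Assumes timestamps are non-decreasing across calls.
        self._entries.append((timestamp, message))
        self._evict(now=timestamp)

    def read(self, now: float, limit: int = 200) -> List[str]:
        """Return up to `limit` of the newest retained messages, oldest first."""
        self._evict(now)
        if limit <= 0:
            return []
        return [message for _, message in list(self._entries)[-limit:]]

    def _evict(self, now: float) -> None:
        cutoff = now - self._retention
        while self._entries and self._entries[0][0] < cutoff:
            self._entries.popleft()
```

An off-the-shelf backend would replace this class entirely; the facade in front of it would only need to honour the same retention and limit semantics.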

btw. I'm one of the maintainers of the Loggregator, the CF's logging and metrics stack and I want to help ;)

Projects
Status: 🧊 Icebox
Development

No branches or pull requests

2 participants