Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Live debugging service with prometheus.relabel #797

Merged
merged 12 commits into from
Jun 3, 2024
Merged

Live debugging service with prometheus.relabel #797

merged 12 commits into from
Jun 3, 2024

Conversation

wildum
Copy link
Contributor

@wildum wildum commented May 8, 2024

THIS PR WILL BE MERGED TO A FEATURE BRANCH WHICH WILL RECEIVE A FEW OTHER PRs BEFORE BEING READY TO BE MERGED ON MAIN

This PR is the first part of the live debugging functionality.

It creates live debugging as a service and connects the prometheus.relabel component to it (an easy first candidate).

A stream to a component can be opened via http://localhost:12345/api/v0/web/debug/ + componentID.
This adds a callback to the x-ray service that the component can retrieve and use to pass data as a string which will be directly flushed to the HTTP client.

It supports components that are inside modules.

Fixes #790

@wildum wildum changed the title Xray service with prometheus.relabel Live debugging service with prometheus.relabel May 8, 2024
@wildum wildum force-pushed the xray-service branch 2 times, most recently from da55944 to 47650ee Compare May 8, 2024 14:55
internal/web/api/api.go Outdated Show resolved Hide resolved
@mattdurham
Copy link
Collaborator

Looks reasonable, tested both wide 1,000 streams and deep 1 stream with 1m metrics. Long term some changes around when to flush and the channel would be nice. But performance is totally reasonable.

@rfratto rfratto added the backport-to-agent:no PR should NOT be backported to the agent repo. label May 14, 2024
Copy link
Contributor

@thampiotr thampiotr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good start! some comments around readability and naming.

internal/alloy/componenttest/componenttest.go Outdated Show resolved Hide resolved
internal/alloycli/cmd_run.go Show resolved Hide resolved
internal/component/prometheus/relabel/relabel.go Outdated Show resolved Hide resolved
internal/component/prometheus/relabel/relabel.go Outdated Show resolved Hide resolved
internal/component/prometheus/relabel/relabel.go Outdated Show resolved Hide resolved
internal/service/ui/ui.go Outdated Show resolved Hide resolved
internal/web/api/api.go Outdated Show resolved Hide resolved
internal/service/livedebugging/debug_stream_manager.go Outdated Show resolved Hide resolved
internal/service/livedebugging/debug_stream_manager.go Outdated Show resolved Hide resolved
internal/service/livedebugging/debug_stream_manager.go Outdated Show resolved Hide resolved
@wildum
Copy link
Contributor Author

wildum commented May 24, 2024

@ptodev @thampiotr thanks for your feedback!
I made some changes to try to improve naming and readability. I also split the interface in two: DebugCallbackRegistry and DebugDataPublisher.
I'm now using two interfaces to clearly separate the roles between the debug data publishers (the components) and the API that sets the callbacks. I'm not sure where these interfaces should be defined so I left them in the livedebugging package.
Please have another look at it when you have time :)

Copy link
Collaborator

@ptodev ptodev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, LGTM!

}

// Buffer of 1000 entries to handle load spikes and prevent this functionality from eating up too much memory.
// TODO: in the future we may want to make this value configurable to handle heavy load
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we should track these TODOs somewhere in a GitHub issue, cause after we're done reviewing these partial PRs, we may miss them when merging to main.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I created this issue to track the todos: #974
Not sure if we do the optimizations before we merge it to main. It might be better to have more components that supports the feature to identify and solve the pain points

@wildum wildum merged commit 1523af0 into xray Jun 3, 2024
15 checks passed
@wildum wildum deleted the xray-service branch June 3, 2024 10:22
wildum added a commit that referenced this pull request Jun 7, 2024
* create xray service to manage the debug streams

* add xray support to the prometheus.relabel component

* add xray endpoint to the api

* rename handler to manager

* defer cleanup func

* rework to add multi-streams support

* more tests

* additional multi delete test

* add todo comment for buffer size

* add error checks

* improve naming and readability, split interfaces and add a check to avoid expensive string computation

* move register and isregistered to a new debugRegistry interface
@wildum wildum mentioned this pull request Jun 7, 2024
5 tasks
wildum added a commit that referenced this pull request Jun 11, 2024
* Live debugging service with prometheus.relabel (#797)

* create xray service to manage the debug streams

* add xray support to the prometheus.relabel component

* add xray endpoint to the api

* rename handler to manager

* defer cleanup func

* rework to add multi-streams support

* more tests

* additional multi delete test

* add todo comment for buffer size

* add error checks

* improve naming and readability, split interfaces and add a check to avoid expensive string computation

* move register and isregistered to a new debugRegistry interface

* Live debugging front (#977)

* add live debugging page in UI

* change button size

* move icon in front of text in buttons

* add warning if the number of incoming lines is greater than the buffer size

* Live debugging doc (#987)

* add live debugging doc

* Update docs/sources/tasks/debug.md

Co-authored-by: Clayton Cornell <[email protected]>

* Update docs/sources/tasks/debug.md

Co-authored-by: Clayton Cornell <[email protected]>

* Update docs/sources/tasks/debug.md

Co-authored-by: Clayton Cornell <[email protected]>

* improve doc

* Update docs/sources/tasks/debug.md

Co-authored-by: Clayton Cornell <[email protected]>

---------

Co-authored-by: Clayton Cornell <[email protected]>

* changelog

* add list of supported components in debug.md

* switch to pull based registration

* switch stability to GA

* update changelog

* use monospace font for log lines

* copy to clipboard now copies the filtered data

* disable livedebugging by default and add a config block to enable it. The stability is also updated to experimental

* Update docs/sources/reference/config-blocks/livedebugging.md

Co-authored-by: Clayton Cornell <[email protected]>

* Update docs/sources/reference/config-blocks/livedebugging.md

Co-authored-by: Clayton Cornell <[email protected]>

* replace debugging data by live debugging data in the doc

* update screenshots

* Update docs/sources/reference/config-blocks/livedebugging.md

Co-authored-by: Clayton Cornell <[email protected]>

* Update docs/sources/reference/config-blocks/livedebugging.md

Co-authored-by: Clayton Cornell <[email protected]>

* small doc fix

---------

Co-authored-by: Clayton Cornell <[email protected]>
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jul 8, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
backport-to-agent:no PR should NOT be backported to the agent repo. frozen-due-to-age
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants