Process flows more efficiently in network observer #1940

Merged Feb 10, 2025 (2 commits)

c-kruse commented Feb 3, 2025

The network observer handles an in-memory buffer of application- and transport-level flow records and associated state that can be very large. This fix reorganizes the data structure that holds flow state, and the reconcile tasks that operate on that state, to avoid reallocating and copying the full collection on each iteration. Fixes #1942 by greatly reducing the CPU cycles spent copying data structures and running garbage collection.
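
To illustrate the kind of change described above, here is a minimal Go sketch contrasting a full copy of the collection with visiting entries in place. It is not the network observer's actual code: `FlowState`, `FlowStore`, `Snapshot`, `Range`, and `CountActive` are hypothetical names chosen for this example, and the locking scheme is an assumption.

```go
package main

import "sync"

// FlowState is a hypothetical stand-in for the per-flow state the observer keeps.
type FlowState struct {
	ID     string
	Active bool
	Octets uint64
}

// FlowStore is a hypothetical container for flow state, guarded by a mutex.
type FlowStore struct {
	mu    sync.RWMutex
	flows map[string]*FlowState
}

func NewFlowStore() *FlowStore {
	return &FlowStore{flows: make(map[string]*FlowState)}
}

// Put stores a copy of f under its ID, replacing any existing entry.
func (s *FlowStore) Put(f FlowState) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.flows[f.ID] = &f
}

// Snapshot copies every entry into a new slice. Each call allocates
// memory proportional to the size of the store, which is the costly
// pattern when it runs on every reconcile iteration.
func (s *FlowStore) Snapshot() []FlowState {
	s.mu.RLock()
	defer s.mu.RUnlock()
	out := make([]FlowState, 0, len(s.flows))
	for _, f := range s.flows {
		out = append(out, *f)
	}
	return out
}

// Range calls visit for each entry without building an intermediate slice
// and stops early if visit returns false. The read lock is held while
// visiting, so visit must not call Put.
func (s *FlowStore) Range(visit func(*FlowState) bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	for _, f := range s.flows {
		if !visit(f) {
			return
		}
	}
}

// CountActive is an example reconcile-style task that only needs a tally,
// so it never has to copy the collection.
func CountActive(s *FlowStore) int {
	n := 0
	s.Range(func(f *FlowState) bool {
		if f.Active {
			n++
		}
		return true
	})
	return n
}
```

A task that previously called something like `Snapshot()` and then looped over the result could instead pass its loop body to `Range`, trading a per-iteration allocation for a read lock held slightly longer.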

c-kruse force-pushed the collector-flow-cache-iterator branch from e10fd57 to ad7c61a on February 3, 2025 22:37

c-kruse commented Feb 4, 2025

Example resource usage for the latest v2-dev (yellow) network-observer vs pull-1940 (blue) on a VAN servicing roughly 1000 requests per second.

Memory usage: about 25% lower.
[image: memory usage graph]

CPU usage: much lower.
[image: CPU usage graph]

$ oc adm top pod
NAME                                          CPU(cores)   MEMORY(bytes)
skupper-router-84dc449fcd-gfzf4               250m         71Mi
pull-1940-network-observer-7dccf945f5-dnf5h   55m          1839Mi
v2-dev-network-observer-648f5f8f6c-vsxl6      701m         2398Mi
...

c-kruse force-pushed the collector-flow-cache-iterator branch from ad7c61a to ef45b13 on February 4, 2025 19:48
c-kruse marked this pull request as ready for review on February 4, 2025 19:49

The network observer handles an in-memory buffer of application- and
transport-level flow records and associated state that can be very
large. This changes how those collections are traversed for processing.
Previously, the collections of flow state were copied on each reconcile
loop, resulting in high CPU and memory utilization. Now each reconcile
job copies only the relevant portions by using an iterator and filters.

Signed-off-by: Christian Kruse <[email protected]>
c-kruse force-pushed the collector-flow-cache-iterator branch 2 times, most recently from 6018751 to 9887401 on February 5, 2025 23:01
c-kruse force-pushed the collector-flow-cache-iterator branch from 9887401 to 7c88cc3 on February 5, 2025 23:08
c-kruse self-assigned this on Feb 5, 2025
c-kruse merged commit 0cc78af into skupperproject:main on Feb 10, 2025
1 check passed
Linked issue: network-observer CPU utilization unreasonably high under load