
[WIP] Add grow and shrink functions to REAPI #1316

Open · wants to merge 7 commits into master from grow-api

Conversation

@milroy (Member) commented Dec 4, 2024

This PR exposes grow functions in the C and C++ REAPI bindings.

The grow functionality passes a JGF subgraph including the path from the cluster vertex to the subgraph root. For example, a JGF subgraph that adds a new node (newnode) and its subnode resources to cluster0 at rack0 includes the cluster and rack vertices as well as the induced edges. The disadvantage of this approach is that the vertex metadata in the JGF subgraph needs to be specified in enough detail to identify the vertices that already exist in the graph (e.g., cluster0).
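For concreteness, here is a minimal sketch of what that payload could look like, wrapped in a tiny Go program so it is runnable on its own (the vertex ids, metadata fields, and path format are illustrative assumptions, not taken from this PR): cluster0 and rack0 are repeated so the existing vertices can be matched, and newnode plus the induced edges are the only genuinely new elements.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Illustrative JGF subgraph for growing cluster0/rack0 with one new node.
// The exact metadata Fluxion requires is an assumption here; the point is
// that cluster0 and rack0 already exist in the resource graph and must be
// specified well enough to be identified, while newnode is genuinely new.
const growSubgraph = `{
  "graph": {
    "nodes": [
      {"id": "0", "metadata": {"type": "cluster", "name": "cluster0",
        "paths": {"containment": "/cluster0"}}},
      {"id": "1", "metadata": {"type": "rack", "name": "rack0",
        "paths": {"containment": "/cluster0/rack0"}}},
      {"id": "2", "metadata": {"type": "node", "name": "newnode",
        "paths": {"containment": "/cluster0/rack0/newnode"}}}
    ],
    "edges": [
      {"source": "0", "target": "1"},
      {"source": "1", "target": "2"}
    ]
  }
}`

func main() {
	// Sanity-check that the payload is well-formed JSON before it would be
	// handed to a grow call.
	var v map[string]interface{}
	if err := json.Unmarshal([]byte(growSubgraph), &v); err != nil {
		panic(err)
	}
	fmt.Println("grow subgraph parses cleanly")
}
```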

The PR is WIP, because we'll need to sort out whether to implement the REAPI module functions and determine if it's preferable to pass a path to the attachment point rather than include the path in the JGF subgraph.

@milroy milroy force-pushed the grow-api branch 2 times, most recently from df1e228 to 10f0f74 Compare January 15, 2025 02:27
@milroy milroy changed the title [WIP] Add grow functions to REAPI [WIP] Add grow and shrink functions to REAPI Jan 15, 2025

codecov bot commented Jan 15, 2025

Codecov Report

Attention: Patch coverage is 2.98507% with 65 lines in your changes missing coverage. Please review.

Project coverage is 75.0%. Comparing base (5ae7459) to head (10f0f74).

Files with missing lines                          Patch %   Lines
resource/reapi/bindings/c++/reapi_cli_impl.hpp    0.0%      65 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff            @@
##           master   #1316     +/-   ##
========================================
- Coverage    75.3%   75.0%   -0.4%     
========================================
  Files         111     111             
  Lines       16042   16109     +67     
========================================
+ Hits        12081   12083      +2     
- Misses       3961    4026     +65     
Files with missing lines                          Coverage Δ
resource/readers/resource_reader_jgf.cpp          71.0% <ø> (ø)
resource/utilities/command.cpp                    77.8% <100.0%> (+<0.1%) ⬆️
resource/reapi/bindings/c++/reapi_cli_impl.hpp    34.1% <0.0%> (-5.5%) ⬇️

@vsoch (Member) commented Jan 15, 2025

Flux in Kubernetes has told me "Yeaaaah man, I'm gonna get YUUUUGE!"

@vsoch (Member) commented Jan 15, 2025

This is tested (and the basics are working in fluxion-go)! 🥳

We start with cluster tiny0: one rack and two nodes (node0 and node1). We demonstrate that if we ask for 4 nodes, the request cannot be satisfied:

Asking to MatchSatisfy 4 nodes (not possible)

        ----Match Satisfy output---
satisfied: false
error: <nil>

We then ask fluxion to grow from 2 to 4 nodes. We do that with a grow request that includes an existing path in the graph, tiny0->rack0, and then defines two new nodes, node2 and node3.

🍔 Asking to Grow from 2 to 4 Nodes
Grow request return value: <nil>

We can verify that the graph now has node0 through node3 (4 nodes) by asking for 4 nodes with satisfy again. We are satisfied!

Asking to MatchSatisfy 4 nodes (now IS possible)

        ----Match Satisfy output---
satisfied: true
error: <nil>

We now want to test shrink. Shrink takes the path of the node at which to prune. For this case, we just prune off one node.

🥕 Asking to Shrink from 4 to 3 Nodes
Shrink request return value: <nil>

When we have 3 nodes we ask for 4 again, and we are no longer satisfied.

Asking to MatchSatisfy 4 nodes (again, not possible)

        ----Match Satisfy output---
satisfied: false
error: <nil>

The testing shown above is here in GitHub CI (run across OSes), and that full PR is here; it will just need to be updated when the branch here is merged (or we can work from this branch if that isn't going to happen soon - I can build and deploy a custom container that has it).

How will shrink work in Kubernetes?

High level - I'm thinking through the shrink design for a cluster in Kubernetes, and I think we have two use cases (that warrant different design strategies):

1. A need to shrink down in increments of 1

This will work fine to prune single nodes.

2. A need to shrink down in unknown increments (>1)

This could be a lot of requests to fluxion, for example, if we want to shrink down by 10, 20, or more nodes at once. We have a few options, I think. I'll think through each one.

  • We can expose a function in fluxion-go that takes a list of nodes, and then (within the same call to fluxion-go) makes multiple calls to fluxion. That mostly replaces many gRPC requests (one per node) with a single request that handles the batch internally (see the sketch after this list).

  • If we have an understanding of the increments, we can design a cluster graph with abstract levels of racks, where each rack is a group of nodes that are intended to be brought down together. Maybe we would design that graph based on topology. That way, we can do one request to trim the rack, and it will cut off all the nodes.
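A minimal sketch of the first option, assuming a hypothetical single-path Shrink method on the client (the interface name, method signature, and containment-path format are assumptions for illustration, not the actual fluxion-go API):

```go
package main

import "fmt"

// shrinker stands in for whatever fluxion-go client method ends up doing a
// single-path shrink; the name and signature are assumptions for this sketch.
type shrinker interface {
	Shrink(path string) error
}

// shrinkAll is the batching helper described above: one call from the
// caller's point of view, many single-node shrink requests internally.
func shrinkAll(cli shrinker, paths []string) error {
	for _, p := range paths {
		if err := cli.Shrink(p); err != nil {
			return fmt.Errorf("shrink %s: %w", p, err)
		}
	}
	return nil
}

// fakeClient records the pruned paths so the sketch runs on its own.
type fakeClient struct{ pruned []string }

func (f *fakeClient) Shrink(path string) error {
	f.pruned = append(f.pruned, path)
	return nil
}

func main() {
	cli := &fakeClient{}
	// Hypothetical containment paths for the nodes to prune.
	_ = shrinkAll(cli, []string{"/tiny0/rack0/node2", "/tiny0/rack0/node3"})
	fmt.Println("pruned:", cli.pruned)
}
```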

I haven't looked at what fluxion is doing in terms of the actual shrink (happening during a traversal?) but if there is an operation that can handle multiple cuts at once (e.g., done during one traversal) that could be an idea. But based on my impression of infrequent scaling, I don't really think optimizing this up the wazoo right now is that necessary.

How will grow design work in Kubernetes?

For the first case (a flat topology that has nodes added to it), we likely just need to get the highest node identifier in the graph and then generate the JSON request starting at that identifier plus 1 (for however many nodes we need). That can be stored as a variable somewhere, and on restart it can be recalculated from the live cluster.
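As a rough sketch of that bookkeeping (the node<N> naming convention and the helper are assumptions for illustration, not part of fluxion or fluxion-go):

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// nextNodeNames scans the existing node names (e.g. recovered from the live
// cluster on restart), finds the highest numeric suffix, and returns the
// next n names to use in the grow request.
func nextNodeNames(existing []string, n int) []string {
	re := regexp.MustCompile(`^node(\d+)$`)
	highest := -1
	for _, name := range existing {
		if m := re.FindStringSubmatch(name); m != nil {
			if i, err := strconv.Atoi(m[1]); err == nil && i > highest {
				highest = i
			}
		}
	}
	names := make([]string, 0, n)
	for i := 1; i <= n; i++ {
		names = append(names, fmt.Sprintf("node%d", highest+i))
	}
	return names
}

func main() {
	// With node0 and node1 present, growing by 2 yields node2 and node3,
	// matching the walkthrough above.
	fmt.Println(nextNodeNames([]string{"node0", "node1"}, 2))
}
```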

For the second case, with multiple racks each holding children, we would likely apply the same strategy as above but at the level of the rack, where the number of children under a rack is constant. We would calculate node indices from the number of racks and the expected nodes per rack. If racks hold different numbers of nodes (e.g., different applications requiring different-sized increments), we could label each rack with metadata that records exactly how many children it has.
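And a similarly hedged sketch for the rack-based case, assuming each rack owns a contiguous, fixed-size block of node indices:

```go
package main

import "fmt"

// rackNodeNames returns the node names that belong to a given rack when
// every rack holds a fixed number of nodes; the naming scheme and the
// fixed-size assumption are illustrative, as discussed above.
func rackNodeNames(rack, nodesPerRack int) []string {
	names := make([]string, 0, nodesPerRack)
	for k := 0; k < nodesPerRack; k++ {
		names = append(names, fmt.Sprintf("node%d", rack*nodesPerRack+k))
	}
	return names
}

func main() {
	// rack1 with 4 nodes per rack owns node4 through node7.
	fmt.Println(rackNodeNames(1, 4))
}
```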

Anyway - there are many ways to skin an avocado! Thanks for finishing this up @milroy I'm super pumped to see it working, and (TBA) to get it merged and deployed and into some of our projects!
