plugin: track the resources used across all of an association's running jobs #561

cmoussa1 · 2025-01-09T00:34:01Z

Problem

The priority plugin does not keep track of the resources used by an association across all of their running jobs, which will be needed when the plugin begins to enforce limits on how many resources they can have across all of their running jobs.

This PR looks to add the tracking component of an association's resource consumption across all of their running jobs. It does this by copying over the jj (along with a required streq function) code over from flux-core and utilizing the jj_get_counts_json () function to get a total nnodes and ncores count. When a job enters job.state.run, the association's current nnodes and ncores count will get incremented accordingly, and subsequently decremented when the job enters job.state.inactive.

I've also added some basic tests that check tracking resources across an association's set of jobs as they go through their lifecycle.

I probably need to hold this PR as [WIP] until I properly add the license for the code I copied over from ccan over in flux-core; for now, I just manually copied the streq () function to the jj.h file just to get a working solution, but should clean this up.

jameshcorbett

Looks good 👍 two little questions.

jameshcorbett · 2025-01-09T21:06:41Z

src/plugins/mf_priority.cpp

    b->cur_run_jobs--;
+    if (jobspec) {


under what conditions would jobspec be NULL? Just wondering because then a user might get stuck with the system permanently thinking they have active resources when in fact they don't.

The only case I can think of where jobspec might be NULL is if for some reason the jobtap plugin failed to unpack the jobspec in flux_plugin_arg_unpack (). Perhaps the plugin should raise an exception on the job if it fails to unpack it?

Pretty sure that can't happen - the "pack" operation would fail and not call this.

Oh you said unpack (sorry). You probably want to immediately fail if the unpack fails (like due to out of memory).

src/plugins/jj.hpp

garlick · 2025-01-09T21:55:29Z

For running jobs, would it make more sense to grab R instead of jobspec? In jobspec, the node count can be ambiguous if the user only requested cores, for example. But R always contains the actual count.

Problem: flux-core has some useful code for counting the resources for a job, but it needs to be manually copied over. Copy the jj.* files from flux-core and place them in the same folder as the priority plugin so that it can be compiled together and have its methods able to be accessed by the plugin. Manually copy over the streq() definition and place it in jj.hpp as well as copy over its LICENSE from flux-core. Reference it in jj.hpp.

Problem: An Association object has no way to track its current node and core count across all of its running jobs, which is needed in order to enforce limits in case a submitted job would put them over their max. Add two new members to the Association class, cur_nodes and cur_cores, which will store the current number of nodes and cores allocated across all of the association's running jobs, respectively. Update the unit test to account for the addition of the two new members added to the Association class.

Problem: The priority plugin does not keep track of the resources allocated per-job when an association submits a job, which will be needed when the plugin begins to enforce limits on how many resources an association can use across all of their running jobs. Add an increment/decrement of node+core counts in job.state.run and job.state.inactive by unpacking the jobspec of the job and calculating the number of nodes and cores assigned with the job. Increment and decrement these values in the "cur_nodes" and "cur_cores" attributes of the Association object.

Problem: There exists no tests for tracking the number of resources used across an association's running jobs and ensuring that the counts are as expected. Add some tests.

cmoussa1 · 2025-01-09T22:58:27Z

@garlick that's a good suggestion! I picked jobspec because I was able to use the jj_get_counts() function to get the resource counts, but I didn't know I could also do this with R. Is that the case?

Also, when the plugin ultimately needs to enforce a limit due to exceeding a certain max, I believe this check would need to happen in job.state.depend. I think the jobtap man page states that the R can only be accessed in RUN state or later. So, if I were to use R in one callback and jobspec in another, is there a chance I could end up with a discrepancy in the actual resource count? Apologies if I'm misunderstanding.

garlick · 2025-01-09T23:34:51Z

Yes, R is only available at RUN state.

On node scheduled, homogeneous systems, jobspec should be fine. That's everything we have in production today AFAIK.

However for core scheduled and/or homogeneous systems (different node:core counts depending on the node), if the user requests N cores, you may not know how many nodes that is until the scheduler assigns them.

Maybe that's OK for now where we are now!

cmoussa1 · 2025-01-09T23:49:16Z

Thank you for the clarification! Yes, the latter case is absolutely something to consider - I think at this point, just to make some progress on this front, maybe using jobspec (and/or making an estimation as best we can) should be okay.

cmoussa1 added new feature new feature plugin related to the multi-factor priority plugin labels Jan 9, 2025

cmoussa1 force-pushed the plugin.track.resources branch 2 times, most recently from e42e60e to a4fbc4e Compare January 9, 2025 20:40

cmoussa1 changed the title ~~[WIP] plugin: track the resources used across all of an association's running jobs~~ plugin: track the resources used across all of an association's running jobs Jan 9, 2025

cmoussa1 marked this pull request as ready for review January 9, 2025 20:51

cmoussa1 requested a review from jameshcorbett January 9, 2025 20:51

jameshcorbett approved these changes Jan 9, 2025

View reviewed changes

cmoussa1 added 4 commits January 9, 2025 14:52

t: add tests for tracking resources across jobs

4494d3c

Problem: There exists no tests for tracking the number of resources used across an association's running jobs and ensuring that the counts are as expected. Add some tests.

cmoussa1 force-pushed the plugin.track.resources branch from a4fbc4e to 4494d3c Compare January 9, 2025 23:12

cmoussa1 added the merge-when-passing label Jan 13, 2025

mergify bot merged commit 0ffd06e into flux-framework:master Jan 13, 2025
13 checks passed

This was referenced Jan 13, 2025

plugin: enforce max resource limits across an association's running jobs #559

Open

plugin: enforce max resource limits per-association #562

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

plugin: track the resources used across all of an association's running jobs #561

plugin: track the resources used across all of an association's running jobs #561

cmoussa1 commented Jan 9, 2025 •

edited

Loading

jameshcorbett left a comment

jameshcorbett Jan 9, 2025

cmoussa1 Jan 9, 2025

garlick Jan 9, 2025

garlick Jan 9, 2025

garlick commented Jan 9, 2025

cmoussa1 commented Jan 9, 2025

garlick commented Jan 9, 2025

cmoussa1 commented Jan 9, 2025

plugin: track the resources used across all of an association's running jobs #561

plugin: track the resources used across all of an association's running jobs #561

Conversation

cmoussa1 commented Jan 9, 2025 • edited Loading

Problem

jameshcorbett left a comment

Choose a reason for hiding this comment

jameshcorbett Jan 9, 2025

Choose a reason for hiding this comment

cmoussa1 Jan 9, 2025

Choose a reason for hiding this comment

garlick Jan 9, 2025

Choose a reason for hiding this comment

garlick Jan 9, 2025

Choose a reason for hiding this comment

garlick commented Jan 9, 2025

cmoussa1 commented Jan 9, 2025

garlick commented Jan 9, 2025

cmoussa1 commented Jan 9, 2025

cmoussa1 commented Jan 9, 2025 •

edited

Loading