consider removing the priority plugin feature holding jobs in PRIORITY when plugin has no flux-accounting data? #384

cmoussa1 · 2023-09-12T18:29:56Z

cmoussa1
Sep 12, 2023
Maintainer

I'm writing this up as a thought and opening it for discussion in case anybody else has input. The more I work with the priority plugin and continue to add features (project support, job updates, and more in the future), the more I wonder how useful it is for the plugin to hold a job when it has no flux-accounting data (i.e., user/bank data).

It was brought up a couple of months ago that users were confused when their jobs were held in PRIORITY state on a cluster because their user/bank information had not yet been entered into the flux-accounting database. Their job was accepted but would never run. So it was proposed that we revert to the original behavior: if a user/bank submits a job but has no entry in the flux-accounting DB, the job is rejected with the following message:

$ flux submit my_job
no bank found for user: <my_userid>

However, if the plugin has no user/bank data at all, any submitted job should be held in PRIORITY until the plugin receives the required data.
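To make the two cases above concrete, here is a minimal Python sketch of the decision as I understand it. It is only an illustration and not the plugin's actual code (the plugin itself is not written in Python); `validate_job`, `Decision`, and the `known_users` mapping are hypothetical names invented for this example.

```python
from enum import Enum


class Decision(Enum):
    ACCEPT = "accept"
    HOLD = "hold in PRIORITY"
    REJECT = "reject"


def validate_job(userid, known_users):
    """Decide what happens to a job submitted by `userid`.

    `known_users` is a hypothetical stand-in (userid -> bank) for the
    user/bank data the plugin has received from the flux-accounting DB.
    """
    if not known_users:
        # The plugin has no flux-accounting data at all: hold the job
        # in PRIORITY until the data arrives.
        return Decision.HOLD
    if userid not in known_users:
        # Data is loaded but this user has no entry: reject the job
        # ("no bank found for user: <my_userid>").
        return Decision.REJECT
    return Decision.ACCEPT


if __name__ == "__main__":
    data = {5001: "bankA"}  # made-up userid and bank name
    for uid, users in [(5001, {}), (5002, data), (5001, data)]:
        print(uid, validate_job(uid, users).value)
```

The proposal in this post amounts to dropping the first branch, so that an empty `known_users` would fall through to the reject case.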

The reason I think the plugin should consider dropping support for holding jobs in PRIORITY while it waits for user/bank information is that this seems like a very rare case: the plugin would have to be loaded with no flux-accounting data at all while jobs are being submitted. The flux-accounting priority plugin is meant to be run in conjunction with a configured flux-accounting database, and the plugin should not be loaded if users/banks aren't set up. I could be wrong, but when flux-accounting is built and installed and cron jobs are set up, any available information in the database is immediately sent to the loaded priority plugin via etc/01-flux-account-priority-update:

#!/bin/bash

if test $(flux getattr rank) -eq 0 \
    && flux jobtap list | grep -q mf_priority.so; then
     flux account-priority-update || true
fi

I could also definitely be wrong about how flux-accounting is currently administered on our machines. Perhaps the priority plugin is loaded but it takes a while for the flux-accounting database to be set up, for users/banks to be added, and for that information to be sent to the plugin, all while users are already trying to submit and run jobs (perhaps input from @ryanday36 might be helpful here?).

Removing the feature of holding jobs in PRIORITY when the plugin has no information might clean up the plugin code considerably and, at the same time, clearly define the plugin's purpose: if the plugin is loaded and there is no user/bank information, it should reject a submitted job because it cannot find the appropriate user and bank to associate the job with.

I could absolutely be missing a contrasting point here, though, and am more than open to any additional thoughts that others might have. I just figured I would write out my current thinking on where the priority plugin stands and how it should behave when it has user/bank data vs. when it has none.

grondo · 2023-09-12T20:32:38Z

grondo
Sep 12, 2023
Maintainer

I think the case to think about here is startup with pending jobs.

We should carefully step through how this works, but I think the whole purpose of the PRIORITY state was to allow for exactly this case: a priority plugin has been loaded but does not yet have enough data to calculate the priority of a job. This could be because that initial RPC is still in flight, or because we haven't reached the part of the rc script above.
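As a rough illustration of the startup case, here is a small Python sketch. It assumes a simplified model in which pending jobs are presented to the plugin before the first accounting update arrives; `PluginSketch`, its methods, and `PLACEHOLDER_PRIORITY` are hypothetical names for this example and do not reflect the real plugin or the jobtap API.

```python
PLACEHOLDER_PRIORITY = 1000  # made-up value; the real calculation is not modeled


class PluginSketch:
    def __init__(self):
        self.users = {}        # userid -> bank; empty until the first update
        self.held = []         # (jobid, userid) pairs waiting in PRIORITY
        self.priorities = {}   # jobid -> assigned priority

    def job_seen(self, jobid, userid):
        """Handle a new job or a pending job presented again at startup."""
        if not self.users:
            # No data yet (e.g. the initial RPC is still in flight): hold.
            self.held.append((jobid, userid))
            return "held"
        if userid not in self.users:
            return "rejected"
        self.priorities[jobid] = PLACEHOLDER_PRIORITY
        return "prioritized"

    def update(self, users):
        """Accounting data arrives; release held jobs that can now be resolved."""
        self.users = dict(users)
        still_held = []
        for jobid, userid in self.held:
            if userid in self.users:
                self.priorities[jobid] = PLACEHOLDER_PRIORITY
            else:
                still_held.append((jobid, userid))
        self.held = still_held


if __name__ == "__main__":
    plugin = PluginSketch()
    print(plugin.job_seen(1, 5001))   # pending job seen before any data
    plugin.update({5001: "bankA"})    # made-up userid and bank name
    print(plugin.priorities, plugin.held)
```

In this model, if the hold branch were replaced with a rejection, a job that was already pending across a restart would be rejected simply because the data had not arrived yet, which is the case worth stepping through before removing the feature.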
