-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
database: add the ability to remove old job records from jobs
table
#459
Conversation
38a6d98
to
b98595b
Compare
b98595b
to
ee6184c
Compare
when do you expect users to run this command? I didn't know much about VACUUM but this on the sqlite website concerned me
suggesting to me that scrub should only be done on a non-live system? |
I only expect this to be run by admins on an infrequent periodic basis when they no longer require jobs past a certain date. Maybe I could just remove the |
Yeah, I think that's a better idea, just documenting how to do it manually. Also, do we know how long VACUUM takes? I suspect it can take a while so maybe putting it in the code is a bad idea?? |
86f8fbe
to
7aa8523
Compare
OK sounds good. I've just force pushed a change to remove the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Everything looks good o me. But I can't help but wonder if a "flux scrub-job-records" command would be better?
I guess the question is this scrubbing a one-off thing that you expect admins to do (presumably) during a downtime/maintenance/cleanup/something?
Or do you expect people to add this to a cron along with the normal call to flux account-fetch-job-records
?
If the former, maybe a new command? If the latter, then I think this is good.
Just a subtle design choice to consider. You would know best on how people would probably use this.
That's a good point. I guess I was expecting this to be added onto the set of cron jobs already running for flux-accounting, i.e every time new jobs were fetched and stored into the flux-accounting DB, old ones could be removed. That way, admins don't have to worry about running a separate command to clean the DB of records themselves. However, I could absolutely see wanting to make this a one-off command that is only run every so often. In that case, I can just make this a separate command entirely. @ryanday36 might you have a preference on what the approach is here? Would you rather scrubbing of old jobs occur periodically and in the background (with every call to |
My first thought is that I don't have a strong preference for |
I think that's totally fair. I'll go ahead and modify this PR to instead just introduce a separate command instead of tacking on an optional argument to |
Problem: There currently exists no way to clean the "jobs" table in the flux-accounting DB of job records that no longer affect a user/bank's job usage and fairshare values. This means that as time goes on, the "jobs" table will grow without bound as more and more jobs are run on a system. Add a new command to flux-accounting called "scrub-old-jobs" which takes an integer argument representing the number of weeks to go back. It will remove any job records older than the number of weeks passed in, e.g "flux account scrub-old-jobs 4" will remove any job older than 4 weeks old.
Problem: There is no test confirming that only privileged users can run flux account scrub-old-jobs. Add a test.
7aa8523
to
5f187ff
Compare
@@ -0,0 +1,75 @@ | |||
#!/bin/bash | |||
|
|||
test_description='test removing old job records from the flux-accounting database' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think a test for the default (no args == 26 weeks) is missing? Maybe should stick one more test record in there for something > 6 months old.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also, probably should cover the path argument too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, thanks. I just pushed up a test that does both - specifying a path to the DB as well as the default for scrub-old-jobs
(which should remove a simulated job that "finished" over six months ago).
Problem: There exists no tests for the scrub-old-jobs command. Add some tests.
Problem: The flux-accounting guide does not make a note of the scrub-old-jobs command or running VACUUM on the database file if space needs to be reclaimed. Add a note to the flux-accounting guide about what the scrub-old-jobs command does and how to tune it. Make a note about connecting to the database file and running VACUUM if we want to clean up and optimize the space occupied by the file.
5f187ff
to
d8382fa
Compare
I've restructured this PR to instead add a new command |
@chu11 is this OK to go in? Sorry, I know you had already approved before I restructured this PR to use a separate command. |
oh yes, I think it's good to go! |
Thanks! Setting MWP here |
Problem
As it stands right now, the
jobs
table in the flux-accounting database grows without bound as more and more jobs finish running and are fetched by flux-accounting. However, at a certain point, job records become irrelevant for flux-accounting in terms of calculating job usage and fair-share values.This PR adds a new command to flux-accounting called
scrub-old-jobs
which takes an integer argument representing the number of weeks to go back. It will remove any job records older than the number of weeks passedin, e.g
flux account scrub-old-jobs 4
will remove any job older than 4 weeks old. If no argument is specified, the default will be 26 weeks (or 6 months).Some basic sharness tests are added that add some fake job records to the flux-accounting database and run the
scrub-old-jobs
command to test removing records.A note about this command and optimizing the memory occupied by the database file by running
VACUUM
is also added to the flux-accounting guide.