[INFRA-2821] Remove unused incrementals #2386

Closed
jenkins-infra-bot opened this issue Nov 19, 2020 · 25 comments

@jenkins-infra-bot

TBD whether we need to do this, but /incrementals repo uses 300GB, while actual /releases has just 220GB. If not now, then at some point in the not too distant future.

 


Originally reported by danielbeck, imported from: Remove unused incrementals
  • status: Open
  • priority: Minor
  • resolution: Unresolved
  • imported: 2022/01/10
@jenkins-infra-bot
Author

danielbeck:

Jesse Glick What criteria would be useful to determine which incrementals to delete? Artifactory offers the following in metadata:

  • Created date
  • Number of downloads
  • Date of last download

The trivial one would be number of downloads = 0, but we may have scrapers that make this somewhat meaningless.

@jenkins-infra-bot
Author

jglick:

From my PoV anything older than, say, a year can probably be deleted. Simple enough. In the unlikely event you still have an active PR referring to such an old version as a dependency which you are resurrecting work on, you should just switch the dep to a newer version (release or incremental, depending on whether or not the upstream PR has been released yet). You can also recreate the old incremental version locally (for non-CI exploration) via

cd …/upstream
git checkout $commithash
mvn -Dset.changelist -DskipTests install

@jenkins-infra-bot
Author

jglick:

I should prioritize JENKINS-50804 though.

@jglick

jglick commented Jun 27, 2022

prioritize JENKINS-50804

Long since done, so as far as I am concerned there should be no obstacle to implementing a simple time-based expiry if there is any impetus to save disk space.

@dduportal dduportal added this to the infra-team-sync-2025-02-11 milestone Feb 7, 2025
@dduportal dduportal self-assigned this Feb 7, 2025
@dduportal
Contributor

Putting this issue again on the Artifactory "2025 cleanup" EPIC (tbd by @MarkEWaite).

@darinpope is leading this subject. I've invited him as "triage" in the helpdesk repository so we can assign him the issue (and the audit will be followed here).

Ref. https://groups.google.com/g/jenkinsci-dev/c/6_X0ABdLAgE/m/3XwzSn1xEAAJ

@darinpope
Collaborator

I did a quick one-off to verify my delete generator works as expected:

➜ jf rt del incrementals/org/jenkins-ci/main/jenkins-core/2.118-rc15700.24aa7b764a86 --url $ARTIFACTORY_URL --access-token $ARTIFACTORY_ACCESS_TOKEN --quiet   
07:58:09 [🔵Info] Searching artifacts...
07:58:09 [🔵Info] Found 1 artifact.
07:58:09 [🔵Info] [Thread 2] Deleting incrementals/org/jenkins-ci/main/jenkins-core/2.118-rc15700.24aa7b764a86/
{
  "status": "success",
  "totals": {
    "success": 1,
    "failure": 0
  }
}
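
The "delete generator" itself is not shown in this thread. As a rough sketch only, a generator that emits one command per path in the shape of the one-off above could look like the following. The function name `build_delete_command` is hypothetical; the `jf rt del` flags are copied from the command above and the credentials are left as shell variables to be expanded at run time.

```python
import shlex


def build_delete_command(path: str) -> str:
    """Return one `jf rt del` command line for a single Artifactory path.

    Hypothetical helper: the real generator used in this thread is not shown.
    Only the path is shell-quoted; the credential variables are left for the
    shell to expand when the command is run.
    """
    return (
        f"jf rt del {shlex.quote(path)} "
        '--url "$ARTIFACTORY_URL" '
        '--access-token "$ARTIFACTORY_ACCESS_TOKEN" '
        "--quiet"
    )


if __name__ == "__main__":
    # Path taken from the one-off shown above.
    paths = [
        "incrementals/org/jenkins-ci/main/jenkins-core/2.118-rc15700.24aa7b764a86",
    ]
    for p in paths:
        print(build_delete_command(p))
```

Printing the commands rather than executing them leaves room to review the list before anything is deleted.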

@darinpope
Collaborator

darinpope commented Feb 7, 2025

The current general plan is to delete anything in incrementals that is older than 28 days. Any objections to 28 days or do we want to go longer, e.g. 60, 90, xyz?

After we get everything cleaned up, we'll look at setting a lifecycle, i.e. delete anything older than 28 days (or whatever), on the repo.

@timja
Member

timja commented Feb 7, 2025

The current general plan is to delete anything in incrementals that is older than 28 days. Any objections to 28 days or do we want to go longer, e.g. 60, 90, xyz?

The thing about incrementals is that they are release candidate versions.

People can use them for quite a while as they are waiting for a fix to be released. For example, they are supported in Docker images via plugin-manager-cli.

With JEP-229 it's gotten a lot better and it's used less than it was for that, but I don't know if a blanket time-based policy is enough.

Can we do time based and usage? e.g. if 0 downloads then delete after 28 days.

@MarkEWaite

Can we do time based and usage? e.g. if 0 downloads then delete after 28 days.

I don't object to the rule "if 0 downloads then delete after 28 days" so long as we also have "delete if older than 90 days". I think that one download of an incremental is not enough to justify retaining it forever.
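
The combined rule proposed here can be written as a single predicate. This is a minimal sketch, assuming each artifact exposes a creation timestamp and a download count (the metadata Artifactory was said to offer earlier in the thread); the names `should_delete`, `SHORT_GRACE`, and `LONG_GRACE` are illustrative, not part of any tool.

```python
from datetime import datetime, timedelta, timezone

# Thresholds taken from the discussion; both are open to re-evaluation.
SHORT_GRACE = timedelta(days=28)  # applies only to never-downloaded artifacts
LONG_GRACE = timedelta(days=90)   # applies to everything


def should_delete(created: datetime, downloads: int, now: datetime) -> bool:
    """Return True if an incremental is eligible for deletion.

    Rule: older than 90 days, OR (zero downloads AND older than 28 days).
    """
    age = now - created
    if age > LONG_GRACE:
        return True
    return downloads == 0 and age > SHORT_GRACE


if __name__ == "__main__":
    now = datetime(2025, 2, 7, tzinfo=timezone.utc)
    # 30 days old, never downloaded: eligible under the short rule.
    print(should_delete(now - timedelta(days=30), 0, now))
    # 30 days old, downloaded once: kept until it passes 90 days.
    print(should_delete(now - timedelta(days=30), 1, now))
```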

@jglick

jglick commented Feb 7, 2025

People can use them for quite a while as they are waiting for a fix to be released.

This is intended to be used only for validating an open PR, for example one purporting to fix a bug that does not have clear steps to reproduce. A production controller should not be running an incremental plugin version indefinitely. We also no longer publish incremental releases from branches, only PRs, and if a PR you depend on critically has not been touched in months, it is time to adopt the plugin.

I think a simple time-based policy is fine, though a month seems a bit aggressive. If we just want to reduce 90% of the repo size (or whatever), we could give a longer grace period and see where that gets us.

@darinpope
Collaborator

I can do and/ors in the AQL to cover (downloads = 0 and exceeds short time) or (exceeds long time), but it sounds like just "exceeds long time" might be a reasonable starting point.

Is everyone ok-ish with 90 days instead of 28 and then re-evaluate later?
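
For reference, the and/or form mentioned above might be assembled roughly as follows. This is a sketch, not the query actually run in this thread: the helper name `build_aql` is hypothetical, and it assumes AQL's `$or`/`$and` operators, the relative-time `$before` comparison, and the `stat.downloads` field. Whether never-downloaded items match `$eq: 0` or need a null comparison should be checked against a real instance.

```python
import json


def build_aql(repo: str = "incrementals",
              short_age: str = "28d",
              long_age: str = "90d") -> str:
    """Build an AQL items.find() query string (illustrative only).

    Matches items that are (never downloaded AND older than short_age)
    OR (older than long_age).
    """
    criteria = {
        "repo": repo,
        "$or": [
            {
                "$and": [
                    # Assumption: zero downloads is stored as 0, not null.
                    {"stat.downloads": {"$eq": 0}},
                    {"created": {"$before": short_age}},
                ]
            },
            {"created": {"$before": long_age}},
        ],
    }
    return f"items.find({json.dumps(criteria)})"


if __name__ == "__main__":
    print(build_aql())
```

Dropping the first `$or` branch leaves the plain "older than the long period" query, which is the simpler starting point discussed here.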

@MarkEWaite

I am OK with 90 days, then we re-evaluate

@dduportal
Contributor

90 days sounds like a really good start!

@timja
Member

timja commented Feb 7, 2025

Yeah 90 days should be fine.

@darinpope
Collaborator

darinpope commented Feb 7, 2025

ok, I'll go with 90 days. Getting ready to make a pass at:

  • incrementals/org/jenkins-ci/main/jenkins-war
  • incrementals/org/jenkins-ci/main/jenkins-core/

FWIW, the last item outside of 90 days is:

  • incrementals/org/jenkins-ci/main/jenkins-core/2.479.2-rc35417.cd4cf63b_9e00
  • incrementals/org/jenkins-ci/main/jenkins-war/2.479.2-rc35417.cd4cf63b_9e00

going back to

  • incrementals/org/jenkins-ci/main/jenkins-core/2.118-rc15700.24aa7b764a86
  • incrementals/org/jenkins-ci/main/jenkins-war/2.118-rc15700.24aa7b764a86

@darinpope
Collaborator

delete of jenkins-core files ended around 2:35p Central. Total directories deleted = 8954

@darinpope
Collaborator

delete of jenkins-war files ended around 4:45p Central (2h4m) Total directories deleted = 8954

@darinpope
Collaborator

darinpope commented Feb 7, 2025

so at this point (round numbers) we're down to ~765GB from 1.7TB.

I focused just on those 2 folders. Now I'll go broad in incrementals and see how many more we can get rid of.

Image

@dduportal dduportal removed this from the infra-team-sync-2025-02-11 milestone Feb 11, 2025
@dduportal dduportal added this to the infra-team-sync-2025-02-18 milestone Feb 11, 2025
@darinpope
Collaborator

I haven't gone dark on this. I've been slowly but surely dumping the "older than 90 days" from incrementals. There are over 500k files that are eligible for deletion, so it's taking some time to go through them.

I'm doing it 50k at a time as recommended from some JFrog documentation. On my connection, it takes about 8 hours for 50k to process, so I'm getting about 100k deleted each day.
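
The batching described above can be sketched as a simple chunking helper plus a rough duration estimate. The names `batches` and `estimate_days` are illustrative, the 50k batch size and the roughly 100k-per-day throughput come from this comment, and the actual tooling used is not shown in the thread.

```python
import math
from itertools import islice
from typing import Iterable, Iterator, List


def batches(paths: Iterable[str], size: int = 50_000) -> Iterator[List[str]]:
    """Yield successive lists of at most `size` paths."""
    it = iter(paths)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def estimate_days(total_files: int, files_per_day: int = 100_000) -> int:
    """Rough number of days needed at the observed daily throughput."""
    return math.ceil(total_files / files_per_day)


if __name__ == "__main__":
    # "Over 500k" eligible files at ~100k per day means at least this many days.
    print(estimate_days(500_000))
    # Small demonstration of the chunking with a batch size of 3.
    for chunk in batches((f"path-{i}" for i in range(7)), size=3):
        print(len(chunk), chunk[0])
```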

@darinpope
Collaborator

After quite a few days, we've got incrementals caught up to only have files that were created within the last 90 days. As of Sunday, 2025-02-16, we're down to 52,735 files using 81.24GB, down from the grand total of 1.7TB.

Image
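
As a quick check on the figures quoted above, the relative reduction can be computed directly. This assumes decimal units (1 TB = 1000 GB), which the comment does not state; with binary units the result shifts only slightly. The function name `reduction_percent` is illustrative.

```python
def reduction_percent(before_gb: float, after_gb: float) -> float:
    """Percentage of storage freed going from before_gb to after_gb."""
    return (1 - after_gb / before_gb) * 100


if __name__ == "__main__":
    before_gb = 1.7 * 1000  # 1.7TB, assuming 1 TB = 1000 GB
    after_gb = 81.24        # figure reported on 2025-02-16
    print(f"{reduction_percent(before_gb, after_gb):.1f}% reduction")
```

Under that assumption the cleanup freed roughly 95% of the space the repository was using.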

@timja
Member

timja commented Feb 16, 2025

Sounds reasonable, anything else to do before closing this?

@darinpope
Collaborator

Good question. Should we use this same issue to set up a scheduled job to keep incrementals clean, or should we set up a different issue?

@timja
Member

timja commented Feb 16, 2025

I would have a preference for scoped smaller issues that are clear when complete, i.e. a new one.

@dduportal
Contributor

+1 with @timja: can we make a new issue for the GC?

@dduportal
Contributor

Next steps in #4533
