Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clean docker image cache #1636

Merged
merged 178 commits into from
Apr 17, 2024
Merged

Conversation

wind57
Copy link
Contributor

@wind57 wind57 commented Apr 12, 2024

No description provided.

wind57 and others added 30 commits December 4, 2021 07:59
@@ -0,0 +1,7 @@
busybox:1.35
Copy link
Contributor Author

@wind57 wind57 Apr 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what I have realized is that we download docker images specific to integration tests all the time. I am proposing to cache them via github actions instead.

In order to do that, we are going to track what images we use and their versions in this file. It is used in the pipeline as the key of the cache : docker-images-cache-${{ hashFiles('**/current-images.txt') }}.

If this file changed, we will re-download them; it if stays the same and such a cache is present, it will download from the cache.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would recommend unit testing this somehow. Possibly you can make a test that dumps the images from k3s ctl and fails if it has something that isn't in here or visa versa? Just a thought, as this is very likely to drift.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

another way could be to make this like DockerImages.go, and give it a main function that prints out a list like this. Then, as long as you checked out the source first and installed go, you could run that main and also have no drift.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is just a helper, really. It is supposed to help with the times we run integration tests (that we plan to reduce in the long run anyway). So even if an image is not downloaded from the cache, it is still going to be pulled in the test. I, personally, would not test this "too much" and a few runs that I had proved that it works exactly as intended.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

p.s. I meant DockerImages.java ;) ok you can also solve this by making a comment in the java code that defines the istio versions saying "// please update current-images.txt when you update this"

because at the end of the day, it is about drift making the infrastructure added here serve no value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you look closely into Commons.java it does not have any versions anymore. That is why I introduced Images.java - that is the sole entry point that deals with versions of images that we use in integration tests. That is also the reason why deployment manifests don't have versions, I read them from that current-images.txt

Copy link
Contributor Author

@wind57 wind57 Apr 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

neither does Fabric8IstioIT : no version there either. Even better, we do not need pilot and proxyv2 at all, so I dropped those. All I need is your PR to be merged first :)

P.S. I doubt that I want to wait until that PR in test-containers is reviewed and released, so I am inclined to suggest merge/review this one the way it is. When that PR sees day light, I will go for another round of minor refactor.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👌

path: /tmp/docker/images
key: docker-images-cache-${{ hashFiles('**/current-images.txt') }}

- name: pull docker images
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if cache is not present, issue docker pull and docker save into /tmp/docker/images.

It also replaces the / with - in the docker image, otherwise it can't be saved

@@ -13,7 +13,7 @@ spec:
spec:
containers:
- name: service-wiremock
image: wiremock/wiremock:3.4.2
image: wiremock/wiremock
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't provide versions here, but read what these versions are from current-images.txt using a newly introduced class : Images.

/**
* @author wind57
*/
public final class Images {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this class is the sole entry point when we need to find a certain version of an image, or load a container into kind

Optional<String> found = Arrays.stream(tars).map(File::getName).filter(tarName::contains).findFirst();
if (found.isPresent()) {
LOG.info("running in github actions, will load from : " + Commons.TMP_IMAGES);
Commons.loadImageFromPath(found.get(), container);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this Commons.loadImageFromPath(found.get(), container); will only happen when running inside github actions, because we create a path /tmp/docker/images with all the tar archives in there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my 2p is to make the GH workflow do this, as it is distracting in the normal code. In my go code, I only pull if the image doesn't already exist (or it is a locally built one). So, if you can have in the GH action the images safely restored from tar, then "pull if not exists" here for the non-test/tagged images will be simpler code. Also, it centralizes the responsibility vs picking a magic tar directory. When I look at other code that uses k3s or testcontainers it is a lot simpler than the code here, and I think we need to keep in mind how difficult this is vs the intent of testcontainers (just use it sort of thing)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me see if I got your point correct. You are suggesting to do the "docker load" part in the GH action itself, right? If so, good point, I did not think about it this way

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah in fact when I have something complicated for GH action, I put it into a shell script and have the GH action call that. This way it is easier to ad-hoc test. I have it on my TODO to look at https://dagger.io/ instead of this approach

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took a closer look, and I don't think that your suggestion is possible. At this point, there is no instance of k3s starter, so no way to import an image via ctr. I think for such a thing to happen, we would need to re-think the way we do it.

I am really interested how you are doing it to be honest and where I can take a look.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And sorry it took a while to respond, I had to recollect why it is like this, and create that PR. Your comments triggered that, so thank you :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for raising it. Most of the time testcontainers-java is ahead of go, and this is a rare place where we need to catch up the main one. 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

neah, thank you actually. I never saw the problem from this angle until you came here, so all the credits go to you. kudos!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to be clear when we run the tests locally (or internally on Jenkins) we will still pull the images like we did before because this cache will never exist?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's correct. it's like running the test locally, without having the images already pulled

@@ -22,6 +22,9 @@ runs:
- name: cache local maven repository
uses: ./.github/workflows/composites/cache

- name: restore common images
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a new stage that basically does this:

  • if a cache exists with key docker-images-cache-${{ hashFiles('**/current-images.txt') }} use it and restore to /tmp/docker/images
  • if such a cache does not exist, read what is in current-images.txt line by line. For each of those lines, do a docker pull then a docker save to /tmp/docker/images
  • if the above step happened, save to cache the directory /tmp/docker/images

@@ -73,6 +74,8 @@ static void beforeAll() throws Exception {
Commons.validateImage(IMAGE_NAME, K3S);
Commons.loadSpringCloudKubernetesImage(IMAGE_NAME, K3S);

Images.loadBusybox(K3S);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

load from either /tmp/docker/images (if it runs on github) or do a docker pull/docker save/docker load, just like we used to do until now

@wind57 wind57 marked this pull request as ready for review April 13, 2024 11:17
@wind57
Copy link
Contributor Author

wind57 commented Apr 13, 2024

@ryanjbaxter this is ready to be looked at, but the other PR from @codefromthecrypt should probably go in first. Thank you

@wind57 wind57 marked this pull request as draft April 15, 2024 06:02
@wind57 wind57 marked this pull request as ready for review April 17, 2024 13:48
@wind57 wind57 requested a review from ryanjbaxter April 17, 2024 13:48
@ryanjbaxter ryanjbaxter added this to the 3.1.2 milestone Apr 17, 2024
@ryanjbaxter ryanjbaxter merged commit 013b920 into spring-cloud:main Apr 17, 2024
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
No open projects
Status: Done
Development

Successfully merging this pull request may close these issues.

4 participants