Cache terraform providers to reduce transient errors #41

Closed
chanwit opened this issue Jan 19, 2022 · 8 comments

@chanwit
Collaborator

chanwit commented Jan 19, 2022

No description provided.

@chanwit changed the title from "Cache terraform module to reduce transient errors" to "Cache terraform providers to reduce transient errors" on Jan 19, 2022
@chanwit added the kind/enhancement (New feature or request) label on Jan 23, 2022
@chanwit added this to the Q2-2022 milestone on Feb 15, 2022
@ekristen

ekristen commented Aug 3, 2022

@chanwit have you thought about solutions to this yet? This is definitely a tricky aspect. One thought I had was to develop a second "source" controller whose job is to run terraform init for each version of the source code reference. The tf-controller would then reference the result the same way the kustomize controller does when sourcing an artifact from the source controller.

@chanwit
Collaborator Author

chanwit commented Aug 4, 2022

I would tackle this problem using a simple image-based approach.
We had a plan for a command, for example tfctl image build, that lets users opt in to:

  • Their custom version of Terraform binary
  • Their pre-loaded set of providers

Another goal is to enable TF-controller in air-gapped environments.

@ekristen

What about a terraform source controller that takes the source artifact from the source controller, runs terraform using tfexec (so the version is selected automatically), grabs all the providers, and then creates a tar.gz like the source controller does? The tf-controller would then use that artifact.

@ekristen

@chanwit thinking about this more, I wonder whether, by using tfexec, including the latest patched terraform binaries (1.0.9, 1.1.9, 1.2.4) in the docker image, and making use of the terraform network mirror with an nginx config, we could pass-through cache the files to disk.

There's a terraform provider installation network mirror setting that can be dropped into the Terraform CLI configuration to signal that a mirror should be used. If the cluster is air-gapped, the mirror could be a pre-populated static website; if it is online, it could be a pass-through nginx cache, so all the providers would just get cached locally on the cluster.
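
For reference, a minimal sketch of what such a mirror setting could look like. As far as I know, the provider_installation block is read from the Terraform CLI configuration file (for example ~/.terraformrc, or whatever file TF_CLI_CONFIG_FILE points at). The mirror hostname below is hypothetical and only there for illustration.

```hcl
# Terraform CLI configuration file, not one of the module's .tf files.
provider_installation {
  # Hypothetical in-cluster mirror. Terraform expects an https URL
  # that ends with a trailing slash.
  network_mirror {
    url = "https://terraform-mirror.example.internal/providers/"
  }
}
```

With only a network_mirror method listed, Terraform should install every provider through the mirror and not fall back to the origin registry, which is the behavior you would want in the air-gapped case.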

@ekristen

The only catch is that terraform forces an https endpoint for network mirrors at the moment. It would be nice if http were supported; then it could just use a proxy in the cluster.

@lasomethingsomething
Contributor

Users found a way to add the provider files into the images and correctly configure TF to use the bundled providers, therefore we'll close.

@thejosephstevens

thejosephstevens commented Dec 6, 2024

edit: I figured out how to get it done, I'll make a docs PR as a walkthrough!


Did anyone get details on how the user accomplished this? (cc @lasomethingsomething ) I've been trying to do the same thing to cache providers in my runners because they're generating a ton of cost on NAT from pulling providers, but so far I'm struggling to find logs that tell me exactly what's happening (TF_LOG is unusable because tfexec blocks it, and I'm not seeing any way to increase logging through the controller). I just found the way to mount the cli config file through the terraform file, so I think it's being used now, but I'm not seeing any reduction in container network traffic.

For the same reason ekristen described above, it's a pain to set up a network mirror, but it's really easy to preload providers into the runner image (and that benefits from existing mechanisms for caching container images, including fast boots from pre-pulled images).

I'm going to start working on a PR to explicitly support this, but if anyone has details on the fix the user accomplished that would be really useful, because that's exactly the route I'm trying to use. My current config is this:

# Create a local cache dir
# This matches the path logic in the tf-runner Dockerfile
plugin_cache_dir   = "/home/runner/.terraform.d/plugin-cache"
# Disable reaching out to Hashicorp for upgrade and security checks
disable_checkpoint = true
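
For comparison, here is a sketch of an alternative CLI config that points Terraform at providers baked into the image through a filesystem mirror. This is an assumption about one way to do it, not necessarily what the other user did or what the docs PR describes. The mirror path is hypothetical, and the directory would have to be populated at image build time (for example with terraform providers mirror).

```hcl
# Terraform CLI configuration file sketch; the path below is hypothetical.
provider_installation {
  # Serve registry.terraform.io providers from a directory preloaded
  # into the runner image.
  filesystem_mirror {
    path    = "/home/runner/providers-mirror"
    include = ["registry.terraform.io/*/*"]
  }

  # Anything not matched above is still installed directly from its
  # origin registry.
  direct {
    exclude = ["registry.terraform.io/*/*"]
  }
}
```

As far as I understand it, plugin_cache_dir only skips the download of a provider package that is already cached, and Terraform still contacts the registry for version metadata. A filesystem mirror is treated as the installation source itself, so it should avoid registry traffic for the providers it covers.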

@thejosephstevens

Following up, I added documentation in a PR here!
