
Introduce CI for AWS - part 1 #2274

Open · wants to merge 6 commits into base: main

Conversation

@wainersm (Member) commented on Feb 3, 2025

This is just part 1 of a series of commits to run the e2e tests nightly for AWS too.

As it stands, the tests are executed but some fail. In particular, TestAwsCreatePeerPodWithLargeImage fails so badly that the job gets cancelled. The good news is that the simple pod tests at least pass. Here is an execution on my fork: https://github.com/wainersm/cc-cloud-api-adaptor/actions/runs/13122445400

Below is a list of things I still have to work on before it can run alongside our CI (at this point the job will be skipped, because I won't configure the AWS credentials on this repo yet). Nevertheless, I'd like to have this part merged because there are other AWS-unrelated changes I plan to submit to the workflows, and I want to avoid having to keep rebasing and resolving conflicts in my fork.
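
For context on the skipping behavior, here is a minimal sketch of how the job could detect missing credentials and bypass the e2e run; the AWS_ACCESS_KEY_ID secret name and the step wiring are my own illustration, not necessarily what this PR does:

```yaml
# Hypothetical guard: skip the AWS e2e run when the repository has no AWS
# credentials configured, instead of failing the nightly job.
- name: Check for AWS credentials
  id: creds
  env:
    AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
  run: |
    if [ -z "$AWS_ACCESS_KEY_ID" ]; then
      echo "available=false" >> "$GITHUB_OUTPUT"
    else
      echo "available=true" >> "$GITHUB_OUTPUT"
    fi

- name: Run e2e tests
  if: steps.creds.outputs.available == 'true'
  run: make test-e2e
```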

What's next:

  • backup code to delete the created AWS resources, because on some occasions the deletion code of the e2e framework doesn't run; for example, the TestAwsCreatePeerPodWithLargeImage failure I mentioned above causes that problem.
  • deal with the failing tests: either disable or fix them.
  • run with CRI-O
  • make it more resilient. For example, sometimes the VPC is created in an Availability Zone where the default podvm instance type isn't available, so all tests fail (see the sketch after this list).
  • [updated] add a debug step
  • [updated] make it work with mkosi podvm images
  • [updated] move common code to scripts
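
On the resilience point, a minimal sketch of how the workflow could first check which Availability Zones actually offer the podvm instance type before creating the VPC; the step name, the t3.small value, and the AWS CLI usage are my own illustration, not code from this PR:

```yaml
# Hypothetical step: list the Availability Zones that offer the podvm instance
# type, so the VPC/subnet can be created in one of them rather than a random AZ.
- name: Find AZs offering the podvm instance type
  env:
    PODVM_INSTANCE_TYPE: t3.small  # assumed default; the real value may differ
  run: |
    aws ec2 describe-instance-type-offerings \
      --location-type availability-zone \
      --filters "Name=instance-type,Values=${PODVM_INSTANCE_TYPE}" \
      --query 'InstanceTypeOfferings[].Location' \
      --output text
```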

@wainersm added the CI (Issues related to CI workflows) and provider/aws (Issues related to AWS CAA provider) labels on Feb 3, 2025
@wainersm requested a review from a team as a code owner on February 3, 2025, 22:07
@stevenhorsman (Member) left a comment

Not necessarily for this PR, but I'm wondering if we can separate some of the steps into scripts given that lots of this is duplicated with the other providers. I know we've discussed it before, but I can't recall if we deliberately rejected it, or not

name: aws
if: |
  github.event_name == 'workflow_dispatch'
needs: [podvm, image, prep_install]
Member

Does AWS only work with the packer build, or also mkosi?

Member Author

Oh, I forgot to mention it in the description... I couldn't make it work with mkosi, and it's on my list to debug. The workflow supports both packer and mkosi images, though.
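
For reference, a minimal sketch of how that selection could be exposed as a callable-workflow input; the podvm_image_type input name and its values are my own illustration, not necessarily the names used in this PR:

```yaml
# Hypothetical interface: let the caller pick between packer and mkosi podvm
# images, defaulting to packer.
on:
  workflow_call:
    inputs:
      podvm_image_type:
        description: "Type of podvm image to test (packer or mkosi)"
        type: string
        default: packer
```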

export TEST_PODVM_IMAGE="${{ env.PODVM_QCOW2 }}"
export TEST_E2E_TIMEOUT="90m"

make test-e2e
Member

Can we get any debug logs in case things go wrong?

Member Author

Good idea, I overlooked it completely. Hopefully in a follow-up PR.
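
As a rough shape for that follow-up, a minimal sketch of debug steps that only run when the e2e tests fail; the commands, resource names, and artifact name here are my own illustration:

```yaml
# Hypothetical debug steps: collect cluster state only when a previous step failed.
- name: Collect debug logs
  if: failure()
  run: |
    kubectl get pods -A -o wide | tee debug-pods.txt
    kubectl describe nodes | tee debug-nodes.txt
    # assumed CAA namespace/daemonset names; adjust to the actual deployment
    kubectl logs -n confidential-containers-system ds/cloud-api-adaptor-daemonset | tee caa.log || true

- name: Upload debug artifacts
  if: failure()
  uses: actions/upload-artifact@v4
  with:
    name: e2e-aws-debug-logs
    path: |
      debug-pods.txt
      debug-nodes.txt
      caa.log
```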

@wainersm (Member Author) commented on Feb 4, 2025

Hi @stevenhorsman !

> Not necessarily for this PR, but I'm wondering if we can separate some of the steps into scripts given that lots of this is duplicated with the other providers. I know we've discussed it before, but I can't recall if we deliberately rejected it, or not

IMO, we can and must separate them into scripts: share common code and avoid the nasty pull_request_target limitation. I can give it a try in a follow-up PR.
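
A minimal sketch of what that extraction could look like; the hack/e2e-run.sh path and the step wiring are my own illustration, not code from this PR:

```yaml
# Hypothetical: the provider-specific workflow only exports its variables and
# calls a script shared by all providers, instead of duplicating the steps inline.
- name: Run provider e2e tests
  env:
    CLOUD_PROVIDER: aws
    TEST_PODVM_IMAGE: ${{ env.PODVM_QCOW2 }}
    TEST_E2E_TIMEOUT: 90m
  run: ./hack/e2e-run.sh
```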

Created a callable workflow for running the AWS e2e tests. This initial
implementation supports testing mkosi- or packer-based images, the latter
being the default.

The cluster_type only supports an "onprem" cluster, and the workflow will
create a kcli-based kubeadm one.

Signed-off-by: Wainer dos Santos Moschetta <[email protected]>

The newly created e2e_aws workflow is called by e2e_run_all, so the AWS e2e
tests will run nightly. At this point it won't be triggered by pull requests.

It's testing the packer-based podvm images.

Signed-off-by: Wainer dos Santos Moschetta <[email protected]>

Tag all the created AWS resources with "Name" to help track and remove
them all, mainly when running on CI.

In order to tag images I had to bump github.com/aws/aws-sdk-go-v2/service/ec2,
which cascaded into updating other modules.

Signed-off-by: Wainer dos Santos Moschetta <[email protected]>

So as to generate unique names and avoid clashes with published images.

Signed-off-by: Wainer dos Santos Moschetta <[email protected]>

So that on VPC teardown (if enabled) the created AMI will be deleted along with
its corresponding EBS snapshot.

Signed-off-by: Wainer dos Santos Moschetta <[email protected]>

Delete the key from the bucket that contains the raw disk image.

Signed-off-by: Wainer dos Santos Moschetta <[email protected]>
@wainersm (Member Author) commented on Feb 4, 2025

Updated with fixes for the workflow lint errors.
