Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add pod identity credential integration test #450

Merged
merged 5 commits into from
Jan 24, 2025
Merged

Conversation

zhihonl
Copy link
Contributor

@zhihonl zhihonl commented Jan 16, 2025

Description of the issue

Pod identity credential support is added to CloudWatch Agent and FluentBit, however we lack integration test.

Description of changes

  • Add new terraform folder for credential related testing
  • Add new test case when deployment strategy is pod identity. This mode only checks for basic Container Insight metrics that are present on any EKS cluster.

License

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Tests

@zhihonl zhihonl requested a review from a team as a code owner January 16, 2025 16:23
@@ -145,11 +145,30 @@ func validateMetricsAvailability(dims string, expected []string, actual map[stri
func compareMetrics(expected []string, actual map[string][][]types.Dimension) bool {
if len(expected) != len(actual) {
log.Printf("the count of fetched metrics do not match with expected count: expected-%v, actual-%v\n", len(expected), len(actual))

expectedSet := make(map[string]struct{})
Copy link
Contributor Author

@zhihonl zhihonl Jan 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is added for debugging purpose and I think would be nice to keep. Currently if a metric_value_benchmark test fail for EKS scenario, it just tells us what tests failed but not about which metrics were missing according to the expected output.

@@ -31,7 +32,9 @@ type EKSDaemonTestRunner struct {

func (e *EKSDaemonTestRunner) Validate() status.TestGroupResult {
var testResults []status.TestResult
testResults = append(testResults, metric.ValidateMetrics(e.env, "", eks_resources.GetExpectedDimsToMetrics(e.env))...)
if e.env.EksDeploymentStrategy != eksdeploymenttype.PODIDENTITY {
Copy link
Contributor Author

@zhihonl zhihonl Jan 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For more context, Container Insights provides different set of metrics based on what's present on the cluster. When testing Pod Identity I see a different set of metrics from this list:

var ExpectedDimsToMetrics = map[string][]string{

So instead of creating a new set of expected metrics list that can be hundreds of lines long, this solution is simpler since we are just looking for a sanity check to see if some metrics are present which means the credential is working

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For more context, Container Insights provides different set of metrics based on what's present on the cluster. When testing Pod Identity I see a different set of metrics from this list:

Is that a problem? What is the diff?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a problem, CI only shows certain metrics under certain situations. For example some error metrics are only shown when the failure case occur for pods. iirc for pod identity I saw additional API server related metrics being there, I'm guessing it's because of API interactions from pod identity addon.

varunch77
varunch77 previously approved these changes Jan 21, 2025
REPLICA EKSDeploymentType = "REPLICA"
SIDECAR EKSDeploymentType = "SIDECAR"
STATEFUL EKSDeploymentType = "STATEFUL"
PODIDENTITY EKSDeploymentType = "PODIDENTITY"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems kinda interesting to put PodIdentity here...it's not necessarily a deployment type but a credential provider. Technically we can have podidentity enabled with any of these deployment options right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. I added here just as an easy way to have a different way of running EKS tests with basics without adding another variable in the class which felt redundant.

// Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
// SPDX-License-Identifier: MIT

module "common" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just curious -- is there a way we can re-use existing terraform modules from our other eks tests? So that we aren't adding all this terraform code for every new eks test we add

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think should be possible but probably need to refactor some code. Don't think we should do it in this PR since that adds additional code that's not necessarily related to the goal of this PR.

type = "ingress"
}

resource "null_resource" "clone_helm_chart" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason we are using helm over the eks addon?

Wondering what it is this integ test is testing -- I would assume we would want to install the addon + pod identity enabled + the release artifact of the agent = verify functionality works / calls to cw work

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used helm chart for easier testing. For example, I can check out a new branch from helm chart repo with custom changes, then run terraform on my local with the said changes. Using add-on have the same effect but will be harder to test custom changes.


provisioner "local-exec" {
command = <<-EOT
echo "Validating CloudWatch Agent and FluentBit with pod identity credential"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

technically...we are testing against cloudwatch agent, not fluentbit. Or are we pulling the latest fluentbit version in this test?

And if we are...I don't think we should block our agent release if fluentbit fails.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in new commit

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://github.com/aws/amazon-cloudwatch-agent/actions/runs/12952146544 new integration test run. random surprise that all EKS tests passing

Alt Text

@@ -31,7 +32,9 @@ type EKSDaemonTestRunner struct {

func (e *EKSDaemonTestRunner) Validate() status.TestGroupResult {
var testResults []status.TestResult
testResults = append(testResults, metric.ValidateMetrics(e.env, "", eks_resources.GetExpectedDimsToMetrics(e.env))...)
if e.env.EksDeploymentStrategy != eksdeploymenttype.PODIDENTITY {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For more context, Container Insights provides different set of metrics based on what's present on the cluster. When testing Pod Identity I see a different set of metrics from this list:

Is that a problem? What is the diff?

@zhihonl zhihonl merged commit 70baaf0 into main Jan 24, 2025
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants