EKS auto mode with ComputeConfig set yields terraform error #1597

Closed
baughj opened this issue Jan 24, 2025 · 8 comments · Fixed by #1603
Labels
kind/bug Some behavior is incorrect or out of spec resolution/fixed This issue was fixed

Comments

@baughj

baughj commented Jan 24, 2025

What happened?

Attempting to set ComputeConfig within auto mode configuration to enable a custom node role yields (what appears to be) a Terraform error on pulumi up:

Diagnostics:
  aws:eks:Cluster (eks-cluster-eksCluster):
    error:   sdk-v2/provider2.go:515: sdk.helper_schema: compute_config.enabled, kubernetes_networking_config.elastic_load_balancing.enabled, and storage_config.block_storage.enabled must all be set to either true or false: provider=aws@6.67.0
    error: Preview failed: diffing urn:pulumi:dev::eks-repro::eks:index:Cluster$aws:eks/cluster:Cluster::eks-cluster-eksCluster: 1 error occurred:
        * compute_config.enabled, kubernetes_networking_config.elastic_load_balancing.enabled, and storage_config.block_storage.enabled must all be set to either true or false

Example

This is the smallest repro that will yield the error:

using Pulumi;
using Pulumi.Aws.Iam;
using Pulumi.Eks;
using System.Collections.Generic;
using Pulumi.Eks.Inputs;

return await Deployment.RunAsync(() =>
{

    var assumePolicy =
        """
        {
          "Version": "2012-10-17",
          "Statement": [ 
            { 
              "Effect": "Allow",
              "Principal": { 
                "Service": "ec2.amazonaws.com"
              },
              "Action": "sts:AssumeRole"
             }
             ]
        }
        """;

    var nodeRole = new Role("eks-node-role", new RoleArgs
    {
        Name = "eks-node-role",
        AssumeRolePolicy = assumePolicy,
        Description = "EKS node role",
        ManagedPolicyArns =
        [
            "arn:aws:iam::aws:policy/AmazonEKSWorkerNodeMinimalPolicy",
            "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryPullOnly"
        ]
    });

    var cluster = new Cluster("eks-cluster", new ClusterArgs
    {
        AuthenticationMode = AuthenticationMode.Api,
        AutoMode = new AutoModeOptionsArgs
        {
            Enabled = true,
            CreateNodeRole = false,
            ComputeConfig = new ClusterComputeConfigArgs { NodeRoleArn = nodeRole.Arn }
        }
    });
});

On pulumi up, this gives the error above.

Output of pulumi about

running 'dotnet build -nologo .'
  Determining projects to restore...

  All projects are up-to-date for restore.

  eks-repro -> /repos/eks-repro/bin/Debug/net8.0/eks-repro.dll



Build succeeded.

    0 Warning(s)
    0 Error(s)


Time Elapsed 00:00:02.80

'dotnet build -nologo .' completed successfully
CLI
Version      3.132.0
Go Version   go1.23.1
Go Compiler  gc

Plugins
KIND      NAME        VERSION
resource  aws         6.67.0
language  dotnet      unknown
resource  eks         3.8.0
resource  kubernetes  4.21.0

Host
OS       ubuntu
Version  22.04
Arch     x86_64

This project is written in dotnet: executable='/usr/bin/dotnet' version='8.0.112'

Current Stack: <redacted>

Found no resources associated with dev

Found no pending operations associated with dev

Backend
Name           pulumi.com
URL            <redacted>
User           <redacted>
Organizations  <redacted>
Token type     personal

Dependencies:
NAME        VERSION
Pulumi      3.71.1
Pulumi.Aws  6.67.0
Pulumi.Eks  3.8.0

@baughj baughj added kind/bug Some behavior is incorrect or out of spec needs-triage Needs attention from the triage team labels Jan 24, 2025
@smithrobs

I can repro this: pulumi preview fails with the error. If the preview is skipped, à la pulumi up -yf, the cluster is created. I assume this is some sort of read issue that affects diffing, possibly related to #1585.

@flostadler
Contributor

I can also repro in TS:

import * as pulumi from "@pulumi/pulumi";
import * as awsx from "@pulumi/awsx";
import * as eks from "@pulumi/eks";
import * as aws from "@pulumi/aws";

// Grab some values from the Pulumi configuration (or use default values)
const config = new pulumi.Config();
const vpcNetworkCidr = config.get("vpcNetworkCidr") || "10.0.0.0/16";

// Create a new VPC
const eksVpc = new awsx.ec2.Vpc("eks-vpc", {
    enableDnsHostnames: true,
    cidrBlock: vpcNetworkCidr,
    numberOfAvailabilityZones: 2,
    subnetStrategy: "Auto",
});

const nodeRole = new aws.iam.Role("eks-node-role", {
    assumeRolePolicy: aws.iam.getPolicyDocumentOutput({
        version: "2012-10-17",
        statements: [{
            effect: "Allow",
            principals: [{
                type: "Service",
                identifiers: ["ec2.amazonaws.com"]
            }],
            actions: ["sts:AssumeRole", "sts:TagSession"]
        }]
    }).json
});

const attachments = [
    new aws.iam.RolePolicyAttachment("eks-node-role-policy-worker-node-minimal", {
        role: nodeRole,
        policyArn: "arn:aws:iam::aws:policy/AmazonEKSWorkerNodeMinimalPolicy",
    }),
    new aws.iam.RolePolicyAttachment("eks-node-role-policy-ecr-pull", {
        role: nodeRole,
        policyArn: "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryPullOnly",
    }),
];

// Create the EKS cluster
const eksCluster = new eks.Cluster("eks-cluster", {
    vpcId: eksVpc.vpcId,
    authenticationMode: eks.AuthenticationMode.Api,
    publicSubnetIds: eksVpc.publicSubnetIds,
    privateSubnetIds: eksVpc.privateSubnetIds,
    autoMode: {
        enabled: true,
        createNodeRole: false,
        computeConfig: {
            nodeRoleArn: nodeRole.arn,
        }
    }
}, { dependsOn: [...attachments] });

// Export some values for use elsewhere
export const kubeconfig = eksCluster.kubeconfig;
export const clusterName = eksCluster.eksCluster.name;

I'm currently digging into it, but while debugging I see that we're setting all necessary options correctly. @smithrobs is right, this is related to #1585, which is caused by an issue in the upstream Terraform provider (hashicorp/terraform-provider-aws#40582). Right now the logic for enabling/disabling auto mode is quite broken. We'll check whether we can improve this on the Pulumi side or whether fixing it upstream is the only option.

@flostadler
Contributor

The problem seems to be related to the computeConfig property of the underlying aws.eks.Cluster resource being unknown at preview time. This is expected, because the nodeRoleArn will be unknown until the role is created. Trying to figure out now why this trips up the upstream provider. It uses a diffCustomizer for the EKS cluster resource, so there might be a bug in there.

GRPC log:

{
    "method": "/pulumirpc.ResourceProvider/Create",
    "request": {
        "urn": "urn:pulumi:auto-mode-role::auto-mode-role::eks:index:Cluster$aws:eks/cluster:Cluster::eks-cluster-eksCluster",
        "properties": {
            "__defaults": [
                "name"
            ],
            "accessConfig": {
                "__defaults": [],
                "authenticationMode": "API",
                "bootstrapClusterCreatorAdminPermissions": true
            },
            "bootstrapSelfManagedAddons": false,
            "computeConfig": "04da6b54-80e4-46f7-96ec-b56ff0331ba9",
            "kubernetesNetworkConfig": {
                "__defaults": [],
                "elasticLoadBalancing": {
                    "__defaults": [],
                    "enabled": true
                },
                "ipFamily": "ipv4"
            },
            "name": "eks-cluster-eksCluster-2d66eb9",
            "roleArn": "04da6b54-80e4-46f7-96ec-b56ff0331ba9",
            "storageConfig": {
                "__defaults": [],
                "blockStorage": {
                    "__defaults": [],
                    "enabled": true
                }
            },
            "tags": {
                "Name": "eks-cluster-eksCluster"
            },
            "tagsAll": {
                "Name": "eks-cluster-eksCluster"
            },
            "vpcConfig": {
                "__defaults": [
                    "endpointPrivateAccess",
                    "endpointPublicAccess"
                ],
                "endpointPrivateAccess": false,
                "endpointPublicAccess": true,
                "subnetIds": "04da6b54-80e4-46f7-96ec-b56ff0331ba9"
            }
        },
        "preview": true,
        "name": "eks-cluster-eksCluster",
        "type": "aws:eks/cluster:Cluster"
    },
    "errors": [
        "rpc error: code = Unknown desc = diffing urn:pulumi:auto-mode-role::auto-mode-role::eks:index:Cluster$aws:eks/cluster:Cluster::eks-cluster-eksCluster: 1 error occurred:\n\t* compute_config.enabled, kubernetes_networking_config.elastic_load_balancing.enabled, and storage_config.block_storage.enabled must all be set to either true or false\n\n"
    ],
    "metadata": {
        "kind": "resource",
        "mode": "client",
        "name": "aws"
    }
}
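The string "04da6b54-80e4-46f7-96ec-b56ff0331ba9" in the log above is the placeholder Pulumi sends for a value that is unknown during preview. As a rough aid for reading such logs, the sketch below walks a properties object and lists the paths that carry that placeholder. The helper name findUnknownPaths and the trimmed properties object are illustrative assumptions, not part of any Pulumi SDK.

```typescript
// Placeholder string that appears in the log wherever a value is unknown at preview time.
const UNKNOWN_SENTINEL = "04da6b54-80e4-46f7-96ec-b56ff0331ba9";

type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// Hypothetical helper (not a Pulumi API): collect dotted paths whose value is the placeholder.
function findUnknownPaths(value: Json, prefix: string = ""): string[] {
    if (value === UNKNOWN_SENTINEL) {
        return [prefix];
    }
    const found: string[] = [];
    if (Array.isArray(value)) {
        value.forEach((item, index) => {
            found.push(...findUnknownPaths(item, `${prefix}[${index}]`));
        });
    } else if (value !== null && typeof value === "object") {
        for (const key of Object.keys(value)) {
            const childPath = prefix ? `${prefix}.${key}` : key;
            found.push(...findUnknownPaths(value[key], childPath));
        }
    }
    return found;
}

// Trimmed copy of the "properties" object from the log above.
const properties: Json = {
    computeConfig: UNKNOWN_SENTINEL,
    kubernetesNetworkConfig: {
        elasticLoadBalancing: { enabled: true },
        ipFamily: "ipv4",
    },
    roleArn: UNKNOWN_SENTINEL,
    storageConfig: { blockStorage: { enabled: true } },
    vpcConfig: {
        endpointPrivateAccess: false,
        endpointPublicAccess: true,
        subnetIds: UNKNOWN_SENTINEL,
    },
};

const unknownPaths = findUnknownPaths(properties);
console.log(unknownPaths); // [ 'computeConfig', 'roleArn', 'vpcConfig.subnetIds' ]
```

Here computeConfig is unknown as a whole, while the elastic load balancing and block storage flags are already known to be true, which is the mixed state discussed in this thread.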

@flostadler flostadler removed the needs-triage Needs attention from the triage team label Jan 27, 2025
@flostadler
Contributor

Yeah, the diff customizer is not taking possibly unknown values into account: https://github.com/hashicorp/terraform-provider-aws/blob/ae93494f39ba70fe442e891caf05f8df21bde1ac/internal/service/eks/cluster.go#L1776-L1791
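To illustrate the failure mode, here is a simplified TypeScript model of an all-or-nothing check. It is not the upstream Go code: the Planned type and both function names are hypothetical, and it assumes that an unknown flag reads back as false, which is one plausible way a partially unknown plan ends up looking inconsistent.

```typescript
// Hypothetical model of a planned value: a concrete boolean, or unknown until apply.
type Planned = { known: true; value: boolean } | { known: false };

interface AutoModeFlags {
    computeEnabled: Planned;
    elasticLoadBalancingEnabled: Planned;
    blockStorageEnabled: Planned;
}

const MESSAGE =
    "compute_config.enabled, kubernetes_networking_config.elastic_load_balancing.enabled, " +
    "and storage_config.block_storage.enabled must all be set to either true or false";

function asList(flags: AutoModeFlags): Planned[] {
    return [flags.computeEnabled, flags.elasticLoadBalancingEnabled, flags.blockStorageEnabled];
}

// Naive check: ASSUMES an unknown flag reads as false, so unknown + true + true looks mixed.
function naiveCheck(flags: AutoModeFlags): string | undefined {
    const values = asList(flags).map(f => (f.known === true ? f.value : false));
    return values.every(v => v === values[0]) ? undefined : MESSAGE;
}

// Unknown-aware check: defer validation while any flag is still unknown.
function unknownAwareCheck(flags: AutoModeFlags): string | undefined {
    const values: boolean[] = [];
    for (const f of asList(flags)) {
        if (f.known !== true) {
            return undefined; // cannot decide yet; re-check once values are known
        }
        values.push(f.value);
    }
    return values.every(v => v === values[0]) ? undefined : MESSAGE;
}

// State seen during preview in this issue: compute config unknown, the other two known.
const previewFlags: AutoModeFlags = {
    computeEnabled: { known: false },
    elasticLoadBalancingEnabled: { known: true, value: true },
    blockStorageEnabled: { known: true, value: true },
};

console.log(naiveCheck(previewFlags) !== undefined);        // true: the preview is rejected
console.log(unknownAwareCheck(previewFlags) === undefined); // true: validation is deferred
```

Both checks still reject a genuinely mixed configuration in which all three flags are known, so deferring on unknowns does not weaken the rule itself.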

I was also able to reproduce in TF (see pulumi/pulumi-aws#5105 (comment)).

We might be able to work around this in the eks provider by making computeConfig, kubernetesNetworkConfig.elasticLoadBalancing and storageConfig.blockStorage depend on the same inputs, so that they're either all known or all unknown. I'll give that a try.
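A minimal sketch of that idea, under stated assumptions: it models known and unknown values with a hypothetical Planned type and a combine helper, rather than using the real Pulumi Output type, and it is not the code from the eventual fix. The ARN is a made-up placeholder.

```typescript
// Hypothetical model of a value that is either known or unknown until apply.
type Planned<T> = { known: true; value: T } | { known: false };

// Hypothetical combinator: the result is known only if every input is known.
function combine<A, B, R>(a: Planned<A>, b: Planned<B>, fn: (a: A, b: B) => R): Planned<R> {
    if (a.known === true && b.known === true) {
        return { known: true, value: fn(a.value, b.value) };
    }
    return { known: false };
}

// Derive all three flags from the same pair of inputs, so they share one knownness.
function deriveFlags(enabled: Planned<boolean>, nodeRoleArn: Planned<string>) {
    return {
        computeEnabled: combine(enabled, nodeRoleArn, e => e),
        elasticLoadBalancingEnabled: combine(enabled, nodeRoleArn, e => e),
        blockStorageEnabled: combine(enabled, nodeRoleArn, e => e),
    };
}

// During preview the role ARN is unknown, so every flag becomes unknown together.
const duringPreview = deriveFlags({ known: true, value: true }, { known: false });

// Once the role exists (placeholder ARN), every flag is known and carries the same value.
const afterRoleExists = deriveFlags(
    { known: true, value: true },
    { known: true, value: "arn:aws:iam::123456789012:role/example" },
);

console.log(duringPreview);
console.log(afterRoleExists);
```

The trade-off is that the two flags that could have been known during preview are reported as unknown, in exchange for never presenting the upstream check with a mix of known and unknown values.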

@flostadler
Contributor

I was able to create a hotfix for this in the EKS provider: #1603. The diff and update behavior of the upstream provider needs some more involved fixes for auto mode going forward (hashicorp/terraform-provider-aws#40582).

@pulumi-bot pulumi-bot added the resolution/fixed This issue was fixed label Jan 29, 2025
@flostadler
Contributor

The fix was just merged. The release should go out shortly.

@baughj
Author

baughj commented Jan 29, 2025

@flostadler thank you very much!

@pulumi-bot
Contributor

This issue has been addressed in PR #1603 and shipped in release v3.8.1.
