[Bug] (release-manager) release into QA/TEST environment breaks with helm secret and age encryption #1185

Open
SimonGolms opened this issue Dec 20, 2024 · 0 comments
Labels
bug Something isn't working

Describe the bug
When a release is promoted from DEV to TEST, the Helm rollout fails if the helm-secrets plugin is used with age, its default and recommended encryption method. age is not supported in this path.
helm-secrets uses sops to encrypt and decrypt secrets. sops supports several encryption methods, including age and gpg, and recommends age over gpg.
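
To check which method a given secrets file was encrypted with, the sops metadata stored in the file can be inspected. The sketch below assumes the usual sops YAML layout, where age recipients appear as `recipient: age1...` entries and gpg keys as 40-character `fp:` fingerprints. The helper name `sops_methods` is hypothetical, and the patterns are a heuristic rather than a real YAML parser.

```shell
# Heuristic sketch: report which key types appear in a sops-encrypted YAML file.
# Assumes the usual sops metadata layout; sops_methods is a hypothetical helper.
sops_methods() {
  file=$1
  # age recipients are stored as "recipient: age1..."
  if grep -Eq '^[[:space:]-]*recipient:[[:space:]]*age1' "$file"; then
    echo "age"
  fi
  # gpg keys are stored with their fingerprint as "fp: <40 hex characters>"
  if grep -Eq '^[[:space:]-]*fp:[[:space:]]*[0-9A-Fa-f]{40}' "$file"; then
    echo "pgp"
  fi
  return 0
}
```

Running `sops_methods ./chart/secrets.yaml` on a file encrypted only with age should print `age` and nothing else.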

To Reproduce
Steps to reproduce the behavior:

  1. Follow the documentation to set up helm-secrets with age encryption, and set the environment variables SOPS_AGE_KEY and SOPS_AGE_RECIPIENTS, which sops uses for decryption and encryption.

  2. Add or edit the secrets(.dev|.test|.prod).yaml files and encrypt your secrets with age

    $ helm secrets encrypt -i ./chart/secrets.yaml
  3. Provide the SOPS_AGE_KEY to Jenkins

    $ oc create secret generic helm-sops-age-key --from-literal=secrettext="$SOPS_AGE_KEY"
    
    $ oc label secret helm-sops-age-key credential.sync.jenkins.openshift.io=true
  4. Update Jenkins to use SOPS_AGE_KEY in the ODS Rollout Stage

    // Jenkinsfile
    withCredentials([
       string(
         credentialsId: "YOUR_PROJECT_KEY-cd-helm-sops-age-key",
         variable: 'SOPS_AGE_KEY'
       )
    ]) {
       odsComponentStageRolloutOpenShiftDeployment(context, [
          helmEnvBasedValuesFiles: ["values.env.yaml","secrets.env.yaml"],
          helmValuesFiles: ["values.yaml", "secrets.yaml"],
       ])
    }
  5. Release a new version of your application into DEV - this should work

  6. Release the same version of your application into TEST - this will fail

  7. See Log Output
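
Before running steps 5 and 6, it can help to confirm that the chart contains every values and secrets file the rollout will pass to helm for the target environment. The sketch below assumes, based only on the log output in this issue, that the `.env.` segment of each `helmEnvBasedValuesFiles` entry is replaced with the environment name (`dev`, `test`, ...). The helper names `resolve_env_file` and `check_values_files` are hypothetical, and the substitution rule is inferred, not taken from the ODS source.

```shell
# Sketch: expand an env-based values file name for a target environment.
# The ".env." -> ".<environment>." rule is inferred from the log output in this issue.
# Assumes simple environment names such as dev, test or prod.
resolve_env_file() {
  name=$1
  env=$2
  printf '%s\n' "$name" | sed "s/\.env\./.${env}./"
}

# Report which of the given files are missing in the chart directory
# after expanding them for the target environment.
# Returns 0 when all files exist, 1 otherwise.
check_values_files() {
  chart_dir=$1
  env=$2
  shift 2
  missing=0
  for f in "$@"; do
    resolved=$(resolve_env_file "$f" "$env")
    if [ ! -f "$chart_dir/$resolved" ]; then
      echo "missing: $resolved"
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `check_values_files ./chart test values.yaml secrets.yaml values.env.yaml secrets.env.yaml` looks for `values.test.yaml` and `secrets.test.yaml` next to the environment-independent files.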

Expected behavior
Full support for the helm-secrets plugin with age, its default and recommended encryption method, including rollouts into TEST.

Affected version (please complete the following information):

  • OpenShift: 4.x
  • OpenDevStack 4.x

Log Output (make sure to remove any confidential information such as tokens, project names, etc.)

...
DEBUG: Release Manager Build Parameters: [changeDescription:UNDEFINED, changeId:UNDEFINED, configItem:UNDEFINED, releaseStatusJiraIssueKey:null, targetEnvironment:dev, targetEnvironmentToken:D, version:20241220.004, rePromote:false]
...
Rolling out YOUR_COMPONENT_ID with HELM, selector: app.kubernetes.io/instance=YOUR_COMPONENT_ID
[Pipeline] dir
Running in /tmp/workspace/YOUR_PROJECT_ID-cd/YOUR_PROJECT_ID-cd-releasemanager-master/chart
[Pipeline] {
[Pipeline] withCredentials
[Pipeline] // withCredentials
[Pipeline] sh (Show diff explaining what helm upgrade would change for release YOUR_COMPONENT_ID in YOUR_PROJECT_ID-dev)
+ HELM_DIFF_IGNORE_UNKNOWN_FLAGS=true
+ helm -n YOUR_PROJECT_ID-dev secrets diff upgrade --install --atomic -f values.yaml -f secrets.yaml -f values.dev.yaml -f secrets.dev.yaml --set registry=image-registry.openshift-image-registry.svc:5000 --set componentId=YOUR_COMPONENT_ID --set global.registry=image-registry.openshift-image-registry.svc:5000 --set global.componentId=YOUR_COMPONENT_ID --set imageNamespace=YOUR_PROJECT_ID-dev --set imageTag=fa229a97 --set global.imageNamespace=YOUR_PROJECT_ID-dev --set global.imageTag=fa229a97 --no-color --three-way-merge --normalize-manifests YOUR_COMPONENT_ID ./
[helm-secrets] Decrypt: secrets.yaml
[helm-secrets] Decrypt: secrets.dev.yaml

[helm-secrets] Removed: secrets.yaml.dec
[helm-secrets] Removed: secrets.dev.yaml.dec
[Pipeline] sh (Upgrade Helm release YOUR_COMPONENT_ID in YOUR_PROJECT_ID-dev)
+ helm -n YOUR_PROJECT_ID-dev secrets upgrade --install --atomic -f values.yaml -f secrets.yaml -f values.dev.yaml -f secrets.dev.yaml --set registry=image-registry.openshift-image-registry.svc:5000 --set componentId=YOUR_COMPONENT_ID --set global.registry=image-registry.openshift-image-registry.svc:5000 --set global.componentId=YOUR_COMPONENT_ID --set imageNamespace=YOUR_PROJECT_ID-dev --set imageTag=fa229a97 --set global.imageNamespace=YOUR_PROJECT_ID-dev --set global.imageTag=fa229a97 YOUR_COMPONENT_ID ./
[helm-secrets] Decrypt: secrets.yaml
[helm-secrets] Decrypt: secrets.dev.yaml
Release "YOUR_COMPONENT_ID" has been upgraded. Happy Helming!
NAME: YOUR_COMPONENT_ID
LAST DEPLOYED: Fri Dec 20 08:22:54 2024
NAMESPACE: YOUR_PROJECT_ID-dev
STATUS: deployed
REVISION: 8
NOTES:
1. Get the application URL by running these commands:
  http://YOUR_PROJECT_ID-dev.apps.host.com/

[helm-secrets] Removed: secrets.yaml.dec
[helm-secrets] Removed: secrets.dev.yaml.dec
[Pipeline] }
[Pipeline] // dir
[Pipeline] sh (Getting all Deployment,DeploymentConfig names for selector 'app.kubernetes.io/instance=use-secrets-w)
+ oc -n YOUR_PROJECT_ID-dev get Deployment,DeploymentConfig -l app.kubernetes.io/instance=YOUR_COMPONENT_ID -o 'template={{range .items}}{{.kind}}:{{.metadata.name}} {{end}}'
Warning: apps.openshift.io/v1 DeploymentConfig is deprecated in v4.14+, unavailable in v4.10000+
[Pipeline] echo
org.ods.component.HelmDeploymentStrategy -- DEPLOYMENT RESOURCES
[Pipeline] echo
{
    "Deployment": [
        "YOUR_COMPONENT_ID-helm-secrets"
    ]
}
...
...
DEBUG: Release Manager Build Parameters: [changeDescription:UNDEFINED, changeId:UNDEFINED, configItem:UNDEFINED, releaseStatusJiraIssueKey:null, targetEnvironment:qa, targetEnvironmentToken:D, version:20241220.004, rePromote:false]
...
Applying desired OpenShift state defined in [email protected] to YOUR_PROJECT_KEY-test, deploymentMean? [type:helm, selector:app.kubernetes.io/instance=YOUR_COMPONENT_ID, chartDir:chart, helmReleaseName:YOUR_COMPONENT_ID, helmEnvBasedValuesFiles:[values.env.yaml, secrets.env.yaml], helmValuesFiles:[values.yaml, secrets.yaml], helmValues:[registry:image-registry.openshift-image-registry.svc:5000, componentId:YOUR_COMPONENT_ID], helmDefaultFlags:[--install, --atomic], helmAdditionalFlags:[], repoId:YOUR_COMPONENT_ID]
[Pipeline] withCredentials
[Pipeline] // withCredentials
[Pipeline] sh (Show diff explaining what helm upgrade would change for release YOUR_COMPONENT_ID in YOUR_PROJECT_KEY-test)
+ HELM_DIFF_IGNORE_UNKNOWN_FLAGS=true
+ helm -n YOUR_PROJECT_KEY-test secrets diff upgrade --install --atomic -f values.yaml -f secrets.yaml -f values.test.yaml -f secrets.test.yaml --set imageTag=ods-generated-v20241220.004-UNDEFINED-0b61-Q --set imageNamespace=YOUR_PROJECT_KEY-test --set componentId=YOUR_COMPONENT_ID --set global.imageTag=ods-generated-v20241220.004-UNDEFINED-0b61-Q --set global.imageNamespace=YOUR_PROJECT_KEY-test --set global.componentId=YOUR_COMPONENT_ID --set registry=image-registry.openshift-image-registry.svc:5000 --no-color --three-way-merge --normalize-manifests YOUR_COMPONENT_ID ./
Failed to get the data key required to decrypt the SOPS file.

Group 0: FAILED
  age1v4786u********************************************hsn0n8cz: FAILED
    - | failed to load age identities: failed to open file: open
      | /home/jenkins/.config/sops/age/keys.txt: no such file or
      | directory

Recovery failed because no master key was able to decrypt the file. In
order for SOPS to recover the file, at least one key has to be successful,
but none were.
[helm-secrets] Error while decrypting file: secrets.yaml
Error: plugin "secrets" exited with error
[Pipeline] }
[Pipeline] // dir
[Pipeline] }
[Pipeline] // dir
[Pipeline] echo
WARN: script returned exit code 1
...
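
The error above shows sops looking for the default key file under the Jenkins home directory, which suggests that no age key was available in the environment of the helm process during the TEST rollout. A check along the following lines, run in the same shell step as helm, could confirm that. It is a minimal sketch: `age_key_sources` is a hypothetical helper, it only looks at the `SOPS_AGE_KEY` and `SOPS_AGE_KEY_FILE` variables and the default `sops/age/keys.txt` location, and the exact lookup behavior may differ between sops versions. The key itself is never printed.

```shell
# Sketch: report which age identity sources are present for the current user.
# age_key_sources is a hypothetical helper; lookup details can vary by sops version.
# Returns 0 if at least one source is present, 1 otherwise.
age_key_sources() {
  found=1
  if [ -n "${SOPS_AGE_KEY:-}" ]; then
    echo "SOPS_AGE_KEY is set"
    found=0
  fi
  if [ -n "${SOPS_AGE_KEY_FILE:-}" ] && [ -f "$SOPS_AGE_KEY_FILE" ]; then
    echo "SOPS_AGE_KEY_FILE points to an existing file"
    found=0
  fi
  # Default location on Linux, matching the path shown in the error message.
  default_file="${XDG_CONFIG_HOME:-$HOME/.config}/sops/age/keys.txt"
  if [ -f "$default_file" ]; then
    echo "default key file exists: $default_file"
    found=0
  fi
  return "$found"
}
```

If `age_key_sources` prints nothing and returns a non-zero status in the TEST rollout step, the credential bound in the component's Jenkinsfile is not reaching that step.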

Additional context
I suspect the bug is in the odsOrchestrationPipeline. As I understand it, the Release Manager assumes that gpg is the only encryption method: if a private.key is available, it is imported into the gpg keyring of the Jenkins agent during startup. This behavior is not mentioned anywhere in the documentation, so gpg should not be assumed, especially because age is the default method recommended by sops and is easier to handle. In addition, the gpg versions on the (outdated) Jenkins agents and on the local machine used for key generation are incompatible, which leads to new and unforeseen errors during a release.

SimonGolms added the bug label on Dec 20, 2024