Generate a temp certificate for OCP4 Trusted CA remediation
Lately, we've been experiencing issues with manual remediations timing
out during functional testing. This manifests in the following error:

   === RUN   TestE2e/Apply_manual_remediations
    <snip>
    helpers.go:1225: Running manual remediation '/tmp/content-3345141771/applications/openshift/networking/default_ingress_ca_replaced/tests/ocp4/e2e-remediation.sh'
    helpers.go:1225: Running manual remediation '/tmp/content-3345141771/applications/openshift/general/file_integrity_notification_enabled/tests/ocp4/e2e-remediation.sh'
    helpers.go:1231: Command '/tmp/content-3345141771/applications/openshift/authentication/idp_is_configured/tests/ocp4/e2e-remediation.sh' timed out

In this particular case, it looks like the remediation to add an
Identity Provider to the cluster failed, but this is actually an
unintended side-effect of another change that updated the
idp_is_configured remediation to use a more robust technique for
determining if the cluster applied the remediation successfully:

  #12120
  #12184

Because we updated the remediation to use `oc adm
wait-for-stable-cluster`, we're effectively checking all cluster
operators to ensure they're healthy.
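For illustration only, here is a minimal sketch of that kind of cluster-wide check. It is not the actual idp_is_configured remediation. The helper name `wait_for_stable_cluster` and its default durations are hypothetical, and the sketch assumes the installed `oc` supports the `--timeout` flag on `oc adm wait-for-stable-cluster`.

```shell
#!/bin/bash

# Hypothetical helper: wait for every cluster operator to stay stable for
# the given period, and list the operators if the wait gives up.
# Assumption: this version of oc accepts --timeout for this subcommand.
wait_for_stable_cluster() {
    local stable_period="${1:-2m}"
    local timeout="${2:-30m}"

    if ! oc adm wait-for-stable-cluster \
            --minimum-stable-period "$stable_period" \
            --timeout "$timeout"; then
        echo "cluster did not stabilize within $timeout; operator status:" >&2
        # Show operator status so the failing operator is visible in the log.
        oc get clusteroperators >&2
        return 1
    fi
}
```

Calling `wait_for_stable_cluster 2m 20m` at the end of a remediation would then report operator status on failure, which may make it easier to see when the hang comes from a different remediation.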

The timeouts started because a separate, unrelated remediation applied
during testing updated the default CA but didn't create the ConfigMap
containing the CA bundle. As a result, one of the operators never became
healthy because it was looking for a ConfigMap that didn't exist. The
`oc adm wait-for-stable-cluster` command was hanging on a legitimate
issue in a separate remediation.
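
One way to spot this kind of breakage is to check that the ConfigMap named by the proxy's `trustedCA` exists in `openshift-config`. The following is a rough diagnostic sketch, not part of this commit. The function name `check_trusted_ca_configmap` and its messages are hypothetical.

```shell
#!/bin/bash

# Hypothetical diagnostic: verify that the ConfigMap referenced by
# spec.trustedCA.name on the cluster proxy exists in openshift-config.
check_trusted_ca_configmap() {
    local name
    # Read the ConfigMap name the proxy points at (empty if unset).
    name=$(oc get proxies.config cluster -o jsonpath='{.spec.trustedCA.name}') || return 2

    if [ -z "$name" ]; then
        echo "no trustedCA configured"
        return 0
    fi

    if oc get configmap -n openshift-config "$name" >/dev/null 2>&1; then
        echo "ConfigMap openshift-config/$name exists"
    else
        echo "ConfigMap openshift-config/$name is missing"
        return 1
    fi
}
```

Running `check_trusted_ca_configmap` after the trusted CA remediation returns non-zero when the proxy references a ConfigMap that was never created, which is the situation described above.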

This commit attempts to fix that issue by updating the trusted CA
remediation to create a ConfigMap for the expected certificate bundle.
rhmdnd committed Jul 30, 2024
1 parent d9086f6 commit 3b73daa
Showing 1 changed file with 22 additions and 1 deletion.
@@ -1,4 +1,25 @@
#!/bin/bash

# we are using an existing default secret name for testing, so it won't cause any subsequent failures on testing routes.
# Reuse an existing certificate so we don't have to regenerate one.
BUNDLE=$(oc get configmap -n openshift-apiserver trusted-ca-bundle -o json | jq '.data."ca-bundle.crt"')

# Create it in the openshift-config namespace. If we don't do this, the
# machineconfig cluster operator will fail to find it and eventually go into a
# degraded state.
cat << EOF | oc create -f -
apiVersion: v1
kind: ConfigMap
data:
  ca-bundle.crt: $BUNDLE
metadata:
  name: trusted-ca-bundle
  namespace: openshift-config
EOF

# Update the trustedCA to point to the new configmap. This is effectively the
# remediation we're checking for.
oc patch proxies.config cluster --type merge -p '{"spec":{"trustedCA":{"name":"trusted-ca-bundle"}}}'

# This will bounce a bunch of the clusteroperators. Let's make sure they're all
# stable for a couple of minutes before moving on.
oc adm wait-for-stable-cluster --minimum-stable-period 2m
