
panic: close of nil channel during shutdown #5104

Closed
tstraley opened this issue Feb 15, 2024 · 6 comments · Fixed by #5175
Labels: backlog (Pull requests/issues that are backlog items), bug (An issue reporting a potential bug)


tstraley commented Feb 15, 2024

Describe the bug
The Ingress Controller raises a Go panic during shutdown:


2024/01/23 00:27:05 [notice] 14333#14333: gracefully shutting down
2024/01/23 00:27:05 [notice] 14333#14333: exiting
2024/01/23 00:27:05 [notice] 14333#14333: exit
2024/01/23 00:27:05 [notice] 14332#14332: gracefully shutting down
2024/01/23 00:27:05 [notice] 14332#14332: exiting
2024/01/23 00:27:05 [notice] 14332#14332: exit
2024/01/23 00:27:05 [notice] 14#14: signal 17 (SIGCHLD) received from 14333
2024/01/23 00:27:05 [notice] 14#14: worker process 14333 exited with code 0
2024/01/23 00:27:05 [notice] 14#14: signal 29 (SIGIO) received
2024/01/23 00:27:05 [notice] 14#14: signal 17 (SIGCHLD) received from 14332
2024/01/23 00:27:05 [notice] 14#14: worker process 14332 exited with code 0
2024/01/23 00:27:05 [notice] 14#14: signal 29 (SIGIO) received
I0123 00:27:05.803797       1 main.go:508] Received SIGTERM, shutting down
I0123 00:27:05.804663       1 main.go:511] Shutting down the controller
panic: close of nil channel
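For context, this is exactly the panic the Go runtime raises whenever `close` is called on a channel that was never initialized with `make`. A minimal standalone reproduction (the names here are illustrative, not taken from the controller code):

```go
package main

import "fmt"

// tryClose closes ch and converts the runtime panic raised on a
// nil channel into an ordinary error so it can be observed.
func tryClose(ch chan struct{}) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("%v", r)
		}
	}()
	close(ch)
	return nil
}

func main() {
	var stopCh chan struct{} // zero value: nil, never make()'d
	fmt.Println(tryClose(stopCh))              // close of nil channel
	fmt.Println(tryClose(make(chan struct{}))) // <nil>
}
```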

To Reproduce
Steps to reproduce the behavior:

  1. Delete a running ingress controller pod (or encounter any normal rescheduling / scaling event that leads to pod termination)

Expected behavior
Ingress Controller should not panic.

Your environment

  • Version of the Ingress Controller - release version or a specific commit
    • Seen in v3.3.2 and v3.4.2
  • Version of Kubernetes
    • 1.28
  • Kubernetes platform (e.g. Mini-kube or GCP)
    • AWS EKS
  • Using NGINX or NGINX Plus
    • Seen in both OSS and NGINX Plus

Additional context
This appears to happen on every shutdown, but only after the graceful shutdown has otherwise completed, so it is unlikely to be causing any real problems.


Hi @tstraley thanks for reporting!

Be sure to check out the docs and the Contributing Guidelines while you wait for a human to take a look at this 🙂

Cheers!


j1m-ryan commented Feb 16, 2024

Hi @tstraley, I have not been able to replicate this.

I installed v3.4.2 on AWS EKS with Helm, deployed the example in examples/custom-resources/basic-configuration, then watched the NIC pod as I deleted it. Below are the logs from when I ran the delete command.

kubectl delete pod my-release-nginx-ingress-controller-6b947d4495-rkj9w
pod "my-release-nginx-ingress-controller-6b947d4495-rkj9w" deleted
I0216 14:50:19.063004       1 main.go:543] Received SIGTERM, shutting down
I0216 14:50:19.063025       1 main.go:546] Shutting down the controller
I0216 14:50:19.063061       1 main.go:550] Shutting down NGINX
I0216 14:50:19.063403       1 main.go:224] Waiting for the controller to exit...
2024/02/16 14:50:19 [notice] 12#12: signal 3 (SIGQUIT) received from 24, shutting down
2024/02/16 14:50:19 [notice] 20#20: gracefully shutting down
2024/02/16 14:50:19 [notice] 21#21: gracefully shutting down
2024/02/16 14:50:19 [notice] 21#21: exiting
2024/02/16 14:50:19 [notice] 20#20: exiting
2024/02/16 14:50:19 [notice] 20#20: exit
2024/02/16 14:50:19 [notice] 21#21: exit
2024/02/16 14:50:19 [notice] 12#12: signal 17 (SIGCHLD) received from 21
2024/02/16 14:50:19 [notice] 12#12: worker process 21 exited with code 0
2024/02/16 14:50:19 [notice] 12#12: signal 29 (SIGIO) received
2024/02/16 14:50:19 [notice] 12#12: signal 17 (SIGCHLD) received from 20
2024/02/16 14:50:19 [notice] 12#12: worker process 20 exited with code 0
2024/02/16 14:50:19 [notice] 12#12: exit
I0216 14:50:19.120423       1 main.go:556] Exiting with a status: 0

This particular line:

I0216 14:50:19.120423 1 main.go:556] Exiting with a status: 0

suggests that I'm reaching the os.Exit(0) call in the handleTermination function.

I also tried on minikube and got the same result as I did on AWS.

I do wonder what is causing this though. Could you provide any specific configurations or steps unique to your environment that might help us reproduce this issue?

@j1m-ryan j1m-ryan added the waiting for response Waiting for author's response label Feb 16, 2024
@tstraley (Author)

I'm not sure what would be unique to our environment, but here are possibly relevant configuration details:

Output of kubectl describe deployment (including run args):

Name:                   api-nginx-ingress-controller
Namespace:              nginx-ingress
CreationTimestamp:      Fri, 26 Jan 2024 02:27:06 +0000
Labels:                 app.kubernetes.io/instance=api
                        app.kubernetes.io/managed-by=Helm
                        app.kubernetes.io/name=nginx-ingress
                        app.kubernetes.io/version=3.4.2
                        helm.sh/chart=nginx-ingress-1.1.2
Annotations:            deployment.kubernetes.io/revision: 2
                        meta.helm.sh/release-name: api
                        meta.helm.sh/release-namespace: nginx-ingress
Selector:               app.kubernetes.io/instance=api,app.kubernetes.io/name=nginx-ingress
Replicas:               3 desired | 3 updated | 3 total | 3 available | 0 unavailable
StrategyType:           RollingUpdate
MinReadySeconds:        0
RollingUpdateStrategy:  25% max unavailable, 25% max surge
Pod Template:
  Labels:           app.kubernetes.io/instance=api
                    app.kubernetes.io/name=nginx-ingress
  Annotations:      prometheus.io/port: 9113
                    prometheus.io/scheme: http
                    prometheus.io/scrape: true
  Service Account:  api-nginx-ingress
  Containers:
   nginx-ingress:
    Image:       private-registry.nginx.com/nginx-ic/nginx-plus-ingress:3.4.2
    Ports:       80/TCP, 443/TCP, 9113/TCP, 8081/TCP
    Host Ports:  0/TCP, 0/TCP, 0/TCP, 0/TCP
    Args:
      -nginx-plus=true
      -nginx-reload-timeout=60000
      -enable-app-protect=false
      -enable-app-protect-dos=false
      -nginx-configmaps=$(POD_NAMESPACE)/api-nginx-ingress
      -ingress-class=api
      -health-status=false
      -health-status-uri=/nginx-health
      -nginx-debug=false
      -v=1
      -nginx-status=true
      -nginx-status-port=8080
      -nginx-status-allow-cidrs=127.0.0.1
      -report-ingress-status
      -external-service=api-nginx-ingress-controller
      -enable-leader-election=true
      -leader-election-lock-name=api-ingress-leader
      -enable-prometheus-metrics=true
      -prometheus-metrics-listen-port=9113
      -prometheus-tls-secret=
      -enable-service-insight=false
      -service-insight-listen-port=9114
      -service-insight-tls-secret=
      -enable-custom-resources=true
      -enable-snippets=true
      -include-year=false
      -disable-ipv6=false
      -enable-tls-passthrough=false
      -enable-cert-manager=false
      -enable-oidc=false
      -enable-external-dns=true
      -default-http-listener-port=80
      -default-https-listener-port=443
      -ready-status=true
      -ready-status-port=8081
      -enable-latency-metrics=true
      -ssl-dynamic-reload=true
    Requests:
      cpu:      100m
      memory:   128Mi
    Readiness:  http-get http://:readiness-port/nginx-ready delay=0s timeout=1s period=1s #success=1 #failure=3
    Environment:
      POD_NAMESPACE:             (v1:metadata.namespace)
      POD_NAME:                  (v1:metadata.name)
    Mounts:                     <none>
  Volumes:                      <none>
  Topology Spread Constraints:  kubernetes.io/hostname:ScheduleAnyway when max skew 1 is exceeded for selector app.kubernetes.io/instance=api,app.kubernetes.io/name=nginx-ingress
                                topology.kubernetes.io/zone:ScheduleAnyway when max skew 1 is exceeded for selector app.kubernetes.io/instance=api,app.kubernetes.io/name=nginx-ingress

Configmap values:

data:
  default-server-return: "444"
  http2: "True"
  keepalive: "32"
  log-format: '{"remoteAddress":"$remote_addr","requestId":"$http_x_request_id","requestMethod":"$request_method","requestReferer":"$http_referer","requestSize":"$request_length","requestUrl":"$scheme://$host$request_uri","responseCode":"$status","responseSize":"$body_bytes_sent","responseTime":"$request_time","spanID":"$opentracing_context_x_b3_spanid","sslClientVerify":"$ssl_client_verify","traceID":"$opentracing_context_x_b3_traceid","ts":"$time_iso8601","upstreamCode":"$upstream_status","upstreamResponseTime":"$upstream_response_time","upstreamService":"$service","userAgent":"$http_user_agent"}'
  log-format-escaping: json
  opentracing: "True"
  opentracing-tracer: /usr/local/lib/libzipkin_opentracing_plugin.so
  opentracing-tracer-config: '{"collector_host":"grafana-k8s-monitoring-grafana-agent.monitoring.svc","sample_rate":1,"service_name":"nginx-ingress"}'
  proxy-connect-timeout: 3s
  redirect-to-https: "True"
  server-tokens: "False"
  ssl-protocols: TLSv1.2 TLSv1.3
  worker-connections: "1024"

We install and manage this with the Helm chart, using the values it exposes.
We capture logs in Grafana, so we can review these pod logs historically. We used to run the OSS version (not NGINX Plus), which had the same panic. We also had some other configuration differences a month ago, but I see no "Exiting with a status" lines in the historical nginx-ingress logs, only the panic during each shutdown event.

Let me know if there is other information I can provide to help.

@j1m-ryan j1m-ryan added bug An issue reporting a potential bug and removed waiting for response Waiting for author's response labels Feb 21, 2024
@j1m-ryan

I have replicated this with your config map and helm values on v3.4.2 on minikube. Thanks again for the detailed info @tstraley


configmap.yaml

kind: ConfigMap
apiVersion: v1
metadata:
  name: nginx-config
  namespace: nginx-ingress
data:
  default-server-return: "444"
  http2: "True"
  keepalive: "32"
  log-format: '{"remoteAddress":"$remote_addr","requestId":"$http_x_request_id","requestMethod":"$request_method","requestReferer":"$http_referer","requestSize":"$request_length","requestUrl":"$scheme://$host$request_uri","responseCode":"$status","responseSize":"$body_bytes_sent","responseTime":"$request_time","spanID":"$opentracing_context_x_b3_spanid","sslClientVerify":"$ssl_client_verify","traceID":"$opentracing_context_x_b3_traceid","ts":"$time_iso8601","upstreamCode":"$upstream_status","upstreamResponseTime":"$upstream_response_time","upstreamService":"$service","userAgent":"$http_user_agent"}'
  log-format-escaping: json
  opentracing: "True"
  opentracing-tracer: /usr/local/lib/libzipkin_opentracing_plugin.so
  opentracing-tracer-config: '{"collector_host":"grafana-k8s-monitoring-grafana-agent.monitoring.svc","sample_rate":1,"service_name":"nginx-ingress"}'
  proxy-connect-timeout: 3s
  redirect-to-https: "True"
  server-tokens: "False"
  ssl-protocols: TLSv1.2 TLSv1.3
  worker-connections: "1024"

values.yaml

controller:
  ## The name of the Ingress Controller daemonset or deployment.
  name: api-nginx-ingress-controller

  ## The kind of the Ingress Controller installation - deployment or daemonset.
  kind: deployment

  ## The selectorLabels used to override the default values.
  selectorLabels: {}

  ## Annotations for deployments and daemonsets
  annotations: {}

  ## Deploys the Ingress Controller for NGINX Plus.
  nginxplus: true 

  ## Timeout in milliseconds which the Ingress Controller will wait for a successful NGINX reload after a change or at the initial start.
  nginxReloadTimeout: 60000

  ## Support for App Protect WAF
  appprotect:
    ## Enable the App Protect WAF module in the Ingress Controller.
    enable: false
    ## Sets log level for App Protect WAF. Allowed values: fatal, error, warn, info, debug, trace
    # logLevel: fatal

  ## Support for App Protect DoS
  appprotectdos:
    ## Enable the App Protect DoS module in the Ingress Controller.
    enable: false
    ## Enable debugging for App Protect DoS.
    debug: false
    ## Max number of nginx processes to support.
    maxWorkers: 0
    ## Max number of ADMD instances.
    maxDaemons: 0
    ## RAM memory size to consume in MB.
    memory: 0

  ## Enables the Ingress Controller pods to use the host's network namespace.
  hostNetwork: false

  ## The hostPort configuration for the Ingress Controller pods.
  hostPort:
    ## Enables hostPort for the Ingress Controller pods.
    enable: false

    ## The HTTP hostPort configuration for the Ingress Controller pods.
    http: 80

    ## The HTTPS hostPort configuration for the Ingress Controller pods.
    https: 443

  containerPort:
    ## The HTTP containerPort configuration for the Ingress Controller pods.
    http: 80

    ## The HTTPS containerPort configuration for the Ingress Controller pods.
    https: 443

  ## DNS policy for the Ingress Controller pods
  dnsPolicy: ClusterFirst

  ## Enables debugging for NGINX. Uses the nginx-debug binary. Requires error-log-level: debug in the ConfigMap via `controller.config.entries`.
  nginxDebug: false

  ## Share process namespace between containers in the Ingress Controller pod.
  shareProcessNamespace: false

  ## The log level of the Ingress Controller.
  logLevel: 1

  ## A list of custom ports to expose on the NGINX Ingress Controller pod. Follows the conventional Kubernetes yaml syntax for container ports.
  customPorts: []

  image:
    ## The image repository of the Ingress Controller.
    repository: nginx/nginx-ingress

    ## The tag of the Ingress Controller image. If not specified the appVersion from Chart.yaml is used as a tag.
    tag: "3.4.2"
    ## The digest of the Ingress Controller image.
    ## If digest is specified it has precedence over tag and will be used instead
    # digest: "sha256:CHANGEME"

    ## The pull policy for the Ingress Controller image.
    pullPolicy: IfNotPresent

  ## The lifecycle of the Ingress Controller pods.
  lifecycle: {}

  ## The custom ConfigMap to use instead of the one provided by default
  customConfigMap: "nginx-config"

  config:
    ## The name of the ConfigMap used by the Ingress Controller.
    ## Autogenerated if not set or set to "".
    # name: nginx-config

    ## The annotations of the Ingress Controller configmap.
    annotations: {}

    ## The entries of the ConfigMap for customizing NGINX configuration.
    entries: {}

  ## It is recommended to use your own TLS certificates and keys
  defaultTLS:
    ## The base64-encoded TLS certificate for the default HTTPS server.
    ## Note: It is recommended that you specify your own certificate. Alternatively, omitting the default server secret completely will configure NGINX to reject TLS connections to the default server.
    cert: ""

    ## The base64-encoded TLS key for the default HTTPS server.
    ## Note: It is recommended that you specify your own key. Alternatively, omitting the default server secret completely will configure NGINX to reject TLS connections to the default server.
    key: ""

    ## The secret with a TLS certificate and key for the default HTTPS server.
    ## The value must follow the following format: `<namespace>/<name>`.
    ## Used as an alternative to specifying a certificate and key using `controller.defaultTLS.cert` and `controller.defaultTLS.key` parameters.
    ## Note: Alternatively, omitting the default server secret completely will configure NGINX to reject TLS connections to the default server.
    ## Format: <namespace>/<secret_name>
    secret: ""

  wildcardTLS:
    ## The base64-encoded TLS certificate for every Ingress/VirtualServer host that has TLS enabled but no secret specified.
    ## If the parameter is not set, for such Ingress/VirtualServer hosts NGINX will break any attempt to establish a TLS connection.
    cert: ""

    ## The base64-encoded TLS key for every Ingress/VirtualServer host that has TLS enabled but no secret specified.
    ## If the parameter is not set, for such Ingress/VirtualServer hosts NGINX will break any attempt to establish a TLS connection.
    key: ""

    ## The secret with a TLS certificate and key for every Ingress/VirtualServer host that has TLS enabled but no secret specified.
    ## The value must follow the following format: `<namespace>/<name>`.
    ## Used as an alternative to specifying a certificate and key using `controller.wildcardTLS.cert` and `controller.wildcardTLS.key` parameters.
    ## Format: <namespace>/<secret_name>
    secret: ""

  ## The node selector for pod assignment for the Ingress Controller pods.
  # nodeSelector: {}

  ## The termination grace period of the Ingress Controller pod.
  terminationGracePeriodSeconds: 30

  ## HorizontalPodAutoscaling (HPA)
  autoscaling:
    ## Enables HorizontalPodAutoscaling.
    enabled: false
    ## The annotations of the Ingress Controller HorizontalPodAutoscaler.
    annotations: {}
    ## Minimum number of replicas for the HPA.
    minReplicas: 1
    ## Maximum number of replicas for the HPA.
    maxReplicas: 3
    ## The target cpu utilization percentage.
    targetCPUUtilizationPercentage: 50
    ## The target memory utilization percentage.
    targetMemoryUtilizationPercentage: 50
    ## Custom behavior policies
    behavior: {}

  ## The resources of the Ingress Controller pods.
  resources:
    requests:
      cpu: 100m
      memory: 128Mi
  # limits:
  #   cpu: 1
  #   memory: 1Gi

  ## The resources for the Ingress Controller init container which is used when readOnlyRootFilesystem is set to true.
  initContainerResources:
    requests:
      cpu: 100m
      memory: 128Mi
    # limits:
    #   cpu: 1
    #   memory: 1Gi

  ## The tolerations of the Ingress Controller pods.
  tolerations: []

  ## The affinity of the Ingress Controller pods.
  affinity: {}

  ## The topology spread constraints of the Ingress controller pods.
  # topologySpreadConstraints: {}

  ## The additional environment variables to be set on the Ingress Controller pods.
  env: []
  # - name: MY_VAR
  #   value: myvalue

  ## The volumes of the Ingress Controller pods.
  volumes: []
  # - name: extra-conf
  #   configMap:
  #     name: extra-conf

  ## The volumeMounts of the Ingress Controller pods.
  volumeMounts: []
  # - name: extra-conf
  #   mountPath: /etc/nginx/conf.d/extra.conf
  #   subPath: extra.conf

  ## InitContainers for the Ingress Controller pods.
  initContainers: []
  # - name: init-container
  #   image: busybox:1.34
  #   command: ['sh', '-c', 'echo this is initial setup!']

  ## The minimum number of seconds for which a newly created Pod should be ready without any of its containers crashing, for it to be considered available.
  minReadySeconds: 0

  ## Pod disruption budget for the Ingress Controller pods.
  podDisruptionBudget:
    ## Enables PodDisruptionBudget.
    enabled: false
    ## The annotations of the Ingress Controller pod disruption budget.
    annotations: {}
    ## The number of Ingress Controller pods that should be available. This is a mutually exclusive setting with "maxUnavailable".
    # minAvailable: 1
    ## The number of Ingress Controller pods that can be unavailable. This is a mutually exclusive setting with "minAvailable".
    # maxUnavailable: 1

    ## Strategy used to replace old Pods by new ones. .spec.strategy.type can be "Recreate" or "RollingUpdate" for Deployments, and "OnDelete" or "RollingUpdate" for Daemonsets. "RollingUpdate" is the default value.
  strategy: {}

  ## Extra containers for the Ingress Controller pods.
  extraContainers: []
  # - name: container
  #   image: busybox:1.34
  #   command: ['sh', '-c', 'echo this is a sidecar!']

  ## The number of replicas of the Ingress Controller deployment.
  replicaCount: 3

  ## Configures the ingress class the Ingress Controller uses.
  ingressClass:
    ## A class of the Ingress Controller.

    ## IngressClass resource with the name equal to the class must be deployed. Otherwise,
    ## the Ingress Controller will fail to start.
    ## The Ingress Controller only processes resources that belong to its class - i.e. have the "ingressClassName" field resource equal to the class.

    ## The Ingress Controller processes all the resources that do not have the "ingressClassName" field for all versions of kubernetes.
    name: nginx

    ## Creates a new IngressClass object with the name "controller.ingressClass.name". Set to false to use an existing IngressClass with the same name. If you use helm upgrade, do not change the values from the previous release as helm will delete IngressClass objects managed by helm. If you are upgrading from a release earlier than 3.3.0, do not set the value to false.
    create: true

    ## New Ingresses without an ingressClassName field specified will be assigned the class specified in `controller.ingressClass`. Requires "controller.ingressClass.create".
    setAsDefaultIngress: false

  ## Comma separated list of namespaces to watch for Ingress resources. By default the Ingress Controller watches all namespaces. Mutually exclusive with "controller.watchNamespaceLabel".
  watchNamespace: ""

  ## Configures the Ingress Controller to watch only those namespaces with label foo=bar. By default the Ingress Controller watches all namespaces. Mutually exclusive with "controller.watchNamespace".
  watchNamespaceLabel: ""

  ## Comma separated list of namespaces to watch for Secret resources. By default the Ingress Controller watches all namespaces.
  watchSecretNamespace: ""

  ## Enable the custom resources.
  enableCustomResources: true

  ## Enable OIDC policies.
  enableOIDC: false

  ## Include year in log header. This parameter will be removed in release 2.7 and the year will be included by default.
  includeYear: false

  ## Enable TLS Passthrough on port 443. Requires controller.enableCustomResources.
  enableTLSPassthrough: false

  ## Set the port for TLS Passthrough. Requires controller.enableCustomResources and controller.enableTLSPassthrough.
  tlsPassthroughPort: 443

  ## Enable cert manager for Virtual Server resources. Requires controller.enableCustomResources.
  enableCertManager: false

  ## Enable external DNS for Virtual Server resources. Requires controller.enableCustomResources.
  enableExternalDNS: true 

  globalConfiguration:
    ## Creates the GlobalConfiguration custom resource. Requires controller.enableCustomResources.
    create: false

    ## The spec of the GlobalConfiguration for defining the global configuration parameters of the Ingress Controller.
    spec: {} ## Ensure both curly brackets are removed when adding listeners in YAML format.
    # listeners:
    # - name: dns-udp
    #   port: 5353
    #   protocol: UDP
    # - name: dns-tcp
    #   port: 5353
    #   protocol: TCP

  ## Enable custom NGINX configuration snippets in Ingress, VirtualServer, VirtualServerRoute and TransportServer resources.
  enableSnippets: true 

  ## Add a location based on the value of health-status-uri to the default server. The location responds with the 200 status code for any request.
  ## Useful for external health-checking of the Ingress Controller.
  healthStatus: true 

  ## Sets the URI of health status location in the default server. Requires controller.healthStatus.
  healthStatusURI: "/nginx-health"

  nginxStatus:
    ## Enable the NGINX stub_status, or the NGINX Plus API.
    enable: true

    ## Set the port where the NGINX stub_status or the NGINX Plus API is exposed.
    port: 8080

    ## Add IPv4 IP/CIDR blocks to the allow list for NGINX stub_status or the NGINX Plus API. Separate multiple IP/CIDR by commas.
    allowCidrs: "127.0.0.1"

  service:
    ## Creates a service to expose the Ingress Controller pods.
    create: true

    ## The type of service to create for the Ingress Controller.
    type: LoadBalancer

    ## The externalTrafficPolicy of the service. The value Local preserves the client source IP.
    externalTrafficPolicy: Local

    ## The annotations of the Ingress Controller service.
    annotations: {}

    ## The extra labels of the service.
    extraLabels: {}

    ## The static IP address for the load balancer. Requires controller.service.type set to LoadBalancer. The cloud provider must support this feature.
    loadBalancerIP: ""

    ## The ClusterIP for the Ingress Controller service, autoassigned if not specified.
    clusterIP: ""

    ## The list of external IPs for the Ingress Controller service.
    externalIPs: []

    ## The IP ranges (CIDR) that are allowed to access the load balancer. Requires controller.service.type set to LoadBalancer. The cloud provider must support this feature.
    loadBalancerSourceRanges: []

    ## Whether to automatically allocate NodePorts (only for LoadBalancers).
    # allocateLoadBalancerNodePorts: false

    ## Dual stack preference.
    ## Valid values: SingleStack, PreferDualStack, RequireDualStack
    # ipFamilyPolicy: SingleStack

    ## List of IP families assigned to this service.
    ## Valid values: IPv4, IPv6
    # ipFamilies:
    #   - IPv6

    httpPort:
      ## Enables the HTTP port for the Ingress Controller service.
      enable: true

      ## The HTTP port of the Ingress Controller service.
      port: 80

      ## The custom NodePort for the HTTP port. Requires controller.service.type set to NodePort.
      # nodePort: 80

      ## The HTTP port on the POD where the Ingress Controller service is running.
      targetPort: 80

    httpsPort:
      ## Enables the HTTPS port for the Ingress Controller service.
      enable: true

      ## The HTTPS port of the Ingress Controller service.
      port: 443

      ## The custom NodePort for the HTTPS port. Requires controller.service.type set to NodePort.
      # nodePort: 443

      ## The HTTPS port on the POD where the Ingress Controller service is running.
      targetPort: 443

    ## A list of custom ports to expose through the Ingress Controller service. Follows the conventional Kubernetes yaml syntax for service ports.
    customPorts: []

  serviceAccount:
    ## The annotations of the service account of the Ingress Controller pods.
    annotations: {}

    ## The name of the service account of the Ingress Controller pods. Used for RBAC.
    ## Autogenerated if not set or set to "".
    name: api-nginx-ingress

    ## The name of the secret containing docker registry credentials.
    ## Secret must exist in the same namespace as the helm release.
    imagePullSecretName: ""

    ## A list of secret names containing docker registry credentials.
    ## Secrets must exist in the same namespace as the helm release.
    imagePullSecretsNames: []

  reportIngressStatus:
    ## Updates the address field in the status of Ingress resources with an external address of the Ingress Controller.
    ## You must also specify the source of the external address either through an external service via controller.reportIngressStatus.externalService,
    ## controller.reportIngressStatus.ingressLink or the external-status-address entry in the ConfigMap via controller.config.entries.
    ## Note: controller.config.entries.external-status-address takes precedence over the others.
    enable: true

    ## Specifies the name of the service with the type LoadBalancer through which the Ingress Controller is exposed externally.
    ## The external address of the service is used when reporting the status of Ingress, VirtualServer and VirtualServerRoute resources.
    ## controller.reportIngressStatus.enable must be set to true.
    ## The default is autogenerated and matches the created service (see controller.service.create).
    externalService: api-nginx-ingress-controller

    ## Specifies the name of the IngressLink resource, which exposes the Ingress Controller pods via a BIG-IP system.
    ## The IP of the BIG-IP system is used when reporting the status of Ingress, VirtualServer and VirtualServerRoute resources.
    ## controller.reportIngressStatus.enable must be set to true.
    ingressLink: ""

    ## Enable Leader election to avoid multiple replicas of the controller reporting the status of Ingress resources. controller.reportIngressStatus.enable must be set to true.
    enableLeaderElection: true

    ## Specifies the name to be used as the lock for leader election. controller.reportIngressStatus.enableLeaderElection must be set to true.
    leaderElectionLockName: "api-ingress-leader"

    ## The annotations of the leader election configmap.
    annotations: {}

  pod:
    ## The annotations of the Ingress Controller pod.
    annotations: {}

    ## The additional extra labels of the Ingress Controller pod.
    extraLabels: {}

  ## The PriorityClass of the Ingress Controller pods.
  # priorityClassName: ""

  readyStatus:
    ## Enables readiness endpoint "/nginx-ready". The endpoint returns a success code when NGINX has loaded all the config after startup.
    enable: true

    ## Set the port where the readiness endpoint is exposed.
    port: 8081

    ## The number of seconds after the Ingress Controller pod has started before readiness probes are initiated.
    initialDelaySeconds: 0

  ## Enable collection of latency metrics for upstreams. Requires prometheus.create.
  enableLatencyMetrics: false

  ## Disable IPV6 listeners explicitly for nodes that do not support the IPV6 stack.
  disableIPV6: false

  ## Sets the port for the HTTP `default_server` listener.
  defaultHTTPListenerPort: 80

  ## Sets the port for the HTTPS `default_server` listener.
  defaultHTTPSListenerPort: 443

  ## Configure root filesystem as read-only and add volumes for temporary data.
  readOnlyRootFilesystem: false

  ## Enable dynamic reloading of certificates
  enableSSLDynamicReload: true

  ## Enable telemetry reporting
  enableTelemetryReporting: true

rbac:
  ## Configures RBAC.
  create: true

prometheus:
  ## Expose NGINX or NGINX Plus metrics in the Prometheus format.
  create: true

  ## Configures the port to scrape the metrics.
  port: 9113

  ## Specifies the namespace/name of a Kubernetes TLS Secret which will be used to protect the Prometheus endpoint.
  secret: ""

  ## Configures the HTTP scheme used.
  scheme: http

  service:
    ## Creates a ClusterIP Service to expose Prometheus metrics internally
    ## Requires prometheus.create=true
    create: false

    labels:
      service: "nginx-ingress-prometheus-service"

  serviceMonitor:
    ## Creates a serviceMonitor to expose statistics on the kubernetes pods.
    create: false

    ## Kubernetes object labels to attach to the serviceMonitor object.
    labels: {}

    ## A set of labels to allow the selection of endpoints for the ServiceMonitor.
    selectorMatchLabels:
      service: "nginx-ingress-prometheus-service"

    ## A list of endpoints allowed as part of this ServiceMonitor.
    ## Matches on the name of a Service port.
    endpoints:
      - port: prometheus

serviceInsight:
  ## Expose NGINX Plus Service Insight endpoint.
  create: false

  ## Configures the port to expose endpoint.
  port: 9114

  ## Specifies the namespace/name of a Kubernetes TLS Secret which will be used to protect the Service Insight endpoint.
  secret: ""

  ## Configures the HTTP scheme used.
  scheme: http

nginxServiceMesh:
  ## Enables integration with NGINX Service Mesh.
  enable: false

  ## Enables NGINX Service Mesh workload to route egress traffic through the Ingress Controller.
  ## Requires nginxServiceMesh.enable
  enableEgress: false

@j1m-ryan

I have isolated this panic to the enableExternalDNS: true Helm chart value. It also happens on v3.5.0-snapshot.

@shaun-nx shaun-nx added the backlog Pull requests/issues that are backlog items label Feb 26, 2024
@shaun-nx shaun-nx self-assigned this Feb 26, 2024

shaun-nx commented Feb 27, 2024

@tstraley @j1m-ryan
I've been debugging this a bit more today.
I looked at how our leader election process exits, as it seems to stop just fine without causing this panic.

I found this line, which is called in the event of a panic:

defer utilruntime.HandleCrash()

Docs for HandleCrash
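For anyone following along: by default HandleCrash logs the panic value and then re-raises it, which is why the "Observed a panic" line appears before the process still dies. A rough plain-Go analog of that behavior, as I understand it (this is a sketch, not the actual apimachinery implementation; crashLog and stopInformer are illustrative names):

```go
package main

import "fmt"

// crashLog collects messages; a stand-in for klog in this sketch.
var crashLog []string

// handleCrash mimics the default behavior of
// k8s.io/apimachinery/pkg/util/runtime.HandleCrash: when the
// goroutine is panicking, record the panic value, then re-panic
// so the process still fails loudly.
func handleCrash() {
	if r := recover(); r != nil {
		crashLog = append(crashLog, fmt.Sprintf("Observed a panic: %v", r))
		panic(r) // re-raise, matching HandleCrash's default crash behavior
	}
}

// stopInformer closes the given stop channel; this panics when
// stopCh is nil, as in the bug above.
func stopInformer(stopCh chan struct{}) {
	defer handleCrash()
	close(stopCh)
}

func main() {
	defer func() {
		recover() // swallow the re-raised panic so the demo exits cleanly
		fmt.Println(crashLog[0])
	}()
	stopInformer(nil) // uninitialized stop channel
}
```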

If we add this line to the Run() function for ExternalDNS, we get this output when the panic happens:

E0227 15:30:37.284954       1 runtime.go:79] Observed a panic: "close of nil channel" (close of nil channel)
goroutine 1942 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic({0x1dd8a60?, 0x256a970})
	k8s.io/[email protected]/pkg/util/runtime/runtime.go:75 +0x85
k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0x90?})
	k8s.io/[email protected]/pkg/util/runtime/runtime.go:49 +0x6b
panic({0x1dd8a60?, 0x256a970?})
	runtime/panic.go:914 +0x21f
github.com/nginxinc/kubernetes-ingress/internal/externaldns.(*namespacedInformer).stop(...)
	github.com/nginxinc/kubernetes-ingress/internal/externaldns/controller.go:149
github.com/nginxinc/kubernetes-ingress/internal/externaldns.(*ExtDNSController).Run(0xc000788300, 0x0?)
	github.com/nginxinc/kubernetes-ingress/internal/externaldns/controller.go:139 +0x45e
created by github.com/nginxinc/kubernetes-ingress/internal/k8s.(*LoadBalancerController).Run in goroutine 1
	github.com/nginxinc/kubernetes-ingress/internal/k8s/controller.go:707 +0x489
panic: close of nil channel [recovered]
	panic: close of nil channel

It looks like for ExternalDNS, the stopCh for each namespaced informer isn't initialized.
I tested with enableCertManager=true and enableExternalDns=false and didn't encounter the panic.
If we look at the CertManager process, for example, the channel is created correctly:
https://github.com/nginxinc/kubernetes-ingress/blob/3b14d1d09e7cd07ea769fdaa968bab68b7b7319e/internal/certmanager/cm_controller.go#L111-L113
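The shape of the bug, and of the fix, can be sketched like this (the struct and constructors below are illustrative stand-ins for the code in internal/externaldns/controller.go, not the actual implementation):

```go
package main

import "fmt"

// namespacedInformer is an illustrative stand-in for the struct in
// internal/externaldns/controller.go; fields are simplified.
type namespacedInformer struct {
	stopCh chan struct{}
}

// Buggy construction: stopCh is left as its zero value (nil).
func newInformerBuggy() *namespacedInformer {
	return &namespacedInformer{}
}

// Fixed construction, mirroring what the CertManager controller does:
// the stop channel is created up front with make.
func newInformerFixed() *namespacedInformer {
	return &namespacedInformer{stopCh: make(chan struct{})}
}

// stop closes the stop channel, converting the panic raised on a
// nil channel into an error so both cases can be demonstrated.
func (nsi *namespacedInformer) stop() (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("%v", r)
		}
	}()
	close(nsi.stopCh)
	return nil
}

func main() {
	fmt.Println(newInformerBuggy().stop()) // close of nil channel
	fmt.Println(newInformerFixed().stop()) // <nil>
}
```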

I've opened a PR here for the fix: #5175

@shaun-nx shaun-nx linked a pull request Feb 27, 2024 that will close this issue