Daniel's Blog

Upgrading Nginx Ingress Controller on GKE

Introduction

The application runs on GKE, but instead of the default GKE ingress controller it uses the NGINX ingress controller. This is because we need custom settings that raise the maximum header size as well as the body size for uploads. These are not available with the standard ingress, so the NGINX one had to be used.

It was well communicated that the current version of the ingress was blocking the cluster upgrade, and the plan was to take some planned downtime one Saturday evening. Then the cluster was accidentally upgraded outside the schedule. A call was received in the middle of the morning because the ingress was failing: it was no longer compatible with the new cluster version.

I had already updated our Ingress resources as specified by GKE's notification, so hopefully the only thing left was to upgrade the ingress controller chart and then everything would be fine.

Check the current helm chart

$ kubens ingress-nginx

$ helm ls --all
NAME            NAMESPACE       REVISION        UPDATED                                 STATUS          CHART                   APP VERSION
ingress-nginx   ingress-nginx   1               2021-08-06 22:32:34.231405256 +0000 UTC deployed        ingress-nginx-3.35.0    0.48.1

We are on chart version 3.35.0.
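If you want that version in a script rather than reading it off the table, here is a minimal sketch. The function name `release_versions` is made up for this example, and it assumes the default `helm ls` column layout shown above, where CHART is the second-to-last field and APP VERSION is the last one.

```shell
# Print "<chart version> <app version>" for one release, reading
# `helm ls` table output on stdin.
# Assumption: the chart version has no hyphen in it (a pre-release
# such as 1.0.0-rc1 would be cut short by the sub() below).
release_versions() {
  awk -v name="$1" '
    $1 == name {
      chart = $(NF - 1)          # e.g. ingress-nginx-3.35.0
      sub(/^.*-/, "", chart)     # keep what follows the last hyphen
      print chart, $NF
    }'
}

# Usage against a live cluster (not run here):
#   helm ls --namespace ingress-nginx | release_versions ingress-nginx
```

Parsing the table is fragile by nature. `helm ls -o json` gives structured output if a JSON tool is available.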

Update the helm repo to get all versions

$ helm repo update
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "ingress-nginx" chart repository
Update Complete. ⎈Happy Helming!⎈

That's good.

Check available versions

$ helm search repo ingress-nginx --versions
NAME                            CHART VERSION   APP VERSION     DESCRIPTION
ingress-nginx/ingress-nginx     4.6.1           1.7.1           Ingress controller for Kubernetes using NGINX a...
ingress-nginx/ingress-nginx     4.6.0           1.7.0           Ingress controller for Kubernetes using NGINX a...
ingress-nginx/ingress-nginx     4.5.2           1.6.4           Ingress controller for Kubernetes using NGINX a...
ingress-nginx/ingress-nginx     4.5.0           1.6.3           Ingress controller for Kubernetes using NGINX a...
ingress-nginx/ingress-nginx     4.4.2           1.5.1           Ingress controller for Kubernetes using NGINX a...
ingress-nginx/ingress-nginx     4.4.1           1.5.2           Ingress controller for Kubernetes using NGINX a...
ingress-nginx/ingress-nginx     4.4.0           1.5.1           Ingress controller for Kubernetes using NGINX a...
ingress-nginx/ingress-nginx     4.3.0           1.4.0           Ingress controller for Kubernetes using NGINX a...
ingress-nginx/ingress-nginx     4.2.5           1.3.1           Ingress controller for Kubernetes using NGINX a...
ingress-nginx/ingress-nginx     4.2.4           1.3.1           Ingress controller for Kubernetes using NGINX a...
ingress-nginx/ingress-nginx     4.2.3           1.3.0           Ingress controller for Kubernetes using NGINX a...
ingress-nginx/ingress-nginx     4.2.2           1.3.0           Ingress controller for Kubernetes using NGINX a...
ingress-nginx/ingress-nginx     4.2.1           1.3.0           Ingress controller for Kubernetes using NGINX a...
ingress-nginx/ingress-nginx     4.2.0           1.3.0           Ingress controller for Kubernetes using NGINX a...
ingress-nginx/ingress-nginx     4.1.4           1.2.1           Ingress controller for Kubernetes using NGINX a...
ingress-nginx/ingress-nginx     4.1.3           1.2.1           Ingress controller for Kubernetes using NGINX a...
ingress-nginx/ingress-nginx     4.1.2           1.2.0           Ingress controller for Kubernetes using NGINX a...
ingress-nginx/ingress-nginx     4.1.1           1.2.0           Ingress controller for Kubernetes using NGINX a...
ingress-nginx/ingress-nginx     4.1.0           1.2.0           Ingress controller for Kubernetes using NGINX a...
ingress-nginx/ingress-nginx     4.0.19          1.1.3           Ingress controller for Kubernetes using NGINX a...
ingress-nginx/ingress-nginx     4.0.18          1.1.2           Ingress controller for Kubernetes using NGINX a...
ingress-nginx/ingress-nginx     4.0.17          1.1.1           Ingress controller for Kubernetes using NGINX a...
ingress-nginx/ingress-nginx     4.0.16          1.1.1           Ingress controller for Kubernetes using NGINX a...
ingress-nginx/ingress-nginx     4.0.15          1.1.1           Ingress controller for Kubernetes using NGINX a...
ingress-nginx/ingress-nginx     4.0.13          1.1.0           Ingress controller for Kubernetes using NGINX a...
ingress-nginx/ingress-nginx     4.0.12          1.1.0           Ingress controller for Kubernetes using NGINX a...
ingress-nginx/ingress-nginx     4.0.11          1.1.0           Ingress controller for Kubernetes using NGINX a...
ingress-nginx/ingress-nginx     4.0.10          1.1.0           Ingress controller for Kubernetes using NGINX a...
ingress-nginx/ingress-nginx     4.0.9           1.0.5           Ingress controller for Kubernetes using NGINX a...
ingress-nginx/ingress-nginx     4.0.8           1.0.5           Ingress controller for Kubernetes using NGINX a...
ingress-nginx/ingress-nginx     4.0.6           1.0.4           Ingress controller for Kubernetes using NGINX a...
ingress-nginx/ingress-nginx     4.0.5           1.0.3           Ingress controller for Kubernetes using NGINX a...
ingress-nginx/ingress-nginx     4.0.3           1.0.2           Ingress controller for Kubernetes using NGINX a...
ingress-nginx/ingress-nginx     4.0.2           1.0.1           Ingress controller for Kubernetes using NGINX a...
ingress-nginx/ingress-nginx     4.0.1           1.0.0           Ingress controller for Kubernetes using NGINX a...
ingress-nginx/ingress-nginx     3.41.0          0.51.0          Ingress controller for Kubernetes using NGINX a...
ingress-nginx/ingress-nginx     3.40.0          0.50.0          Ingress controller for Kubernetes using NGINX a...
ingress-nginx/ingress-nginx     3.39.0          0.49.3          Ingress controller for Kubernetes using NGINX a...
ingress-nginx/ingress-nginx     3.38.0          0.49.2          Ingress controller for Kubernetes using NGINX a...
ingress-nginx/ingress-nginx     3.37.0          0.49.1          Ingress controller for Kubernetes using NGINX a...
ingress-nginx/ingress-nginx     3.36.0          0.49.0          Ingress controller for Kubernetes using NGINX a...
ingress-nginx/ingress-nginx     3.35.0          0.48.1          Ingress controller for Kubernetes using NGINX a...
ingress-nginx/ingress-nginx     3.34.0          0.47.0          Ingress controller for Kubernetes using NGINX a...
ingress-nginx/ingress-nginx     3.33.0          0.47.0          Ingress controller for Kubernetes using NGINX a...
... trimmed for brevity

We are at version 3.35.0 and we want to be at version 4.6.1.
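Picking the newest chart version out of that long list can be scripted too. This is only a sketch: `latest_chart_version` is a hypothetical helper, and it leans on `sort -V` (version sort), which is a GNU coreutils extension and may not exist everywhere.

```shell
# Print the highest CHART VERSION for a chart, reading
# `helm search repo <chart> --versions` table output on stdin.
# The first column must match the full chart name exactly.
latest_chart_version() {
  awk -v chart="$1" '$1 == chart { print $2 }' | sort -V | tail -n 1
}

# Usage against a live repo (not run here):
#   helm search repo ingress-nginx --versions \
#     | latest_chart_version ingress-nginx/ingress-nginx
```

A plain lexical sort would put 4.0.9 after 4.0.19, which is why version sort is used here.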

Upgrade the helm chart to latest

$ helm upgrade ingress-nginx ingress-nginx/ingress-nginx
Release "ingress-nginx" has been upgraded. Happy Helming!
NAME: ingress-nginx
LAST DEPLOYED: Sat May 20 08:38:01 2023
NAMESPACE: ingress-nginx
STATUS: deployed
REVISION: 2
TEST SUITE: None
NOTES:
The ingress-nginx controller has been installed.
It may take a few minutes for the LoadBalancer IP to be available.
You can watch the status by running 'kubectl --namespace ingress-nginx get services -o wide -w ingress-nginx-controller'

An example Ingress that makes use of the controller:
  apiVersion: networking.k8s.io/v1
  kind: Ingress
  metadata:
    name: example
    namespace: foo
  spec:
    ingressClassName: nginx
    rules:
      - host: www.example.com
        http:
          paths:
            - pathType: Prefix
              backend:
                service:
                  name: exampleService
                  port:
                    number: 80
              path: /
    # This section is only required if TLS is to be enabled for the Ingress
    tls:
      - hosts:
        - www.example.com
        secretName: example-tls

If TLS is enabled for the Ingress, a Secret containing the certificate and key must also be provided:

  apiVersion: v1
  kind: Secret
  metadata:
    name: example-tls
    namespace: foo
  data:
    tls.crt: <base64 encoded cert>
    tls.key: <base64 encoded key>
  type: kubernetes.io/tls

That looks good. Our ingresses were already updated.
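The upgrade above takes whatever chart version is newest in the repo at that moment. A more repeatable variant pins the version. The sketch below is not what was run here: `pinned_upgrade` and the `DRY_RUN` switch are invented for the example, and `--atomic` is the Helm 3 flag that rolls the release back when the upgrade itself fails. It would not help if the client loses its connection mid-upgrade, because the rollback is driven by the same client.

```shell
# Build a pinned upgrade command. With DRY_RUN=1 (the default here) the
# command is only printed, so nothing touches the cluster.
pinned_upgrade() {
  release="$1"; chart="$2"; version="$3"
  set -- helm upgrade "$release" "$chart" \
    --namespace ingress-nginx \
    --version "$version" \
    --atomic --timeout 10m
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Prints the command that would be run for chart version 4.6.1.
pinned_upgrade ingress-nginx ingress-nginx/ingress-nginx 4.6.1
```

Set `DRY_RUN=0` to actually execute the command once the printed line looks right.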

Recheck current version after upgrade

$ helm ls

NAME            NAMESPACE       REVISION        UPDATED                                 STATUS          CHART                   APP VERSION
ingress-nginx   ingress-nginx   2               2023-05-20 08:38:01.9136279 -0700 PDT   deployed        ingress-nginx-4.6.1     1.7.1

AAR (After Action Review)

The release went from chart version 3.35.0 to version 4.6.1. A few issues came up along the way; they are covered under Troubleshooting.

Troubleshooting

Failed upgrade due to network failure

During the initial upgrade, there was a failure in network connectivity, resulting in the upgrade being stuck:

$ helm upgrade ingress-nginx ingress-nginx/ingress-nginx

(network connection lost here)

On retry, an UPGRADE FAILED was encountered:

$ helm upgrade ingress-nginx ingress-nginx/ingress-nginx
Error: UPGRADE FAILED: another operation (install/upgrade/rollback) is in progress

Checking the current release shows a pending-upgrade status:

$ helm ls --all
NAME            NAMESPACE       REVISION        UPDATED                                 STATUS          CHART                   APP VERSION
ingress-nginx   ingress-nginx   2               2023-05-20 08:06:02.3069687 -0700 PDT   pending-upgrade ingress-nginx-4.6.1     1.7.1
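A stuck release like this can be spotted with a small filter. This is a sketch under assumptions: `pending_releases` is a hypothetical name, and it relies on the three pending states Helm 3 uses (pending-install, pending-upgrade, pending-rollback) appearing as a single field in the `helm ls --all` table.

```shell
# List releases stuck in a pending-* state, reading `helm ls --all`
# table output on stdin. Prints "<name> <revision> <status>".
# The status column position varies because UPDATED spans several
# fields, so every field from the fourth onward is checked.
pending_releases() {
  awk 'NR > 1 {
    for (i = 4; i <= NF; i++)
      if ($i ~ /^pending-(install|upgrade|rollback)$/)
        print $1, $3, $i
  }'
}

# Usage against a live cluster (not run here):
#   helm ls --all --namespace ingress-nginx | pending_releases
```

`helm history ingress-nginx` is another way to see the status of every revision of the release.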

Try 1: Check for failures to see if something is stopping the upgrade, like an unresponsive pod

What's going on? My first idea was that pods were failing to terminate, which kept the new ones from getting deployed. Checking the pods showed controller pods from two replicasets, and neither of them was ready:

$ kubectl get pods
NAME                                       READY   STATUS             RESTARTS          AGE
ingress-nginx-admission-patch-2gjvk        0/1     CrashLoopBackOff   5 (113s ago)      4m58s
ingress-nginx-controller-b65df6fbb-x7d98   0/1     Running            164 (5m31s ago)   9h
ingress-nginx-controller-b74d477bf-s4p9t   0/1     Running            4 (21s ago)       5m3s

Checking the logs of the failing pods showed that both had errors from the current (old) controller version, so neither of them is even running the new version. How do we stop these so the upgrade can continue? The first idea was just to delete the pods/replicasets/deployments and bring it down.

$ kubectl get replicasets
NAME                                 DESIRED   CURRENT   READY   AGE
ingress-nginx-controller-b65df6fbb   1         1         0       651d
ingress-nginx-controller-b74d477bf   1         1         0       5m55s
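With only two replicasets the problem is easy to see by eye. For longer listings, a filter like the following could pick out the unhealthy ones. It is a sketch: `unready_replicasets` is a made-up name, and it assumes the default `kubectl get replicasets` columns (NAME, DESIRED, CURRENT, READY, AGE).

```shell
# Print replicasets whose READY count is below DESIRED, reading
# `kubectl get replicasets` table output on stdin.
# Field 2 is DESIRED and field 4 is READY in the default layout.
unready_replicasets() {
  awk 'NR > 1 && $4 < $2 { print $1 }'
}

# Usage against a live cluster (not run here):
#   kubectl get replicasets --namespace ingress-nginx | unready_replicasets
```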

I deleted both replicasets and scaled the deployment to 0 replicas to stop the failures, but that did not let the upgrade finish.

$ kubectl delete replicaset ingress-nginx-controller-b65df6fbb
replicaset.apps "ingress-nginx-controller-b65df6fbb" deleted
$ kubectl delete replicaset ingress-nginx-controller-b74d477bf
replicaset.apps "ingress-nginx-controller-b74d477bf" deleted
$ kubectl edit deployment ingress-nginx-controller
(changed replicas to 0 in the editor)

Even with no pods and no replicasets, the upgrade wouldn't continue.

Try 2: Edit the release secret to mark it as deployed, then deploy again

$ kubectl get secrets
NAME                                  TYPE                                  DATA   AGE
sh.helm.release.v1.ingress-nginx.v1   helm.sh/release.v1                    1      651d
sh.helm.release.v1.ingress-nginx.v2   helm.sh/release.v1                    1      11m

$ kubectl edit secret sh.helm.release.v1.ingress-nginx.v2

(edited the v2 release secret to mark the release as deployed)

Result: It didn't work. The chart was still pending-upgrade.
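Part of what makes this secret awkward to edit by hand is its encoding. Helm 3 stores the release as gzip-compressed JSON that is base64-encoded, and Kubernetes base64-encodes the secret data once more on top. Here is a minimal sketch for reading the status back out. The function names are invented for the example, and `release_status` assumes the first `"status"` key in the JSON is the release's own status; `jq '.info.status'` would be more robust if jq is installed.

```shell
# Decode a Helm 3 release payload read from stdin. The payload is the
# value of .data.release in the release secret.
decode_release() {
  base64 -d | base64 -d | gunzip -c
}

# Print the release status from the decoded JSON without needing jq.
# Assumption: the first "status" key belongs to the release info.
release_status() {
  decode_release | grep -o '"status":"[^"]*"' | head -n 1 | cut -d'"' -f4
}

# Usage against a live cluster (not run here):
#   kubectl get secret sh.helm.release.v1.ingress-nginx.v2 \
#     -o jsonpath='{.data.release}' | release_status
```

This only inspects the secret. It does not explain why the manual edit above had no effect.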

Try 3: Delete the v2 release secret

$ kubectl delete secrets sh.helm.release.v1.ingress-nginx.v2
secret "sh.helm.release.v1.ingress-nginx.v2" deleted

$ helm ls
NAME            NAMESPACE       REVISION        UPDATED                                 STATUS          CHART                   APP VERSION
ingress-nginx   ingress-nginx   1               2021-08-06 22:32:34.231405256 +0000 UTC deployed        ingress-nginx-3.35.0    0.48.1

Worked!! With the v2 release secret gone, helm reports revision 1 as the deployed release again.
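The secret names follow the pattern visible in the listing above: `sh.helm.release.v1.<release>.v<revision>`. A small sketch for working with that pattern, using hypothetical helper names:

```shell
# Build the release secret name for a release and revision.
release_secret_name() {
  printf 'sh.helm.release.v1.%s.v%s\n' "$1" "$2"
}

# Given secret names on stdin (first column), print the highest
# revision number recorded for a release.
latest_revision() {
  prefix="sh.helm.release.v1.$1.v"
  awk -v p="$prefix" 'index($1, p) == 1 { print substr($1, length(p) + 1) }' \
    | sort -n | tail -n 1
}

# Usage against a live cluster (not run here):
#   kubectl get secrets --namespace ingress-nginx | latest_revision ingress-nginx
#   kubectl get secret "$(release_secret_name ingress-nginx 2)"
```

Deleting a release secret removes that revision from helm's history, so it is worth double-checking the name before running the delete.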

Now try to upgrade again!

Upgrade failed due to service account

$ helm upgrade --reuse-values ingress-nginx ingress-nginx/ingress-nginx
Error: UPGRADE FAILED: template: ingress-nginx/templates/admission-webhooks/job-patch/serviceaccount.yaml:1:119: executing "ingress-nginx/templates/admission-webhooks/job-patch/serviceaccount.yaml" at <.Values.controller.admissionWebhooks.certManager.enabled>: nil pointer evaluating interface {}.enabled

Try 1: Check service account values

Check the service account and see if anything is wrong.

$ kubectl get sa ingress-nginx-admission -o yaml

Result: nothing looks wrong, it matches the other service accounts that have no issues.

Try 2: Delete the errant service account

Delete it and let the upgrade regenerate it when helm applies the YAML.

$ kubectl get sa
NAME                      SECRETS   AGE
default                   1         651d
ingress-nginx             1         651d
ingress-nginx-admission   0         30m

The ingress-nginx-admission service account, only 30 minutes old, was left behind by the earlier failed upgrade.

$ kubectl delete sa ingress-nginx-admission
serviceaccount "ingress-nginx-admission" deleted

Result:

Worked great. The next upgrade succeeded.