Public bug reported:

A customer running a POC to test EKS in the me-central-1 region reported that EC2 instances based on the Ubuntu EKS Optimized AMI fail to join the cluster when using managed node groups.
I've been able to reproduce and identify the issue with the following steps:

1. Created a new EKS cluster in me-central-1 with the following cluster configuration:

  ---
  apiVersion: eksctl.io/v1alpha5
  kind: ClusterConfig

  metadata:
    name: uae-poc-test
    region: me-central-1

  managedNodeGroups:
    - name: custom-ng-2
      minSize: 1
      maxSize: 4
      amiFamily: Ubuntu2004

2. The CloudFormation stack rolls back because the Ubuntu node is unable to join the cluster. It used this AMI: ami-06114b38b9273f7c2.

3. Looking at the cloud-init logs, we can see the following 403 error when pulling the pause container:

  Cloud-init v. 22.4.2-0ubuntu0~20.04.2 running 'modules:config' at Wed, 11 Jan 2023 14:43:31 +0000. Up 40.30 seconds.
  eksctl: running /etc/eks/bootstrap
  Aliasing EKS k8s snap commands
  Added:
    - kubelet-eks.kubelet as kubelet
  Added:
    - kubectl-eks.kubectl as kubectl
  Stopping k8s daemons until configured
  Stopped.
  Cluster "kubernetes" set.
  Container runtime is containerd
  Attempt 5 of 5
  ctr: failed to resolve reference "602401143452.dkr.ecr.me-central-1.amazonaws.com/eks/pause:3.5": pulling from host 602401143452.dkr.ecr.me-central-1.amazonaws.com failed with status code [manifests 3.5]:403 Forbidden

Based on the Amazon container image registries documentation (https://docs.aws.amazon.com/eks/latest/userguide/add-ons-images.html), it looks like the bootstrap script is using the wrong ECR registry account for this region (602401143452). Specifying the AMI used by the managed node group explicitly and overriding --pause-container-account in the bootstrap command, as in the configuration below, lets the node register as expected:

  ---
  apiVersion: eksctl.io/v1alpha5
  kind: ClusterConfig

  metadata:
    name: uae-poc-test
    region: me-central-1

  managedNodeGroups:
    - name: custom-ng-3
      ami: ami-06114b38b9273f7c2
      minSize: 1
      maxSize: 4
      overrideBootstrapCommand: |
        #!/bin/bash
        /etc/eks/bootstrap.sh <cluster> --pause-container-account 759879836304

** Affects: compiz-plugins-main (Ubuntu)
     Importance: Undecided
         Status: New

https://bugs.launchpad.net/bugs/2002659

Title:
  Ubuntu AMI (ami-06114b38b9273f7c2) failed to join cluster in UAE region due to 403 on pause container

Status in compiz-plugins-main package in Ubuntu:
  New
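
The registry mismatch described in this report can be illustrated with a small sketch. It only rebuilds the image reference from the failing log line and the one that would result from the --pause-container-account override. The function names and the override table are hypothetical, and the table contains just the two account IDs that appear in this report; it is an assumption for illustration, not the actual logic of /etc/eks/bootstrap.sh or a complete list of registry accounts.

```python
from typing import Optional

# Account seen in the failing pull in the cloud-init log above.
DEFAULT_PAUSE_ACCOUNT = "602401143452"

# Hypothetical per-region override table. The single entry is the account
# passed to --pause-container-account in the working configuration above.
REGION_PAUSE_ACCOUNT_OVERRIDES = {
    "me-central-1": "759879836304",
}


def pause_account_for(region: str) -> str:
    """Return the override account for a region, or the default one."""
    return REGION_PAUSE_ACCOUNT_OVERRIDES.get(region, DEFAULT_PAUSE_ACCOUNT)


def pause_image_uri(region: str, account: Optional[str] = None, tag: str = "3.5") -> str:
    """Build a pause image reference in the form shown in the log."""
    if account is None:
        account = pause_account_for(region)
    return f"{account}.dkr.ecr.{region}.amazonaws.com/eks/pause:{tag}"


if __name__ == "__main__":
    region = "me-central-1"
    # Reference the node tried to pull (403 Forbidden in the log).
    print("failing:", pause_image_uri(region, DEFAULT_PAUSE_ACCOUNT))
    # Reference implied by the --pause-container-account override.
    print("override:", pause_image_uri(region))
```

Comparing the two printed references makes it easy to see that only the account prefix differs, which matches the observation that the override alone was enough for the node to register.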