Troubleshoot EKS Auto Mode - Amazon EKS
Troubleshoot EKS Auto Mode

With EKS Auto Mode, Amazon assumes more responsibility for the EC2 instances in your Amazon account. EKS assumes responsibility for the container runtime on the nodes, the operating system on the nodes, and certain controllers. These controllers include a block storage controller, a load balancing controller, and a compute controller.

You must use Amazon and Kubernetes APIs to troubleshoot nodes. You can:

  • Use a Kubernetes NodeDiagnostic resource to retrieve node logs.

  • Use the Amazon EC2 CLI command get-console-output to retrieve console output from nodes.

Note

EKS Auto Mode uses EC2 managed instances. You cannot directly access EC2 managed instances, including by SSH.

If you have a problem with a controller, check:

  • Whether the resources associated with that controller are properly formatted and valid.

  • Whether the Amazon IAM and Kubernetes RBAC resources are properly configured for your cluster. For more information, see Learn about identity and access in EKS Auto Mode.

Node monitoring agent

EKS Auto Mode includes the Amazon EKS node monitoring agent. You can use this agent to view troubleshooting and debugging information about nodes. The node monitoring agent publishes Kubernetes events and node conditions. For more information, see Enable node auto repair and investigate node health issues.
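
As a minimal sketch of how to read what the agent publishes, the following shell function prints the conditions reported on a node and then the Kubernetes events that reference that node. The function name node_health is hypothetical and only used for illustration. The sketch assumes kubectl is already connected to your cluster.

```shell
# Hypothetical helper: show the conditions and events for one node.
# Usage: node_health <node-name>
node_health() {
  # Print each node condition as TYPE=STATUS followed by its reason.
  kubectl get node "$1" \
    -o jsonpath='{range .status.conditions[*]}{.type}{"="}{.status}{"\t"}{.reason}{"\n"}{end}'

  # List events whose involved object is this node.
  kubectl get events --all-namespaces \
    --field-selector "involvedObject.kind=Node,involvedObject.name=$1"
}
```

For example, node_health <node-name> prints one line per condition, followed by any recent events recorded for that node.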

Get console output from an EC2 managed instance by using the Amazon EC2 CLI

This procedure helps with troubleshooting boot-time or kernel-level issues.

First, you need to determine the EC2 Instance ID of the instance associated with your workload. Second, use the Amazon CLI to retrieve the console output.

  1. Confirm that kubectl is installed and connected to your cluster.

  2. (Optional) Use the name of a Kubernetes Deployment to list the associated pods.

    kubectl get pods -l app=<deployment-name>
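  3. Use the name of the Kubernetes Pod to determine the EC2 instance ID of the associated node. The NODE column of the output shows the node name, which for EKS Auto Mode nodes is the EC2 instance ID.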

    kubectl get pod <pod-name> -o wide
  4. Use the EC2 instance ID to retrieve the console output.

    aws ec2 get-console-output --instance-id <instance-id> --latest --output text
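
The steps above can be combined into one helper, sketched here under a few assumptions. The function names are hypothetical. Instead of reading the instance ID from the NODE column, this sketch reads the providerID field of the node, which has the form aws:///<availability-zone>/<instance-id>, and keeps the last path segment. It also assumes that both kubectl and the Amazon CLI are configured for the account and Region that host the cluster.

```shell
# Hypothetical helper: extract the instance ID from a providerID such as
# aws:///us-west-2a/i-0123456789abcdef0 by keeping the text after the last "/".
instance_id_from_provider_id() {
  printf '%s\n' "${1##*/}"
}

# Hypothetical helper: print the console output of the node that runs a pod.
# Usage: pod_console_output <pod-name> [namespace]
pod_console_output() {
  pco_node=$(kubectl get pod "$1" -n "${2:-default}" \
    -o jsonpath='{.spec.nodeName}') || return 1
  pco_provider=$(kubectl get node "$pco_node" \
    -o jsonpath='{.spec.providerID}') || return 1
  pco_instance=$(instance_id_from_provider_id "$pco_provider")

  aws ec2 get-console-output --instance-id "$pco_instance" --latest --output text
}
```

For example, pod_console_output <pod-name> <namespace> prints the latest console output for the instance that hosts the pod.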

Get node logs by using the kubectl CLI

For information about getting node logs, see Retrieve node logs for a managed node using kubectl and S3.

View resources associated with EKS Auto Mode in the Amazon Console

You can use the Amazon console to view the status of resources associated with your EKS Auto Mode cluster.

  • EBS Volumes

    • View EKS Auto Mode volumes by searching for the tag key eks:eks-cluster-name

  • Load Balancers

    • View EKS Auto Mode load balancers by searching for the tag key eks:eks-cluster-name

  • EC2 Instances

    • View EKS Auto Mode instances by searching for the tag key eks:eks-cluster-name
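
If you prefer the command line, the same tag key can be used as a filter with the Amazon CLI. The following is a sketch, not a replacement for the console views. The function name list_auto_mode_resources is hypothetical, and the sketch assumes your credentials can call the EC2 describe operations and the Resource Groups Tagging API in the cluster's Region. It matches on the tag key only, so it returns resources for every EKS Auto Mode cluster in that Region.

```shell
# Tag key that EKS Auto Mode applies to the resources it manages.
TAG_KEY="eks:eks-cluster-name"

# Hypothetical helper: list instances, volumes, and load balancers
# that carry the EKS Auto Mode tag key.
list_auto_mode_resources() {
  echo "# EC2 instances"
  aws ec2 describe-instances \
    --filters "Name=tag-key,Values=$TAG_KEY" \
    --query 'Reservations[].Instances[].[InstanceId,State.Name]' --output text

  echo "# EBS volumes"
  aws ec2 describe-volumes \
    --filters "Name=tag-key,Values=$TAG_KEY" \
    --query 'Volumes[].[VolumeId,State]' --output text

  echo "# Load balancers"
  aws resourcegroupstaggingapi get-resources \
    --tag-filters "Key=$TAG_KEY" \
    --resource-type-filters elasticloadbalancing:loadbalancer \
    --query 'ResourceTagMappingList[].ResourceARN' --output text
}
```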

View IAM Errors in your Amazon account

  1. Navigate to the CloudTrail console.

  2. Select "Event History" from the left navigation pane.

  3. Apply error code filters:

    • AccessDenied

    • UnauthorizedOperation

    • InvalidClientTokenId

Look for errors related to your EKS cluster. Use the error messages to update your EKS access entries, Cluster IAM Role, or Node IAM Role. You may need to attach a new policy to these roles with permissions for EKS Auto Mode.
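
A rough command-line equivalent of the console filter is sketched below. The lookup-events command has no lookup attribute for error codes, so this sketch filters on the client side by searching the raw event record for each code. That is a loose text match and an assumption of this example, not a documented filter. The function name find_cloudtrail_errors and the limit of 200 events per code are arbitrary choices for illustration.

```shell
# Hypothetical helper: search recent CloudTrail events in the current Region
# for the error codes listed in the procedure above.
find_cloudtrail_errors() {
  for code in AccessDenied UnauthorizedOperation InvalidClientTokenId; do
    echo "== $code =="
    # CloudTrailEvent holds the raw event as a JSON string, so contains()
    # is a plain substring match on that string.
    aws cloudtrail lookup-events \
      --max-items 200 \
      --query "Events[?contains(CloudTrailEvent, '$code')].[EventTime,EventSource,EventName,Username]" \
      --output text
  done
}
```

Each matching event is printed as one line with its time, source service, API name, and caller, which is usually enough to decide which role or access entry to inspect.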

Pod failing to schedule onto Auto Mode node

If pods are not being scheduled onto an EKS Auto Mode node, check whether your Pod or Deployment manifest has a nodeSelector. If a nodeSelector is present, make sure it uses eks.amazonaws.com/compute-type: auto so that the pods can be scheduled onto EKS Auto Mode nodes. See Control if a workload is deployed on EKS Auto Mode nodes.
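
One way to run this check from the command line is sketched below. The function name check_auto_mode_selector is hypothetical. It prints the nodeSelector of a pod, then lists the nodes that carry the eks.amazonaws.com/compute-type=auto label so that you can compare the two.

```shell
# Hypothetical helper: compare a pod's nodeSelector with the Auto Mode nodes.
# Usage: check_auto_mode_selector <pod-name> [namespace]
check_auto_mode_selector() {
  # Print the pod's nodeSelector (empty output means none is set).
  kubectl get pod "$1" -n "${2:-default}" \
    -o jsonpath='{.spec.nodeSelector}{"\n"}'

  # List the nodes that a selector of compute-type: auto would match.
  kubectl get nodes -l eks.amazonaws.com/compute-type=auto
}
```

If the first command prints a selector with a different key or value, or the second command returns no nodes, the selector is a likely cause of the scheduling failure.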

Node not joining cluster

Run kubectl get nodeclaim to check for NodeClaims whose Ready status is False.

Then run kubectl describe nodeclaim <node_claim> and look under Status for any issues preventing the node from joining the cluster.

Common error messages:

  • "Error getting launch template configs"

  • "Error creating fleet"

    • There may be an authorization issue with calling the EC2 RunInstances API. Check CloudTrail for errors and see Amazon EKS Auto Mode cluster IAM role for the required IAM permissions.
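
The NodeClaim check in this section can be scripted, as in the following sketch. The function names are hypothetical. The first function lists every NodeClaim whose Ready condition is anything other than True, which includes False, Unknown, and a missing condition. The second function describes each of those NodeClaims so that you can read the Status section.

```shell
# Hypothetical helper: print the names of NodeClaims that are not Ready.
not_ready_nodeclaims() {
  # Emit "<name><TAB><Ready status>" per NodeClaim, then keep the names
  # whose status is not exactly "True".
  kubectl get nodeclaims \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.conditions[?(@.type=="Ready")].status}{"\n"}{end}' |
    awk -F '\t' '$2 != "True" { print $1 }'
}

# Hypothetical helper: describe each NodeClaim that is not Ready.
describe_not_ready_nodeclaims() {
  for name in $(not_ready_nodeclaims); do
    kubectl describe nodeclaim "$name"
  done
}
```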