App Mesh Kubernetes troubleshooting
Important
End of support notice: On September 30, 2026, Amazon will discontinue support for Amazon App Mesh. After September 30, 2026, you will no longer be able to access the Amazon App Mesh console or Amazon App Mesh resources. For more information, see the blog post Migrating from Amazon App Mesh to Amazon ECS Service Connect.
This topic details common issues that you may experience when you use App Mesh with Kubernetes.
App Mesh resources created in Kubernetes cannot be found in App Mesh
Symptoms
You have created the App Mesh resources using the Kubernetes custom resource definition (CRD), but the resources that you created are not visible in App Mesh when you use the Amazon Web Services Management Console or APIs.
Resolution
The likely cause is an error in the Kubernetes controller for App Mesh. For more information, see Troubleshooting. You can inspect the controller logs for errors with the following command.

kubectl logs -n appmesh-system -f \
    $(kubectl get pods -n appmesh-system -o name | grep controller)
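If the controller logs don't show an obvious error, you can also confirm that the custom resources exist in the cluster and check their status and events. The following is a minimal sketch, assuming that the App Mesh controller CRDs are installed; my-virtual-node and my-app are placeholder names.

kubectl get meshes
kubectl get virtualnodes,virtualservices -n my-app
kubectl describe virtualnode my-virtual-node -n my-app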
If your issue is still not resolved, then consider opening a GitHub issue.
Pods are failing readiness and liveness checks after Envoy sidecar is injected
Symptoms
Pods for your application were previously running successfully, but after the Envoy sidecar is injected into a pod, readiness and liveness checks begin failing.
Resolution
Make sure that the Envoy container that was injected into the pod has bootstrapped with App Mesh’s Envoy management service. You can verify any errors by referencing the error codes in Envoy disconnected from App Mesh Envoy management service with error text. You can use the following command to inspect Envoy logs for the relevant pod.
kubectl logs pod-name -n namespace -c envoy \
    | grep "gRPC config stream closed"
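You can also check whether the injected proxy has finished bootstrapping by querying its admin endpoint. The following is a minimal sketch; it assumes the sidecar container is named envoy, that the Envoy admin port is the default 9901, and that curl is available in the Envoy image.

kubectl exec pod-name -n namespace -c envoy -- \
    curl -s http://localhost:9901/server_info | grep state

A state of LIVE indicates that Envoy has bootstrapped successfully, while PRE_INITIALIZING or INITIALIZING indicates that it has not finished initializing.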
If your issue is still not resolved, then consider opening a GitHub issue.
Pods not registering or deregistering as Amazon Cloud Map instances
Symptoms
Your Kubernetes pods are not being registered in or deregistered from Amazon Cloud Map as part of their life cycle. A pod may start successfully and be ready to serve traffic, but never receive any. When a pod is terminated, clients may still hold its IP address and attempt to send traffic to it, and those requests fail.
Resolution
This is a known issue. For more information, see the Pods don't get auto registered/deregistered in Kubernetes with Amazon Cloud Map issue on GitHub.
To mitigate this issue:
- Make sure that you are running the latest version of the App Mesh controller for Kubernetes.
- Make sure that the Amazon Cloud Map namespaceName and serviceName are correct in your virtual node definition (see the sketch after this list).
- Make sure that you delete any associated pods prior to deleting your virtual node definition. If you need help identifying which pods are associated with a virtual node, see Cannot determine where a pod for an App Mesh resource is running.
- If your issue persists, run the following command to inspect your controller logs for errors that may help reveal the underlying issue.
  kubectl logs -n appmesh-system \
      $(kubectl get pods -n appmesh-system -o name | grep appmesh-controller)
- Consider using the following command to restart your controller pods. This may fix synchronization issues.
  kubectl delete -n appmesh-system \
      $(kubectl get pods -n appmesh-system -o name | grep appmesh-controller)
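For example, you can cross-check the Cloud Map settings on a virtual node against what is actually registered in Cloud Map. The following is a minimal sketch, assuming the v1beta2 VirtualNode schema, where the Cloud Map settings live under spec.serviceDiscovery.awsCloudMap; my-virtual-node, my-app, my-cloudmap-namespace, and my-service are placeholder names.

kubectl get virtualnode my-virtual-node -n my-app \
    -o jsonpath='{.spec.serviceDiscovery.awsCloudMap}'
aws servicediscovery discover-instances \
    --namespace-name my-cloudmap-namespace --service-name my-service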
If your issue is still not resolved, then consider opening a GitHub issue.
Cannot determine where a pod for an App Mesh resource is running
Symptoms
When you run App Mesh on a Kubernetes cluster, an operator cannot determine where a workload, or pod, is running for a given App Mesh resource.
Resolution
Kubernetes pod resources are annotated with the mesh and virtual node that they are associated with. You can query which pods are running for a given virtual node name with the following command.

kubectl get pods --all-namespaces -o json | \
    jq '.items[] | { metadata } | select(.metadata.annotations."appmesh.k8s.aws/virtualNode" == "virtual-node-name")'
If your issue is still not resolved, then consider opening a GitHub issue.
Cannot determine what App Mesh resource a pod is running as
Symptoms
When running App Mesh on a Kubernetes cluster, an operator cannot determine what App Mesh resource a given pod is running as.
Resolution
Kubernetes pod resources are annotated with the mesh and virtual node that they are associated with. You can output the mesh and virtual node names by querying the pod directly using the following command.

kubectl get pod pod-name -n namespace -o json | \
    jq '{ "mesh": .metadata.annotations."appmesh.k8s.aws/mesh", "virtualNode": .metadata.annotations."appmesh.k8s.aws/virtualNode" }'
If your issue is still not resolved, then consider opening a GitHub issue.
Client Envoys are not able to communicate with App Mesh Envoy Management Service with IMDSv1 disabled
Symptoms
When IMDSv1 is disabled, client Envoys aren't able to communicate with the App Mesh control plane (Envoy Management Service). IMDSv2 support is not available on App Mesh Envoy versions earlier than v1.24.0.0-prod.
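To confirm which Envoy version a pod is running, you can read the image tag of the injected sidecar. The following is a minimal sketch; pod-name and namespace are placeholders, and it assumes the injected sidecar container is named envoy.

kubectl get pod pod-name -n namespace \
    -o jsonpath='{.spec.containers[?(@.name=="envoy")].image}'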
Resolution
To resolve this issue, you can do one of the following three things.
- Upgrade to App Mesh Envoy version v1.24.0.0-prod or later, which has IMDSv2 support.
- Re-enable IMDSv1 on the instance where Envoy is running (see the sketch after this list). For instructions on restoring IMDSv1, see Configure the instance metadata options.
- If your services are running on Amazon EKS, it is recommended to use IAM roles for service accounts (IRSA) for fetching credentials. For instructions to enable IRSA, see IAM roles for service accounts.
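For example, IMDSv1 can be re-enabled on an EC2 instance by setting the instance metadata token requirement back to optional. The following is a minimal sketch; the instance ID is a placeholder.

aws ec2 modify-instance-metadata-options \
    --instance-id i-1234567890abcdef0 \
    --http-tokens optional --http-endpoint enabled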
If your issue is still not resolved, then consider opening a GitHub issue.
IRSA does not work on application container when App Mesh is enabled and Envoy is injected
Symptoms
When App Mesh is enabled on an Amazon EKS cluster with the help of the App Mesh controller for Amazon EKS, the Envoy and proxyinit containers are injected into the application pod. The application is not able to assume the IRSA role and instead assumes the node role. When you describe the pod details, you see that the AWS_WEB_IDENTITY_TOKEN_FILE or AWS_ROLE_ARN environment variable is not included in the application container.
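You can check whether these variables were injected with a command like the following. This is a minimal sketch; pod-name and namespace are placeholders, and the container name app is a stand-in for your application container's name.

kubectl get pod pod-name -n namespace \
    -o jsonpath='{.spec.containers[?(@.name=="app")].env}'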
Resolution
If either the AWS_WEB_IDENTITY_TOKEN_FILE or AWS_ROLE_ARN environment variable is already defined, then the webhook skips the pod. Don't provide either of these variables, and the webhook will take care of injecting them for you.
reservedKeys := map[string]string{
    "AWS_ROLE_ARN":                "",
    "AWS_WEB_IDENTITY_TOKEN_FILE": "",
}
...
for _, env := range container.Env {
    if _, ok := reservedKeys[env.Name]; ok {
        reservedKeysDefined = true
    }
}
If your issue is still not resolved, then consider opening a GitHub issue.