App Mesh security troubleshooting
Important
End of support notice: On September 30, 2026, Amazon will discontinue support for Amazon App Mesh. After September 30, 2026, you will no longer be able to access the Amazon App Mesh console or Amazon App Mesh resources. For more information, see the blog post Migrating from Amazon App Mesh to Amazon ECS Service Connect.
This topic details common issues that you may experience with App Mesh security.
Unable to connect to a backend virtual service with a TLS client policy
Symptoms
When adding a TLS client policy to a virtual service backend in a virtual node, connectivity to that backend fails. When attempting to send traffic to the backend service, the requests fail with an HTTP 503 response code and the following error message:
upstream connect error or disconnect/reset before headers. reset reason: connection failure
Resolution
To determine the root cause, we recommend using the Envoy proxy process logs. For more information, see Enable Envoy debug logging in pre-production environments. Use the following list to determine the cause of the connection failure:
-
Make sure connectivity to the backend is succeeding by ruling out the errors mentioned in Unable to connect to a virtual service backend.
-
In the Envoy process logs, look for the following errors (logged at debug level).
TLS error: 268435581:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED
This error is caused by one or more of the following reasons:
-
The certificate was not signed by one of the certificate authorities defined in the TLS client policy trust bundle.
-
The certificate is no longer valid (expired).
-
The Subject Alternative Name (SAN) does not match the requested DNS hostname.
-
Make sure that the certificate offered by the backend service is valid, that it is signed by one of the certificate authorities in your TLS client policy's trust bundle, and that it meets the criteria defined in Transport Layer Security (TLS).
-
If you receive an error like the following, the request is bypassing the Envoy proxy and reaching the application directly. When you send traffic, the stats on Envoy don't change, which indicates that Envoy isn't on the path to decrypt the traffic. In the proxy configuration of the virtual node, make sure that AppPorts contains the port that the application is listening on.
upstream connect error or disconnect/reset before headers. reset reason: connection failure, transport failure reason: TLS error: 268435703:SSL routines:OPENSSL_internal:WRONG_VERSION_NUMBER
If your issue is still not resolved, then consider opening a GitHub issue.
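The two TLS errors described in this section point to different causes, so it can help to triage Envoy debug log lines mechanically. The following is a minimal sketch only: the function name classify_tls_error is hypothetical, and the mapping is limited to the two error strings quoted in this topic.

```shell
# Hypothetical helper (not part of App Mesh or Envoy): map one Envoy debug
# log line to the likely cause described in this topic.
classify_tls_error() {
  case "$1" in
    *CERTIFICATE_VERIFY_FAILED*)
      # Trust bundle, certificate expiry, or SAN mismatch.
      echo "certificate verification failed: check the trust bundle, expiry, and SAN"
      ;;
    *WRONG_VERSION_NUMBER*)
      # The request reached the application directly instead of Envoy.
      echo "traffic is bypassing Envoy: check AppPorts in the proxy configuration"
      ;;
    *)
      echo "unrecognized: rule out general backend connectivity errors first"
      ;;
  esac
}
```

For example, assuming the Envoy logs were saved to a file named envoy.log (an assumed path), each matching line could be passed to the function with: grep "TLS error" envoy.log | while IFS= read -r line; do classify_tls_error "$line"; done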
Unable to connect to a backend virtual service when application is originating TLS
Symptoms
When originating a TLS session from an application, instead of from the Envoy proxy, connectivity to a backend virtual service fails.
Resolution
This is a known issue. For more information, see the Feature Request: TLS negotiation between the downstream application and upstream proxy.
If your issue is still not resolved, then consider opening a GitHub issue.
Unable to assert that connectivity between Envoy proxies is using TLS
Symptoms
Your application has enabled TLS termination on the virtual node or virtual gateway listener, or TLS origination on the backend TLS client policy, but you are unable to assert that connectivity between Envoy proxies is occurring over a TLS-negotiated session.
Resolution
Steps defined in this resolution make use of the Envoy administration interface and Envoy statistics. For help configuring these, see Enable the Envoy proxy administration interface and Enable Envoy DogStatsD integration for metric offload. The following statistics examples use the administration interface for simplicity.
-
For the Envoy proxy performing TLS termination:
-
Make sure that the TLS certificate has been bootstrapped in the Envoy configuration with the following command.
curl http://my-app.default.svc.cluster.local:9901/certs
In the returned output, you should see at least one entry under certificates[].cert_chain for the certificate used in TLS termination.
-
Make sure that the number of successful inbound connections to the proxy’s listener is exactly the same as the number of SSL handshakes plus the number of SSL sessions re-used, as shown by the following example commands and output.
curl -s http://my-app.default.svc.cluster.local:9901/stats | grep "listener.0.0.0.0_15000" | grep downstream_cx_total
listener.0.0.0.0_15000.downstream_cx_total: 11
curl -s http://my-app.default.svc.cluster.local:9901/stats | grep "listener.0.0.0.0_15000" | grep ssl.connection_error
listener.0.0.0.0_15000.ssl.connection_error: 1
curl -s http://my-app.default.svc.cluster.local:9901/stats | grep "listener.0.0.0.0_15000" | grep ssl.handshake
listener.0.0.0.0_15000.ssl.handshake: 9
curl -s http://my-app.default.svc.cluster.local:9901/stats | grep "listener.0.0.0.0_15000" | grep ssl.session_reused
listener.0.0.0.0_15000.ssl.session_reused: 1
# Total CX (11) - SSL Connection Errors (1) == SSL Handshakes (9) + SSL Sessions Re-used (1)
-
For the Envoy proxy performing TLS origination:
-
Make sure that the TLS trust store has been bootstrapped in the Envoy configuration with the following command.
curl http://my-app.default.svc.cluster.local:9901/certs
You should see at least one entry under certificates[].ca_certs for the certificates used in validating the backend’s certificate during TLS origination.
-
Make sure that the number of successful outbound connections to the backend cluster is exactly the same as the number of SSL handshakes plus the number of SSL sessions re-used, as shown by the following example commands and output.
curl -s http://my-app.default.svc.cluster.local:9901/stats | grep "virtual-node-name" | grep upstream_cx_total
cluster.cds_egress_mesh-name_virtual-node-name_protocol_port.upstream_cx_total: 11
curl -s http://my-app.default.svc.cluster.local:9901/stats | grep "virtual-node-name" | grep ssl.connection_error
cluster.cds_egress_mesh-name_virtual-node-name_protocol_port.ssl.connection_error: 1
curl -s http://my-app.default.svc.cluster.local:9901/stats | grep "virtual-node-name" | grep ssl.handshake
cluster.cds_egress_mesh-name_virtual-node-name_protocol_port.ssl.handshake: 9
curl -s http://my-app.default.svc.cluster.local:9901/stats | grep "virtual-node-name" | grep ssl.session_reused
cluster.cds_egress_mesh-name_virtual-node-name_protocol_port.ssl.session_reused: 1
# Total CX (11) - SSL Connection Errors (1) == SSL Handshakes (9) + SSL Sessions Re-used (1)
If your issue is still not resolved, then consider opening a GitHub issue.
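The connection-count comparison in this section (total connections minus SSL connection errors equals SSL handshakes plus SSL sessions re-used) can be evaluated from a single snapshot of the stats output rather than four separate requests. The following is a minimal sketch, assuming the stats are in the "name: value" text form shown in the examples; the function name check_tls_stats and its arguments are hypothetical.

```shell
# Hypothetical helper: read Envoy /stats text on stdin and test whether
#   total - ssl.connection_error == ssl.handshake + ssl.session_reused
# $1 = stat prefix without the trailing dot, e.g. listener.0.0.0.0_15000
# $2 = total counter name: downstream_cx_total (termination) or
#      upstream_cx_total (origination)
check_tls_stats() {
  awk -v p="$1" -v t="$2" '
    BEGIN { total = errors = handshakes = reused = 0 }
    # Each counter is printed as "name: value", so $1 ends with a colon.
    $1 == (p "." t ":")                  { total = $2 }
    $1 == (p ".ssl.connection_error:")   { errors = $2 }
    $1 == (p ".ssl.handshake:")          { handshakes = $2 }
    $1 == (p ".ssl.session_reused:")     { reused = $2 }
    END {
      ok = (total - errors == handshakes + reused)
      printf "total=%d errors=%d handshakes=%d reused=%d %s\n", total, errors, handshakes, reused, (ok ? "OK" : "MISMATCH")
      exit (ok ? 0 : 1)
    }'
}
```

With the administration interface from the examples above, this could be used as: curl -s http://my-app.default.svc.cluster.local:9901/stats | check_tls_stats "listener.0.0.0.0_15000" downstream_cx_total. Reading one snapshot avoids counters drifting between requests while traffic is flowing. Counters that are absent are treated as zero, so a result of OK with zero handshakes and zero re-used sessions does not show that TLS is in use.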
Troubleshooting TLS with Elastic Load Balancing
Symptoms
When attempting to configure an Application Load Balancer or Network Load Balancer to encrypt traffic to a virtual node, connectivity and load balancer health checks can fail.
Resolution
To determine the root cause of the issue, check the following:
-
For the Envoy proxy performing TLS termination, rule out any misconfiguration by following the steps in Unable to connect to a backend virtual service with a TLS client policy, earlier in this topic.
-
For the load balancer, look at the configuration of the TargetGroup:
-
Make sure that the TargetGroup port matches the virtual node’s defined listener port.
-
For Application Load Balancers that are originating TLS connections over HTTP to your service, make sure that the TargetGroup protocol is set to HTTPS. If health checks are being utilized, make sure that HealthCheckProtocol is set to HTTPS.
-
For Network Load Balancers that are originating TLS connections over TCP to your service, make sure that the TargetGroup protocol is set to TLS. If health checks are being utilized, make sure that HealthCheckProtocol is set to TCP.
Note
Any updates to TargetGroup require changing the TargetGroup name.
With this configured properly, your load balancer should provide a secure connection to your service using the certificate provided to the Envoy proxy.
If your issue is still not resolved, then consider opening a GitHub issue.
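The TargetGroup checks in this section can be expressed as a small comparison of observed values against the values that the section expects. The following is a minimal sketch only: the function name check_target_group, its argument order, and the alb/nlb labels are hypothetical, and the expected values are exactly those stated above.

```shell
# Hypothetical helper: compare TargetGroup settings with the values this
# topic expects. Arguments:
#   $1 load balancer type (alb or nlb)   $2 TargetGroup protocol
#   $3 HealthCheckProtocol               $4 TargetGroup port
#   $5 virtual node listener port
check_target_group() {
  case "$1" in
    alb) tg_want_protocol=HTTPS; tg_want_hc=HTTPS ;;
    nlb) tg_want_protocol=TLS;   tg_want_hc=TCP ;;
    *)   echo "unknown load balancer type: $1"; return 2 ;;
  esac
  tg_status=0
  if [ "$2" != "$tg_want_protocol" ]; then
    echo "protocol is $2, expected $tg_want_protocol"; tg_status=1
  fi
  if [ "$3" != "$tg_want_hc" ]; then
    echo "health check protocol is $3, expected $tg_want_hc"; tg_status=1
  fi
  if [ "$4" != "$5" ]; then
    echo "port $4 does not match listener port $5"; tg_status=1
  fi
  if [ "$tg_status" -eq 0 ]; then echo "ok"; fi
  return "$tg_status"
}
```

The observed values could be read with the AWS CLI, for example: aws elbv2 describe-target-groups --names my-target-group --query 'TargetGroups[0].[Protocol,HealthCheckProtocol,Port]' --output text, where my-target-group is a placeholder name. A call such as check_target_group nlb TLS TCP 443 443 then prints ok when all three settings match.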