Troubleshoot Amazon ECS deployment issues - Amazon CodeDeploy

Troubleshoot Amazon ECS deployment issues

A timeout occurs while waiting for replacement task set

Problem: You see the following error message while deploying your Amazon ECS application using CodeDeploy:

The deployment timed out while waiting for the replacement task set to become healthy. This time out period is 60 minutes.

Possible cause: This error might occur if there is a mistake in your task definition file or other deployment-related files. For example, if there is a typo in the image field in your task definition file, Amazon ECS tries to pull the wrong container image and fails repeatedly, causing this error.

Possible fixes and next steps:

  • Fix typographical errors and configuration problems in your task definition file and other files.

  • Check the related Amazon ECS service event and find out why replacement tasks are not becoming healthy. For more information on Amazon ECS events, see Amazon ECS events in the Amazon Elastic Container Service Developer Guide.

  • Check the Amazon ECS troubleshooting section in the Amazon Elastic Container Service Developer Guide for errors related to the messages in the event.
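As a minimal sketch of the event check described above, the following Python function returns the most recent event messages for a service. It assumes the boto3 SDK and an Amazon ECS client created with boto3.client("ecs"); the function name and the cluster and service names in the usage comment are hypothetical.

```python
def recent_service_events(ecs_client, cluster, service, limit=10):
    """Return the newest service event messages, most recent first."""
    response = ecs_client.describe_services(cluster=cluster, services=[service])
    services = response.get("services", [])
    if not services:
        raise ValueError(f"Service {service!r} not found in cluster {cluster!r}")
    events = services[0].get("events", [])
    # Sort explicitly by timestamp instead of relying on the API ordering.
    events = sorted(events, key=lambda e: e["createdAt"], reverse=True)
    return [f'{e["createdAt"]} {e["message"]}' for e in events[:limit]]


# Usage (assumes boto3 is installed and credentials are configured):
#   import boto3
#   for line in recent_service_events(boto3.client("ecs"), "my-cluster", "my-service"):
#       print(line)
```

Messages that mention failed image pulls or failed health checks usually point to the part of the task definition that needs fixing.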

A timeout occurs while waiting for a notification to continue

Problem: You see the following error message while deploying your Amazon ECS application using CodeDeploy:

The deployment timed out while waiting for a notification to continue. This time out period is n minutes.

Possible cause: This error might occur if you specified a wait time in the Specify when to reroute traffic field when you created your deployment group, but the deployment couldn't finish before the wait time expired.

Possible fixes and next steps:

The IAM role does not have enough permissions

Problem: You see the following error message while deploying your Amazon ECS application using CodeDeploy:

The IAM role role-arn does not give you permission to perform operations in the following Amazon service: AWSLambda.

Possible cause: This error might occur if you specified a Lambda function in the AppSpec file's Hooks section, but did not give CodeDeploy permission to access the Lambda service.

Possible fix: Add the lambda:InvokeFunction permission to the CodeDeploy service role. To add this permission, add one of the following Amazon-managed policies to the role: AWSCodeDeployRoleForECS or AWSCodeDeployRoleForECSLimited. For information about these policies and how to add them to the CodeDeploy service role, see Step 2: Create a service role for CodeDeploy.

The deployment timed out while waiting for a status callback

Problem: You see the following error message while deploying your Amazon ECS application using CodeDeploy:

The deployment timed out while waiting for a status callback. CodeDeploy expects a status callback within one hour after a deployment hook is invoked.

Possible cause: This error might occur if you specified a Lambda function in the AppSpec file's Hooks section, but the Lambda function could not call the necessary PutLifecycleEventHookExecutionStatus API to return a Succeeded or Failed status to CodeDeploy.

Possible fixes and next steps:

  • Add the codedeploy:PutLifecycleEventHookExecutionStatus permission to the Lambda execution role used by the Lambda function that you specified in the AppSpec file. This permission grants the Lambda function the ability to return a status of Succeeded or Failed to CodeDeploy. For more information about the Lambda execution role, see Lambda execution role in the Amazon Lambda User Guide.

  • Check your Lambda function code and execution logs to make sure your Lambda function is calling CodeDeploy's PutLifecycleEventHookExecutionStatus API to inform CodeDeploy about whether the lifecycle validation test Succeeded or Failed. For information about the PutLifecycleEventHookExecutionStatus API, see PutLifecycleEventHookExecutionStatus in the Amazon CodeDeploy API Reference. For information about Lambda execution logs, see Accessing Amazon CloudWatch logs for Amazon Lambda.
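The following Python sketch shows one way a hook function might report its result so that CodeDeploy does not wait until the callback times out. It assumes the boto3 SDK available in the Lambda Python runtime; run_validation is a hypothetical placeholder for your own test, and the function names are illustrative only.

```python
def report_hook_status(codedeploy_client, event, passed):
    """Tell CodeDeploy whether the lifecycle validation test passed."""
    status = "Succeeded" if passed else "Failed"
    codedeploy_client.put_lifecycle_event_hook_execution_status(
        # CodeDeploy passes both identifiers in the hook's invocation event.
        deploymentId=event["DeploymentId"],
        lifecycleEventHookExecutionId=event["LifecycleEventHookExecutionId"],
        status=status,
    )
    return status


def run_validation(event):
    """Hypothetical placeholder: replace with your own validation test."""
    return True


def handler(event, context):
    import boto3  # imported here so the helpers above can be tested without it

    client = boto3.client("codedeploy")
    try:
        passed = run_validation(event)
    except Exception:
        # Report Failed explicitly instead of letting the hook time out.
        passed = False
    return report_hook_status(client, event, passed)
```

Wrapping the validation in try/except means that an unexpected exception still produces a Failed callback, which ends the deployment promptly instead of after the one-hour wait.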

The deployment failed because one or more of the lifecycle event validation functions failed

Problem: You see the following error message while deploying your Amazon ECS application using CodeDeploy:

The deployment failed because one or more of the lifecycle event validation functions failed.

Possible cause: This error might occur if you specified a Lambda function in the AppSpec file's Hooks section, but the Lambda function returned Failed to CodeDeploy when it called PutLifecycleEventHookExecutionStatus. This failure indicates to CodeDeploy that the lifecycle validation test failed.

Possible next step: Check your Lambda execution logs to see why the validation test code is failing. For information about Lambda execution logs, see Accessing Amazon CloudWatch logs for Amazon Lambda.

The ELB could not be updated due to the following error: Primary taskset target group must be behind listener

Problem: You see the following error message while deploying your Amazon ECS application using CodeDeploy:

The ELB could not be updated due to the following error: Primary taskset target group must be behind listener

Possible cause: This error might occur if you have configured an optional test listener, and it is configured with the wrong target group. For more information about the test listener in CodeDeploy, see Before you begin an Amazon ECS deployment and What happens during an Amazon ECS deployment. For more information about task sets, see TaskSet in the Amazon Elastic Container Service API Reference and describe-task-set in the Amazon ECS section of the Amazon CLI Command Reference.

Possible fix: Make sure that the Elastic Load Balancing production listener and test listener both point to the target group that currently serves your workloads. There are three places to check:

My deployment sometimes fails when using Auto Scaling

Problem: You are using Auto Scaling with CodeDeploy and you notice that your deployments occasionally fail. For more information about the symptoms of this problem, see the topic that reads "For services configured to use service auto scaling and the blue/green deployment type, auto scaling is not blocked during a deployment but the deployment may fail under some circumstances" in the Amazon Elastic Container Service Developer Guide.

Possible cause: This problem might occur if CodeDeploy and Auto Scaling processes conflict.

Possible fix: Suspend and resume Auto Scaling processes during the CodeDeploy deployment using the RegisterScalableTarget API (or the corresponding register-scalable-target Amazon CLI command). For more information, see Suspend and resume scaling for Application Auto Scaling in the Application Auto Scaling User Guide.

Note

CodeDeploy can't call RegisterScalableTarget directly. To use this API, you must configure CodeDeploy to send a notification or event to Amazon Simple Notification Service (or Amazon CloudWatch). You must then configure Amazon SNS (or CloudWatch) to call a Lambda function, and configure the Lambda function to call the RegisterScalableTarget API. The RegisterScalableTarget API must be called with the attributes of the SuspendedState parameter set to true to suspend Auto Scaling operations, and false to resume them.

The notification or event that CodeDeploy sends out must occur when a deployment starts (to trigger Auto Scaling suspend operations), or when a deployment succeeds, fails, or stops (to trigger Auto Scaling resume operations).

For information about how to configure CodeDeploy to generate Amazon SNS notifications or CloudWatch events, see Monitoring deployments with Amazon CloudWatch Events and Monitoring Deployments with Amazon SNS Event Notifications.
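As a rough sketch of the Lambda side of this setup, the following Python code builds and sends a RegisterScalableTarget request for an Amazon ECS service. It assumes the boto3 application-autoscaling client and a service that scales on its desired count; the function names are hypothetical, and deciding whether an incoming notification means suspend or resume is left to the caller.

```python
def suspended_state_request(cluster, service, suspend):
    """Build RegisterScalableTarget arguments that suspend or resume scaling."""
    return {
        "ServiceNamespace": "ecs",
        "ResourceId": f"service/{cluster}/{service}",
        "ScalableDimension": "ecs:service:DesiredCount",
        # SuspendedState is a structure; each attribute is set separately.
        "SuspendedState": {
            "DynamicScalingInSuspended": suspend,
            "DynamicScalingOutSuspended": suspend,
            "ScheduledScalingSuspended": suspend,
        },
    }


def set_scaling_suspended(autoscaling_client, cluster, service, suspend):
    """Suspend (True) or resume (False) scaling for one ECS service."""
    request = suspended_state_request(cluster, service, suspend)
    autoscaling_client.register_scalable_target(**request)
    return request


# Usage inside a Lambda function (names are placeholders):
#   import boto3
#   client = boto3.client("application-autoscaling")
#   set_scaling_suspended(client, "my-cluster", "my-service", suspend=True)
```

Call the function with suspend=True when the deployment starts, and with suspend=False when it succeeds, fails, or stops, so that scaling is never left suspended.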

Only ALB supports gradual traffic routing, use AllAtOnce Traffic routing instead when you create/update Deployment group

Problem: You see the following error message while creating or updating a deployment group in CodeDeploy:

Only ALB supports gradual traffic routing, use AllAtOnce Traffic routing instead when you create/update Deployment group.

Possible cause: This error might occur if you're using a Network Load Balancer and tried to use a predefined deployment configuration other than CodeDeployDefault.ECSAllAtOnce.

Possible fixes:

Even though my deployment succeeded, the replacement task set fails the Elastic Load Balancing health checks, and my application is down

Problem: Even though CodeDeploy indicates that my deployment succeeded, the replacement task set fails the health checks from Elastic Load Balancing, and my application is down.

Possible cause: This issue might occur if you performed a CodeDeploy all-at-once deployment, and your replacement (green) task set contains bad code that is causing the Elastic Load Balancing health checks to fail. With the all-at-once deployment configuration, the load balancer’s health checks start running on the replacement task set after traffic has been shifted to it (that is, after CodeDeploy’s AllowTraffic lifecycle event occurs). That’s why you will see health checks failing on the replacement task set after traffic has shifted, but not before. For information about the lifecycle events that CodeDeploy generates, see What happens during an Amazon ECS deployment.

Possible fixes:

  • Change your deployment configuration from all-at-once to canary or linear. In a canary or linear configuration, the load balancer’s health checks start running on the replacement task set while CodeDeploy installs your application in the replacement environment, and before traffic is shifted (that is, during the Install lifecycle event, and before the AllowTraffic event). By allowing the checks to run during the application installation but before traffic is shifted, bad application code will be detected and cause deployment failures before the application becomes publicly available.

    For information about how to configure canary or linear deployments, see Change deployment group settings with CodeDeploy.

    For information about CodeDeploy lifecycle events that run during an Amazon ECS deployment, see What happens during an Amazon ECS deployment.

    Note

    Canary and linear deployment configurations are only supported with Application Load Balancers.

  • If you want to keep your all-at-once deployment configuration, set up a test listener and check the health status of the replacement task set with the BeforeAllowTraffic lifecycle hook. For more information, see List of lifecycle event hooks for an Amazon ECS deployment.
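For the second option, a BeforeAllowTraffic hook could query the health of the replacement target group before traffic shifts. The following Python sketch assumes the boto3 elbv2 client and that you know the replacement target group's ARN; the function name is hypothetical, and the hook would still need to report the result to CodeDeploy with PutLifecycleEventHookExecutionStatus.

```python
def replacement_is_healthy(elbv2_client, target_group_arn):
    """Return (ok, problems) for the targets in the given target group."""
    response = elbv2_client.describe_target_health(TargetGroupArn=target_group_arn)
    descriptions = response["TargetHealthDescriptions"]
    problems = [
        (d["Target"]["Id"], d["TargetHealth"]["State"])
        for d in descriptions
        if d["TargetHealth"]["State"] != "healthy"
    ]
    # An empty target group is treated as unhealthy: nothing can serve traffic.
    ok = bool(descriptions) and not problems
    return ok, problems


# Usage (the ARN is a placeholder):
#   import boto3
#   ok, problems = replacement_is_healthy(boto3.client("elbv2"), target_group_arn)
```

Targets can stay in the initial state for a short time after registration, so a real hook would likely poll this check a few times before reporting Failed.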

Can I attach multiple load balancers to a deployment group?

No. If you want to use multiple Application Load Balancers or Network Load Balancers, use Amazon ECS rolling updates instead of CodeDeploy blue/green deployments. For more information about rolling updates, see Rolling update in the Amazon Elastic Container Service Developer Guide. For more information about using multiple load balancers with Amazon ECS, see Registering multiple target groups with a service in the Amazon Elastic Container Service Developer Guide.

Can I perform CodeDeploy blue/green deployments without a load balancer?

No, you cannot perform CodeDeploy blue/green deployments without a load balancer. If you are unable to use a load balancer, use Amazon ECS's rolling updates feature instead. For more information about Amazon ECS rolling updates, see Rolling update in the Amazon Elastic Container Service Developer Guide.

How can I update my Amazon ECS service with new information during a deployment?

To have CodeDeploy update your Amazon ECS service with a new parameter while it conducts a deployment, specify the parameter in the resources section of the AppSpec file. Only a few Amazon ECS parameters are supported by CodeDeploy, such as the task definition file and container name parameters. For a full list of Amazon ECS parameters that CodeDeploy can update, see AppSpec 'resources' section for Amazon ECS deployments.
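To illustrate, the following Python sketch generates a JSON-formatted AppSpec file whose Resources section sets the task definition and container name. The task definition ARN, container name, and port shown in the usage comment are placeholders, and the helper name is hypothetical; check the AppSpec reference for the full set of supported properties.

```python
import json


def build_ecs_appspec(task_definition_arn, container_name, container_port):
    """Return a JSON AppSpec string for an Amazon ECS deployment."""
    appspec = {
        "version": 0.0,
        "Resources": [
            {
                "TargetService": {
                    "Type": "AWS::ECS::Service",
                    "Properties": {
                        # Parameters CodeDeploy applies during the deployment.
                        "TaskDefinition": task_definition_arn,
                        "LoadBalancerInfo": {
                            "ContainerName": container_name,
                            "ContainerPort": container_port,
                        },
                    },
                }
            }
        ],
    }
    return json.dumps(appspec, indent=2)


# Usage (all values are placeholders):
#   print(build_ecs_appspec("<task-definition-arn>", "my-container", 80))
```

Generating the file from code keeps the task definition ARN in one place, which helps when each deployment registers a new task definition revision.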

Note

If you need to update your Amazon ECS service with a parameter that is not supported by CodeDeploy, complete these tasks:

  1. Call Amazon ECS's UpdateService API with the parameter you want to update. For a full list of parameters that can be updated, see UpdateService in the Amazon Elastic Container Service API Reference.

  2. To apply the change to the tasks, create a new Amazon ECS blue/green deployment. For more information, see Create an Amazon ECS Compute Platform deployment (console).