

# Automated job retries
<a name="job_retries"></a>

You can apply a retry strategy to your jobs and job definitions that allows failed jobs to be automatically retried. Possible failure scenarios include the following:
+ Any non-zero exit code from a container job
+ Amazon EC2 instance failure or termination
+ Internal Amazon service error or outage

When a job is submitted to a job queue and placed into the `RUNNING` state, that's considered an attempt. By default, each job is given one attempt to move to either the `SUCCEEDED` or `FAILED` job state. However, both the job definition and the job submission workflows can be used to specify a retry strategy with between 1 and 10 attempts. If [evaluateOnExit](job_definition_parameters.md#retryStrategy-evaluateOnExit) is specified, it can contain up to 5 retry strategies. If [evaluateOnExit](https://docs.amazonaws.cn/batch/latest/APIReference/API_EvaluateOnExit.html) is specified but none of the retry strategies match, the job is retried. To make jobs that don't match any entry exit instead of being retried, add a final entry that exits for any reason. For example, this `evaluateOnExit` object has two entries with an action of `RETRY` and a final entry with an action of `EXIT`.

```
"evaluateOnExit": [
    {
        "action": "RETRY",
        "onReason": "AGENT"
    },
    {
        "action": "RETRY",
        "onStatusReason": "Task failed to start"
    },
    {
        "action": "EXIT",
        "onReason": "*"
    }
]
```
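
To make the matching behavior described above concrete, the following Python sketch models it locally. It is a simplified illustration, not the service's implementation. It assumes that entries are checked in order, that all conditions in an entry must match, and that a pattern ending in `*` matches any value starting with the text before the asterisk. The names `pattern_matches`, `decide_action`, and `example_rules` are hypothetical.

```python
# Simplified local model of evaluateOnExit matching (assumptions noted above).
CONDITION_KEYS = ("onReason", "onStatusReason", "onExitCode")


def pattern_matches(pattern, value):
    """Treat a trailing '*' as a prefix match; otherwise require equality."""
    if pattern.endswith("*"):
        return value.startswith(pattern[:-1])
    return value == pattern


def decide_action(evaluate_on_exit, failure):
    """Return 'RETRY' or 'EXIT' for a failed attempt.

    `failure` maps condition keys (for example, 'onReason') to the
    observed string for this attempt. Missing keys count as ''.
    """
    for entry in evaluate_on_exit:
        conditions = {k: entry[k] for k in CONDITION_KEYS if k in entry}
        # Assumed: first entry whose conditions all match decides the action.
        if conditions and all(
            pattern_matches(p, failure.get(k, "")) for k, p in conditions.items()
        ):
            return entry["action"].upper()
    # As the text states, the job is retried when no entry matches.
    return "RETRY"


example_rules = [
    {"action": "RETRY", "onReason": "AGENT"},
    {"action": "RETRY", "onStatusReason": "Task failed to start"},
    {"action": "EXIT", "onReason": "*"},
]
```

In this model, a failure whose reason is `AGENT` is retried, while any other reason falls through to the final `*` entry and exits. Without that final entry, an unmatched failure would be retried.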

At runtime, the `AWS_BATCH_JOB_ATTEMPT` environment variable is set to the container's corresponding job attempt number. The first attempt is numbered `1`, and subsequent attempts are in ascending order (for example, 2, 3, 4).
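
As a minimal sketch, a Python job script could read the attempt number as follows. The helper name `current_attempt` and the fallback of `1` when the variable is absent (for example, when the script runs outside of a Batch job) are assumptions made for illustration.

```python
import os


def current_attempt(environ=os.environ):
    """Return the job attempt number as an integer.

    Falls back to 1 when AWS_BATCH_JOB_ATTEMPT isn't set (assumed default
    for local runs).
    """
    return int(environ.get("AWS_BATCH_JOB_ATTEMPT", "1"))


if __name__ == "__main__":
    attempt = current_attempt()
    if attempt > 1:
        print(f"Retry detected: this is attempt {attempt}")
```

A script might use this value to resume from a checkpoint or to log that it's running as a retry.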

For example, suppose that a job attempt fails for any reason and the number of attempts specified in the retry configuration is greater than the `AWS_BATCH_JOB_ATTEMPT` number. Then, the job is placed back in the `RUNNABLE` state. For more information, see [Job states](job_states.md).
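
The rule above can be sketched as a small Python function. This is an illustrative model only: the function name `next_state_after_failure` is hypothetical, and it ignores any `evaluateOnExit` entries and the cancelled or terminated cases.

```python
def next_state_after_failure(attempt_number, configured_attempts):
    """Model the state a job moves to after a failed attempt.

    attempt_number corresponds to AWS_BATCH_JOB_ATTEMPT for the failed
    attempt; configured_attempts is the retry strategy's attempt count.
    """
    if not 1 <= configured_attempts <= 10:
        raise ValueError("attempts must be between 1 and 10")
    if attempt_number < 1:
        raise ValueError("attempt numbers start at 1")
    # Attempts remain: the job goes back to RUNNABLE to be tried again.
    if configured_attempts > attempt_number:
        return "RUNNABLE"
    # No attempts remain: the job ends in FAILED.
    return "FAILED"
```

For example, with three configured attempts, a failure on attempt 2 returns the job to `RUNNABLE`, and a failure on attempt 3 leaves it in `FAILED`.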

**Note**  
Jobs that are cancelled or terminated aren't retried. Also, jobs that fail because of an invalid job definition aren't retried.

For more information, see [Retry strategy](job_definition_parameters.md#retryStrategy), [Create a single-node job definition](create-job-definition.md), [Tutorial: submit a job](submit_job.md), and [Stopped tasks error codes](https://docs.amazonaws.cn/AmazonECS/latest/userguide/stopped-task-error-codes.html).