Tutorial: Getting started with Amazon EC2 orchestration - Amazon Batch

Amazon Elastic Compute Cloud (Amazon EC2) provides scalable computing capacity in the Amazon Web Services Cloud. Using Amazon EC2 eliminates your need to invest in hardware up front, so you can develop and deploy applications faster.

You can use Amazon EC2 to launch as many or as few virtual servers as you need, configure security and networking, and manage storage. Amazon EC2 enables you to scale up or down to handle changes in requirements or spikes in popularity, reducing your need to forecast traffic.

Create a compute environment

To create a compute environment for an Amazon EC2 orchestration, do the following:

  1. Open the Amazon Batch console first-run wizard.

  2. For Select orchestration type, choose Amazon Elastic Compute Cloud (Amazon EC2).

  3. Choose Next.

  4. In the Compute environment configuration section for Name, specify a unique name for your compute environment. The name can be up to 128 characters in length. It can contain uppercase and lowercase letters, numbers, hyphens (-), and underscores (_).

  5. For Instance role, choose an existing instance profile that has the required IAM permissions attached. This instance profile allows the Amazon ECS container instances in your compute environment to make calls to the required Amazon API operations. For more information, see Amazon ECS instance role.

  6. (Optional) A tag is a label that's assigned to a resource. To add a tag or an Amazon EC2 tag, expand Tags, then choose Add tag. Enter a key-value pair, and then choose Add tag again.

    Important

    If you choose Add tag, you must enter a key-value pair and choose Add tag again or choose Remove tag.

  7. (Optional) In the Instance configuration section for Use Amazon EC2 Spot instances, turn on Enable using Spot instances.

  8. (Spot only) For Maximum % on-demand price, enter the maximum percentage of the On-Demand price that you want to pay for Spot resources.

  9. (Optional) (Spot only) For Spot fleet role, choose an existing Amazon EC2 Spot Fleet IAM role to apply to your Spot compute environment. If you don't already have an existing Amazon EC2 Spot Fleet IAM role, you must create one first. For more information, see Amazon EC2 spot fleet role.

    Important

    To tag your Spot Instances on creation, your Amazon EC2 Spot Fleet IAM role must use the newer AmazonEC2SpotFleetTaggingRole managed policy. The AmazonEC2SpotFleetRole managed policy doesn't have the required permissions to tag Spot Instances. For more information, see Spot Instances not tagged on creation and Tag your resources.

  10. For Minimum vCPUs, choose the minimum number of EC2 vCPUs that your compute environment maintains, regardless of job queue demand.

  11. For Desired vCPUs, choose the number of EC2 vCPUs that your compute environment launches with. As job queue demand increases, Amazon Batch increases the desired number of vCPUs and adds EC2 instances. The number of vCPUs can increase up to the maximum number of vCPUs. As demand decreases, Amazon Batch decreases the desired number of vCPUs and removes instances. The number of vCPUs can decrease down to the minimum number of vCPUs.

  12. For Maximum vCPUs, choose the maximum number of EC2 vCPUs that your compute environment can scale out to, regardless of job queue demand.

  13. For Allowed instance types, choose the Amazon EC2 instance types that can be launched. You can specify instance families to launch any instance type within those families (for example, c5, c5n, or p3). Or, you can specify specific sizes within a family (such as c5.8xlarge). Metal instance types aren't in the instance families. For example, c5 doesn't include c5.metal. You can also choose optimal to select instance types (from the C4, M4, and R4 instance families) that match the demand of your job queues.

    Note

    When you create a compute environment, the instance types that you select for the compute environment must share the same architecture. For example, you can't mix x86 and ARM instances in the same compute environment.

    Note

    Amazon Batch scales GPUs based on the required amount in your job queues. To use GPU scheduling, the compute environment must include instance types from the p2, p3, p4, p5, g3, g3s, g4, or g5 family.

    Note

    Currently, optimal uses instance types from the C4, M4, and R4 instance families. In Amazon Web Services Regions that don't have instance types from those instance families, instance types from the C5, M5, and R5 instance families are used.

  14. Expand Additional configuration.

  15. (Optional) For Placement group, enter a placement group name to group resources in the compute environment.

  16. (Optional) For EC2 key pair, choose a public and private key pair as security credentials when you connect to the instance. For more information about Amazon EC2 key pairs, see Amazon EC2 key pairs and Linux instances.

  17. For Allocation strategy, choose the allocation strategy to use when selecting instance types from the list of allowed instance types. BEST_FIT_PROGRESSIVE is usually the better choice for EC2 On-Demand compute environments, and SPOT_CAPACITY_OPTIMIZED for EC2 Spot compute environments. For more information, see Instance type allocation strategies for Amazon Batch.

  18. (Optional) For EC2 configuration, choose Add EC2 configuration. Choose Image type and Image ID override values to provide information for Amazon Batch to select Amazon Machine Images (AMIs) for instances in the compute environment. If the Image ID override isn't specified for an Image type, Amazon Batch selects a recent Amazon ECS optimized AMI. If no Image type is specified, the default is Amazon Linux 2 for non-GPU, non Amazon Graviton instances.

    Important

    To use a custom AMI, choose the image type and then enter the custom AMI ID in the Image ID override box.

    Amazon Linux 2

    Default for all Amazon Graviton-based instance families (for example, C6g, M6g, R6g, and T4g) and can be used for all non-GPU instance types.

    Amazon Linux 2 (GPU)

    Default for all GPU instance families (for example, P4 and G4) and can be used for all non Amazon Graviton-based instance types.

    Amazon Linux

    Can be used for non-GPU, non Amazon Graviton instance families. The standard support for Amazon Linux AMI has ended. For more information, see Amazon Linux AMI.

    Note

    The AMI that you choose for a compute environment must match the architecture of the instance types that you want to use for that compute environment. For example, if your compute environment uses A1 instance types, the compute resource AMI that you choose must support Arm instances. Amazon ECS vends both x86 and Arm versions of the Amazon ECS optimized Amazon Linux 2 AMI. For more information, see Amazon ECS optimized Amazon Linux 2 AMI in the Amazon Elastic Container Service Developer Guide.

  19. (Optional) For Launch template, select an existing Amazon EC2 launch template to configure your compute resources. The default version of the template is automatically populated. For more information, see Use Amazon EC2 launch templates with Amazon Batch.

    Note

    In a launch template, you can specify a custom AMI that you created.

  20. (Optional) For Launch template version, enter $Default, $Latest, or a specific version number to use.

    Important

    After the compute environment is created, the launch template version used isn't changed even if the $Default or $Latest version for the launch template is updated. To use a new launch template version, first create a new compute environment and add it to the existing job queue. Then, remove the old compute environment from the job queue, and delete the old compute environment.

  21. In the Network configuration section:

    1. For Virtual Private Cloud (VPC) ID, choose an Amazon VPC.

    2. For Subnets, the subnets for your Amazon Web Services account are listed. If you want to create a custom set of subnets, choose Clear subnets, and then choose the subnets that you want.

      Important

      Compute resources must be able to communicate with the Amazon ECS service endpoint, either through an interface VPC endpoint or through a public IP address. For more information, see Amazon ECS interface VPC endpoints (Amazon PrivateLink). If your instance has neither a VPC endpoint configured nor a public IP address, you can use network address translation (NAT). For more information about NAT, see NAT gateways and Create a virtual private cloud.

    3. For Security groups, choose the Amazon EC2 security groups that you want to associate with the instance. If you want to create a custom set of security groups, choose Clear security groups. Then, choose the security groups that you want.

  22. Choose Next.
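
The console steps above can also be written down as a request document. The following is a minimal sketch in Python, not a definitive implementation: it assumes the field names of the CreateComputeEnvironment API operation (for example, minvCpus and bidPercentage), and the function name, role name, and subnet and security group IDs are hypothetical placeholders. The local validation only mirrors the limits stated in this tutorial.

```python
import re

# Step 4: up to 128 characters; letters, numbers, hyphens, and underscores.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,128}")


def build_compute_environment_request(
    name, instance_role, subnets, security_group_ids,
    min_vcpus=0, desired_vcpus=0, max_vcpus=256,
    instance_types=("optimal",),
    spot_bid_percentage=None, spot_fleet_role=None,
):
    """Return a dict shaped like a CreateComputeEnvironment request (hypothetical helper)."""
    if not NAME_PATTERN.fullmatch(name):
        raise ValueError(f"invalid compute environment name: {name!r}")
    # Steps 10-12: the desired vCPU count moves between the minimum and the maximum.
    if not 0 <= min_vcpus <= desired_vcpus <= max_vcpus:
        raise ValueError("expected 0 <= min_vcpus <= desired_vcpus <= max_vcpus")

    use_spot = spot_bid_percentage is not None
    resources = {
        "type": "SPOT" if use_spot else "EC2",
        # Step 17: the strategy the tutorial suggests for each purchasing model.
        "allocationStrategy": (
            "SPOT_CAPACITY_OPTIMIZED" if use_spot else "BEST_FIT_PROGRESSIVE"
        ),
        "minvCpus": min_vcpus,
        "desiredvCpus": desired_vcpus,
        "maxvCpus": max_vcpus,
        "instanceTypes": list(instance_types),
        "instanceRole": instance_role,
        "subnets": list(subnets),
        "securityGroupIds": list(security_group_ids),
    }
    if use_spot:
        # Step 8: maximum percentage of the On-Demand price (range check is an assumption).
        if not 1 <= spot_bid_percentage <= 100:
            raise ValueError("spot_bid_percentage must be between 1 and 100")
        resources["bidPercentage"] = spot_bid_percentage
        if spot_fleet_role:
            resources["spotIamFleetRole"] = spot_fleet_role

    return {
        "computeEnvironmentName": name,
        "type": "MANAGED",
        "state": "ENABLED",
        "computeResources": resources,
    }


request = build_compute_environment_request(
    name="tutorial-ec2-ce",
    instance_role="ecsInstanceRole",      # hypothetical instance profile name
    subnets=["subnet-EXAMPLE"],           # placeholder ID
    security_group_ids=["sg-EXAMPLE"],    # placeholder ID
    max_vcpus=16,
    instance_types=["c5", "c5.8xlarge"],
)
print(request["computeResources"]["allocationStrategy"])  # BEST_FIT_PROGRESSIVE
```

If you use the AWS SDK for Python (Boto3), a dict of this shape can be passed as keyword arguments to the Batch client's create_compute_environment method. The sketch itself makes no API calls.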

Create a job queue

A job queue stores your submitted jobs until the Amazon Batch Scheduler runs the job on a resource in your compute environment. For more information, see Job queues.

To create a job queue for an Amazon EC2 orchestration, do the following:

  1. In the Job queue configuration section for Name, specify a unique name for your job queue. The name can be up to 128 characters in length. It can contain uppercase and lowercase letters, numbers, hyphens (-), and underscores (_).

  2. For Priority, enter an integer between 0 and 100 for the job queue.

    Important

    Higher integer values are assigned a higher priority by the Amazon Batch Scheduler.

  3. Choose Next.
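
The job queue settings can be sketched the same way. The example below assumes the field names of the CreateJobQueue API operation, which expects the attached compute environments to be listed explicitly in computeEnvironmentOrder (the wizard attaches the compute environment from the previous section for you). The helper names and the queue and compute environment names are hypothetical, and the 0 to 100 priority range is the one given in this tutorial.

```python
import re

# Step 1: up to 128 characters; letters, numbers, hyphens, and underscores.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,128}")


def build_job_queue_request(name, priority, compute_environments):
    """Return a dict shaped like a CreateJobQueue request (hypothetical helper)."""
    if not NAME_PATTERN.fullmatch(name):
        raise ValueError(f"invalid job queue name: {name!r}")
    # Step 2: an integer between 0 and 100.
    if not isinstance(priority, int) or not 0 <= priority <= 100:
        raise ValueError("priority must be an integer between 0 and 100")
    if not compute_environments:
        raise ValueError("at least one compute environment is required")
    return {
        "jobQueueName": name,
        "state": "ENABLED",
        "priority": priority,
        # Compute environments are tried in ascending "order".
        "computeEnvironmentOrder": [
            {"order": position, "computeEnvironment": ce}
            for position, ce in enumerate(compute_environments, start=1)
        ],
    }


def by_scheduling_preference(queues):
    """Sort queue requests so that higher priority integers come first."""
    return sorted(queues, key=lambda queue: queue["priority"], reverse=True)


urgent = build_job_queue_request("urgent-queue", 90, ["tutorial-ec2-ce"])
nightly = build_job_queue_request("nightly-queue", 10, ["tutorial-ec2-ce"])
print([q["jobQueueName"] for q in by_scheduling_preference([nightly, urgent])])
# ['urgent-queue', 'nightly-queue']
```

The sort only illustrates the rule that higher integers mean higher priority. The actual scheduling decision is made by the Amazon Batch Scheduler.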

Create a job definition

Amazon Batch job definitions specify how jobs are to be run. Even though each job must reference a job definition, many of the parameters that are specified in the job definition can be overridden at runtime.
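
To make the idea of runtime overrides concrete, the following sketch emulates locally how parameter placeholders in a command could be resolved. It assumes the Ref::name placeholder syntax that Amazon Batch uses for parameter substitution. The function name and sample values are hypothetical, and the real substitution is performed by the service, not by your code. Leaving an unresolved placeholder untouched is a choice made for this sketch, not documented behavior.

```python
PLACEHOLDER_PREFIX = "Ref::"


def resolve_command(command, definition_parameters, submit_parameters=None):
    """Replace Ref::name arguments, letting submit-time values win over defaults."""
    # Values supplied when the job is submitted override job definition defaults.
    merged = {**definition_parameters, **(submit_parameters or {})}
    resolved = []
    for argument in command:
        if argument.startswith(PLACEHOLDER_PREFIX):
            key = argument[len(PLACEHOLDER_PREFIX):]
            # Sketch-only choice: keep the placeholder if no value is known.
            resolved.append(merged.get(key, argument))
        else:
            resolved.append(argument)
    return resolved


command = ["echo", "Ref::greeting", "Ref::target"]
defaults = {"greeting": "hello", "target": "world"}

print(resolve_command(command, defaults))
# ['echo', 'hello', 'world']
print(resolve_command(command, defaults, {"target": "batch"}))
# ['echo', 'hello', 'batch']
```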

To create the job definition:

  1. In the General configuration section:

    1. For Name, specify a unique name for your job definition. The name can be up to 128 characters in length. The name can contain uppercase and lowercase letters, numbers, hyphens (-), and underscores (_).

    2. (Optional) For Execution timeout, enter the amount of time (in seconds) after which an unfinished job is terminated.

      Important

      The minimum timeout is 60 seconds.

    3. (Optional) A tag is a label that's assigned to a resource. To add a tag, expand Tags, then choose Add tag. Enter a key-value pair, and then choose Add tag again.

      Important

      If you choose Add tag, you must enter a key-value pair and choose Add tag again or choose Remove tag.

    4. (Optional) Turn on Propagate tags to propagate tags to the Amazon Elastic Container Service task.

  2. In the Container configuration section:

    1. For Image, enter the name of the image that's used to launch the container. By default, all the images in the Docker Hub registry are available. You can also specify other repositories in repository-url/image:tag format. The parameter can be up to 255 characters in length. The parameter can contain uppercase and lowercase letters, numbers, hyphens (-), underscores (_), colons (:), periods (.), forward slashes (/), and number signs (#). The parameter maps to Image in the Create a container section of the Docker Remote API and the IMAGE parameter of docker run.

      Note

      The Docker image architecture must match the processor architecture of the compute resources that the image is scheduled on. For example, Arm-based Docker images can only run on Arm-based compute resources.

      • Images in Amazon ECR Public repositories use the full registry/repository[:tag] or registry/repository[@digest] naming conventions (for example, public.ecr.aws/registry_alias/my-web-app:latest).

      • Images in Amazon ECR repositories use the full registry/repository:tag naming convention (for example, aws_account_id.dkr.ecr.region.amazonaws.com/my-web-app:latest).

      • Images in official repositories on Docker Hub use a single name (for example, ubuntu or mongo).

      • Images in other repositories on Docker Hub are qualified with an organization name (for example, amazon/amazon-ecs-agent).

      • Images in other online repositories are qualified further by a domain name (for example, quay.io/assemblyline/ubuntu).

    2. For Command, specify the command to pass to the container. This parameter maps to Cmd in the Create a container section of the Docker Remote API and the COMMAND parameter to docker run. For more information about the Docker CMD parameter, see https://docs.docker.com/engine/reference/builder/#cmd.

      Note

      You can use parameter substitution default values and placeholders in your command. For more information, see Parameters.

    3. (Optional) For Execution role, specify an IAM role that grants the Amazon ECS container agents permission to make Amazon API calls on your behalf. This feature uses Amazon ECS IAM roles for tasks. For more information, see Amazon ECS task execution IAM roles in the Amazon Elastic Container Service Developer Guide.

    4. (Optional) For Job Role configuration, choose an IAM role that has permissions to the Amazon APIs. This feature uses Amazon ECS IAM roles for tasks. For more information, see IAM Roles for Tasks in the Amazon Elastic Container Service Developer Guide.

      Note

      Only roles that have the Amazon Elastic Container Service Task Role trust relationship are shown here. For more information about creating an IAM role for your Amazon Batch jobs, see Creating an IAM Role and Policy for your Tasks in the Amazon Elastic Container Service Developer Guide.

    5. (Optional) You can add parameters to the job definition as key-value mappings to override the job definition defaults. To add a parameter:

      1. For Parameters, choose Add parameter. Enter a key-value pair and then choose Add parameter again.

        Important

        If you choose Add parameter, you must configure at least one parameter or choose Remove parameter.

    6. In the Environment configuration section for vCPUs, specify the number of vCPUs to reserve for the container. This parameter maps to CpuShares in the Create a container section of the Docker Remote API and the --cpu-shares option to docker run. Each vCPU is equivalent to 1,024 CPU shares.

    7. For Memory, specify the hard limit (in MiB) of memory to present to the job container. If your container attempts to exceed the memory specified here, the container is stopped. This parameter maps to Memory in the Create a container section of the Docker Remote API and the --memory option to docker run.

    8. For Number of GPUs, choose the number of GPUs to reserve for the container.

    9. (Optional) For Environment variables configuration, choose Add environment variables to add environment variables to pass to the container. This parameter maps to Env in the Create a container section of the Docker Remote API and the --env option to docker run.

    10. (Optional) For Secrets, choose Add secret to add secrets as name-value pairs. These secrets are exposed in the container. For more information, see LogConfiguration:secretOptions.

    11. (Optional) In the Linux configuration section:

      1. For User, enter the user name to use inside the container. This parameter maps to User in the Create a container section of the Docker Remote API and the --user option to docker run.

      2. To give the job container elevated permissions on the host instance (similar to the root user), drag the Privileged slider to the right. This parameter maps to Privileged in the Create a container section of the Docker Remote API and the --privileged option to docker run.

      3. Turn on Enable init process to run an init process inside the container. This process forwards signals and reaps processes.

    12. (Optional) In the Filesystem configuration section:

      1. Turn on Enable read only filesystem to remove write access to the volume.

      2. For Shared memory size, enter the size (in MiB) of the /dev/shm volume.

      3. For Max swap size, enter the total amount of swap memory (in MiB) that the container can use.

      4. For Swappiness, enter a value between 0 and 100 to indicate the swappiness behavior of the container. If you don't specify a value and swapping is enabled, the value defaults to 60. For more information, see LinuxParameters:swappiness.

      5. (Optional) Expand Additional configuration.

      6. For Tmpfs, choose Add tmpfs to add a tmpfs mount.

      7. For Devices, choose Add device to add a device:

        1. For Container path, specify the path inside the container that's used to expose the host device. If you keep this blank, the host path is used in the container.

        2. For Host path, specify the path of a device in the host instance.

        3. For Permissions, choose one or more permissions to apply to the device. The available permissions are READ, WRITE, and MKNOD.

      8. (Optional) For Ulimits configuration, choose Add ulimit to add a ulimits value for the container. Enter Name, Soft limit, and Hard limit values, and then choose Add ulimit.

  3. Choose Next.
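
A subset of the job definition settings above can be sketched as a request document. The example assumes the field names of the RegisterJobDefinition API operation (for example, containerProperties, resourceRequirements, and timeout.attemptDurationSeconds). The helper names and sample values are hypothetical. The checks mirror only the limits stated in this section: the name and image character rules, the 60-second minimum timeout, and the 1,024 CPU shares per vCPU conversion.

```python
import re

NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,128}")
# Image rule from the Container configuration step: up to 255 characters
# from the character set listed there.
IMAGE_PATTERN = re.compile(r"[A-Za-z0-9_.:/#-]{1,255}")
CPU_SHARES_PER_VCPU = 1024
MIN_TIMEOUT_SECONDS = 60


def cpu_shares(vcpus):
    """Convert a vCPU count to Docker CPU shares (1 vCPU = 1,024 shares)."""
    return vcpus * CPU_SHARES_PER_VCPU


def build_job_definition_request(
    name, image, command, vcpus, memory_mib,
    gpus=0, timeout_seconds=None, environment=None,
):
    """Return a dict shaped like a RegisterJobDefinition request (hypothetical helper)."""
    if not NAME_PATTERN.fullmatch(name):
        raise ValueError(f"invalid job definition name: {name!r}")
    if not IMAGE_PATTERN.fullmatch(image):
        raise ValueError(f"invalid image name: {image!r}")
    if timeout_seconds is not None and timeout_seconds < MIN_TIMEOUT_SECONDS:
        raise ValueError("the minimum timeout is 60 seconds")

    # Resource values are passed as strings in the request document.
    requirements = [
        {"type": "VCPU", "value": str(vcpus)},
        {"type": "MEMORY", "value": str(memory_mib)},
    ]
    if gpus:
        requirements.append({"type": "GPU", "value": str(gpus)})

    request = {
        "jobDefinitionName": name,
        "type": "container",
        "containerProperties": {
            "image": image,
            "command": list(command),
            "resourceRequirements": requirements,
            "environment": [
                {"name": key, "value": value}
                for key, value in (environment or {}).items()
            ],
        },
    }
    if timeout_seconds is not None:
        request["timeout"] = {"attemptDurationSeconds": timeout_seconds}
    return request


definition = build_job_definition_request(
    name="tutorial-job-definition",
    image="public.ecr.aws/registry_alias/my-web-app:latest",
    command=["echo", "hello world"],
    vcpus=2,
    memory_mib=2048,
    timeout_seconds=120,
    environment={"STAGE": "tutorial"},
)
print(cpu_shares(2))  # 2048
```

With Boto3, a dict of this shape can be passed as keyword arguments to the Batch client's register_job_definition method. Settings such as secrets, Linux parameters, and ulimits are left out to keep the sketch short.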

Create a job

To create a job, do the following:

  1. In the Job configuration section for Name, specify a unique name for the job. The name can be up to 128 characters in length. It can contain uppercase and lowercase letters, numbers, hyphens (-), and underscores (_).

  2. Choose Next.
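
Outside the wizard, a job is tied to the queue and job definition created earlier by name. The sketch below assumes the field names of the SubmitJob API operation (jobName, jobQueue, jobDefinition, and parameters). The helper name and the resource names are hypothetical.

```python
import re

# Step 1: up to 128 characters; letters, numbers, hyphens, and underscores.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,128}")


def build_submit_job_request(name, job_queue, job_definition, parameters=None):
    """Return a dict shaped like a SubmitJob request (hypothetical helper)."""
    if not NAME_PATTERN.fullmatch(name):
        raise ValueError(f"invalid job name: {name!r}")
    request = {
        "jobName": name,
        "jobQueue": job_queue,
        "jobDefinition": job_definition,
    }
    if parameters:
        # Submit-time parameters override the defaults in the job definition.
        request["parameters"] = dict(parameters)
    return request


job = build_submit_job_request(
    name="tutorial-job",
    job_queue="urgent-queue",                  # hypothetical queue name
    job_definition="tutorial-job-definition",  # hypothetical job definition name
    parameters={"target": "batch"},
)
print(sorted(job))  # ['jobDefinition', 'jobName', 'jobQueue', 'parameters']
```

With Boto3, a dict of this shape can be passed as keyword arguments to the Batch client's submit_job method.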

Review and create

On the Review and create page, review the configuration steps. If you need to make changes, choose Edit. When you're finished, choose Create resources.