Work with multi-architecture images - Amazon EMR

Amazon EMR on EKS supports multi-architecture container images for Amazon Elastic Container Registry (Amazon ECR). For more information, see Introducing multi-architecture container images for Amazon ECR.

Amazon EMR on EKS custom images support both Amazon Graviton-based EC2 instances and non-Graviton-based EC2 instances. The Graviton-based images are stored in the same image repositories in Amazon ECR as non-Graviton-based images.

For example, to inspect the Docker manifest list for 6.6.0 images, run the following command.

docker manifest inspect 895885662937.dkr.ecr.us-west-2.amazonaws.com/spark/emr-6.6.0:latest

The output follows. The arm64 entry is the image for Graviton instances, and the amd64 entry is the image for non-Graviton instances.

{
  "schemaVersion": 2,
  "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
  "manifests": [
    {
      "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
      "size": 1805,
      "digest": "xxx123:6b971cb47d11011ab3d45fff925e9442914b4977ae0f9fbcdcf5cfa99a7593f0",
      "platform": {
        "architecture": "arm64",
        "os": "linux"
      }
    },
    {
      "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
      "size": 1805,
      "digest": "xxx123:6f2375582c9c57fa9838c1d3a626f1b4fc281e287d2963a72dfe0bd81117e52f",
      "platform": {
        "architecture": "amd64",
        "os": "linux"
      }
    }
  ]
}
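To check programmatically which architectures an image tag covers, the manifest list can be parsed as ordinary JSON. The following is a minimal Python sketch, not part of the original procedure. The function name `platforms` and the `SAMPLE` constant are illustrative, and the sample is a trimmed copy of the output above.

```python
import json


def platforms(manifest_list_json):
    """Map each architecture in a Docker manifest list to its image digest."""
    doc = json.loads(manifest_list_json)
    return {
        entry["platform"]["architecture"]: entry["digest"]
        for entry in doc.get("manifests", [])
    }


# Trimmed copy of the manifest list shown above (illustrative input only).
SAMPLE = """
{
  "schemaVersion": 2,
  "manifests": [
    {"digest": "xxx123:6b971cb47d11011ab3d45fff925e9442914b4977ae0f9fbcdcf5cfa99a7593f0",
     "platform": {"architecture": "arm64", "os": "linux"}},
    {"digest": "xxx123:6f2375582c9c57fa9838c1d3a626f1b4fc281e287d2963a72dfe0bd81117e52f",
     "platform": {"architecture": "amd64", "os": "linux"}}
  ]
}
"""

if __name__ == "__main__":
    for arch, digest in platforms(SAMPLE).items():
        print(arch, digest)
```

In practice, the JSON text could come from capturing the standard output of the docker manifest inspect command shown earlier.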

Take the following steps to create multi-architecture images:

  1. Create a Dockerfile with the following contents so that you can pull the arm64 image.

    FROM --platform=arm64 895885662937.dkr.ecr.us-west-2.amazonaws.com/spark/emr-6.6.0:latest
    USER root
    # Install customizations here
    RUN pip3 install boto3
    USER hadoop:hadoop
  2. Follow the instructions at Introducing multi-architecture container images for Amazon ECR to build a multi-architecture image.

    Note

    With a standard Docker build, you must build arm64 images on arm64 instances and amd64 images on amd64 instances.

    Alternatively, you can use the Docker buildx command to build multi-architecture images without building on an instance of each architecture. For more information, see Leverage multi-CPU architecture support.

  3. After you build the multi-architecture image, submit a job with the spark.kubernetes.container.image parameter set to that image, the same way you would for a single-architecture image. In a heterogeneous cluster with both Amazon Graviton-based and non-Graviton-based EC2 instances, each instance pulls the image variant that matches its own architecture.
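Steps 2 and 3 above can be sketched in Python as follows. This is an illustration under stated assumptions, not the documented procedure. The helper names, the image URI, and the S3 entry point are hypothetical. The buildx part assumes that a builder able to target both platforms is already configured, that Docker is authenticated to Amazon ECR, and that the Dockerfile's FROM line does not pin --platform (a pinned platform would likely make every variant use the arm64 base image). The dictionary returned by `job_driver` follows the shape of the jobDriver argument of the EMR on EKS StartJobRun API.

```python
import json
import shlex


def buildx_command(image_uri, platforms=("linux/amd64", "linux/arm64")):
    """Return the argv for a multi-architecture build that pushes to a registry."""
    return [
        "docker", "buildx", "build",
        "--platform", ",".join(platforms),
        "--tag", image_uri,
        "--push",
        ".",  # build context: the directory that holds the Dockerfile
    ]


def job_driver(image_uri, entry_point):
    """Return a jobDriver structure that points Spark at the custom image."""
    return {
        "sparkSubmitJobDriver": {
            "entryPoint": entry_point,
            "sparkSubmitParameters": (
                f"--conf spark.kubernetes.container.image={image_uri}"
            ),
        }
    }


# Hypothetical values, used only for illustration.
IMAGE = "111122223333.dkr.ecr.us-west-2.amazonaws.com/my-repo/emr-6.6.0-custom:latest"
ENTRY_POINT = "s3://amzn-s3-demo-bucket/scripts/job.py"

if __name__ == "__main__":
    # Print the build command instead of running it, so the sketch has no side effects.
    print(shlex.join(buildx_command(IMAGE)))
    print(json.dumps(job_driver(IMAGE, ENTRY_POINT), indent=2))
```

The same image URI is used for every node. The container runtime on each instance resolves the manifest list to the variant for its own architecture, so the job definition does not need per-architecture settings.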