Using Apache Iceberg with Amazon EMR on EKS
The Iceberg runtime JAR contains the Iceberg classes required for Spark runtime support. The following procedure shows how to start a job run that uses the Iceberg Spark runtime.
To use Apache Iceberg with Amazon EMR on EKS applications
-
When you start a job run to submit a Spark job, include the Iceberg Spark runtime JAR file in the Spark submit parameters of the job driver:
--job-driver '{"sparkSubmitJobDriver" : {"sparkSubmitParameters" : "--jars local:///usr/share/aws/iceberg/lib/iceberg-spark3-runtime.jar"}}'
-
Include the additional Iceberg configuration in the application configuration:
--configuration-overrides '{ "applicationConfiguration": [ { "classification" : "spark-defaults", "properties" : { "spark.sql.catalog.dev.warehouse" : "s3://DOC-EXAMPLE-BUCKET/EXAMPLE-PREFIX/", "spark.sql.extensions" : "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions", "spark.sql.catalog.dev" : "org.apache.iceberg.spark.SparkCatalog", "spark.sql.catalog.dev.catalog-impl" : "org.apache.iceberg.aws.glue.GlueCatalog", "spark.sql.catalog.dev.io-impl" : "org.apache.iceberg.aws.s3.S3FileIO" } } ] }'
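The two fragments in this procedure are options to a single start-job-run call. The following is a minimal sketch of how they might be combined, not a definitive command. The virtual cluster ID, job name, execution role ARN, release label, and the entryPoint script path are hypothetical placeholders that do not appear in the procedure, so replace them with your own values. The catalog name dev and the Iceberg properties come from the steps above.

```shell
# Placeholder values (assumptions): replace each one with your own.
VIRTUAL_CLUSTER_ID="<virtual-cluster-id>"
JOB_NAME="iceberg-example-job"
EXECUTION_ROLE_ARN="<execution-role-arn>"
RELEASE_LABEL="<release-label>"

# Job driver: the entryPoint path is a hypothetical script location.
# The --jars value is the Iceberg Spark runtime JAR from step 1.
JOB_DRIVER='{
  "sparkSubmitJobDriver": {
    "entryPoint": "s3://DOC-EXAMPLE-BUCKET/scripts/iceberg-job.py",
    "sparkSubmitParameters": "--jars local:///usr/share/aws/iceberg/lib/iceberg-spark3-runtime.jar"
  }
}'

# Configuration overrides: the Iceberg settings from step 2,
# defining a Spark catalog named "dev" backed by AWS Glue and Amazon S3.
CONFIG_OVERRIDES='{
  "applicationConfiguration": [
    {
      "classification": "spark-defaults",
      "properties": {
        "spark.sql.catalog.dev.warehouse": "s3://DOC-EXAMPLE-BUCKET/EXAMPLE-PREFIX/",
        "spark.sql.extensions": "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions",
        "spark.sql.catalog.dev": "org.apache.iceberg.spark.SparkCatalog",
        "spark.sql.catalog.dev.catalog-impl": "org.apache.iceberg.aws.glue.GlueCatalog",
        "spark.sql.catalog.dev.io-impl": "org.apache.iceberg.aws.s3.S3FileIO"
      }
    }
  ]
}'

# Wrap the call in a function so that nothing is submitted
# until the placeholders have been filled in and the function is invoked.
start_iceberg_job_run() {
  aws emr-containers start-job-run \
    --virtual-cluster-id "$VIRTUAL_CLUSTER_ID" \
    --name "$JOB_NAME" \
    --execution-role-arn "$EXECUTION_ROLE_ARN" \
    --release-label "$RELEASE_LABEL" \
    --job-driver "$JOB_DRIVER" \
    --configuration-overrides "$CONFIG_OVERRIDES"
}
```

After you replace the placeholders, run start_iceberg_job_run to submit the job. Keeping the JSON in single-quoted shell variables avoids escaping the inner double quotes and makes each document easier to validate before submission.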
To learn more about the Apache Iceberg versions included in Amazon EMR releases, see Iceberg release history.