Amazon Glue versions

You can configure the Amazon Glue version parameter when you add or update a job. The Amazon Glue version determines the versions of Apache Spark and Python that Amazon Glue supports. The Python version listed is the one supported for jobs of type Spark. The following table lists the available Amazon Glue versions, the corresponding Spark and Python versions, and other changes in functionality.
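As an illustration of setting the version parameter when adding a job, the following is a minimal sketch using the boto3 `create_job` call and its `GlueVersion` argument. The helper names `build_job_args` and `create_job`, the job name, the role ARN, and the script location are hypothetical placeholders, not values from this page.

```python
from typing import Any, Dict


def build_job_args(name: str, role_arn: str, script_location: str,
                   glue_version: str = "5.1") -> Dict[str, Any]:
    """Assemble keyword arguments for the Glue CreateJob API.

    Pinning GlueVersion explicitly avoids depending on the default
    version that applies when the parameter is omitted.
    """
    return {
        "Name": name,
        "Role": role_arn,
        "Command": {
            "Name": "glueetl",  # job of type Spark
            "ScriptLocation": script_location,
            "PythonVersion": "3",
        },
        "GlueVersion": glue_version,
    }


def create_job(args: Dict[str, Any]) -> str:
    """Submit the job definition; needs boto3 and valid credentials."""
    import boto3  # imported lazily so build_job_args has no dependencies

    glue = boto3.client("glue")
    return glue.create_job(**args)["Name"]
```

A call might look like `create_job(build_job_args("my-etl-job", "<role ARN>", "s3://<bucket>/script.py"))`. Building the arguments separately from the API call makes the job definition easy to inspect before anything is submitted.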

You can use Generative AI upgrades for Apache Spark to upgrade your Glue ETL jobs from older Glue versions (2.0 or later) to the latest Glue version.

Amazon Glue version	Supported runtime environment versions	Supported Java version	Changes in functionality
Amazon Glue 5.1	Spark 3.5.6, Python 3.11, Scala 2.12.18	Java 17	Amazon Glue 5.1 is the default version for jobs created without specifying an Amazon Glue version. In addition to the Spark engine upgrade, this release includes optimizations and upgrades such as: Open Table Formats (OTF) updated to Hudi 1.0.2, Iceberg 1.10.0, and Delta Lake 3.3.2; Iceberg format version 3.0; Hudi Full Table Access (FTA) with reads and writes; Spark-native fine-grained access control (FGAC) DDL/DML operations on Hive, Iceberg, and Delta Lake tables registered in Lake Formation; audit context for Glue and Lake Formation API calls in CloudTrail logs; Iceberg Materialized Views support.
Amazon Glue 5.0	Spark 3.5.4, Python 3.11, Scala 2.12.18	Java 17	In addition to the framework updates, this release includes optimizations and upgrades such as: Amazon SageMaker Unified Studio support; Amazon SageMaker Lakehouse support; Open Table Formats (OTF) updated to Hudi 0.15.0, Iceberg 1.7.1, and Delta Lake 3.3.0; Spark-native fine-grained access control using Lake Formation; Amazon S3 Access Grants support; `requirements.txt` support to install additional Python libraries; data lineage support in Amazon DataZone; Amazon S3 Table Bucket support; Amazon Glue Data Catalog multi-dialect view support. Limitations: Glue Dynamic Frame / `GlueContext`-based table-level access control with Amazon Lake Formation permissions, supported in Glue 4.0 and earlier, is not supported in Glue 5.0. Use the new Spark-native fine-grained access control (FGAC) in Glue 5.0 instead. For more information about migrating to Amazon Glue version 5.0, see Migrating Amazon Glue for Spark jobs to Amazon Glue version 5.0.
Amazon Glue 4.0	Spark environment versions: Spark 3.3.0, Python 3.10	Java 8	Amazon Glue 4.0 includes a number of optimizations and upgrades, such as: many Spark functionality upgrades from Spark 3.1 to Spark 3.3, including several functionality improvements when paired with Pandas (for more information, see What's New in Spark 3.3); additional optimizations developed on Amazon EMR; upgrade to EMR File System (EMRFS) 2.53; Log4j 2 migration from Log4j 1.x; several Python module updates from Amazon Glue 3.0, such as an upgraded version of Boto; upgrade of several connectors, including the default Amazon Redshift connector (see Appendix C: Connector upgrades); upgrade of several JDBC drivers (see Appendix B: JDBC driver upgrades); a new Amazon Redshift connector and JDBC driver; native support for open-data lake frameworks with Apache Hudi, Delta Lake, and Apache Iceberg; native support for the Amazon S3-based Cloud Shuffle Storage Plugin (an Apache Spark plugin) to use Amazon S3 for shuffling and elastic storage capacity. Limitations: Amazon Glue machine learning and personally identifiable information (PII) transforms are not yet available in Amazon Glue 4.0. For more information about migrating to Amazon Glue version 4.0, see Migrating Amazon Glue for Spark jobs to Amazon Glue version 4.0.
Amazon Glue 4.0	Ray environment versions: Ray 2.4.0, Python 3.9	N/A	Build and run distributed Python applications with Amazon Glue for Ray. This release: supports Ray-2.4.0 data distribution (`ray[data]`) with Python 3.9 (for more information on this Ray release, see Ray-2.4.0 in the Ray GitHub repository); supports installing additional Python libraries into the `Ray2.4` runtime environment (see Additional Python modules for Ray jobs); integrates logs and metrics from Ray jobs with Amazon CloudWatch (see Troubleshooting Amazon Glue for Ray errors from logs and Monitoring Ray jobs with metrics); aggregates and visualizes metrics for Ray jobs in Amazon Glue Studio, on each job run page; supports distributing files to each working directory across your cluster, spilling objects from the Ray object store to Amazon S3, and controlling the minimum number of worker nodes allocated to your Ray job (see Using job parameters in Ray jobs). Limitations on Ray jobs in Amazon Glue 4.0: Amazon Glue interactive sessions for Ray remain in preview for this release. Amazon Glue for Ray integration with Amazon VPC is not currently available, so resources in a VPC are not accessible without a public route. For more information about using Amazon Glue with Amazon VPC, see Configuring interface VPC endpoints (Amazon PrivateLink) for Amazon Glue (Amazon PrivateLink). Amazon Glue for Ray is available in US East (N. Virginia), US East (Ohio), US West (Oregon), Asia Pacific (Tokyo), and Europe (Ireland).
Amazon Glue 3.0	Spark 3.1.1, Python 3.7	Java 8	In addition to the Spark engine upgrade to 3.0, this release includes optimizations and upgrades such as: builds the Amazon Glue ETL Library against Spark 3.0, which is a major release for Spark; streaming jobs are supported on Amazon Glue 3.0; new Amazon Glue Spark runtime optimizations for performance and reliability, including faster in-memory columnar processing based on Apache Arrow for reading CSV data and SIMD-based execution for vectorized reads with CSV data; additional optimizations developed on Amazon EMR, included with the Spark upgrade; EMRFS upgraded from 2.38 to 2.46, enabling new features and bug fixes for Amazon S3 access; several dependencies upgraded as required for the new Spark version; JDBC drivers upgraded for natively supported data sources. Limitations: Amazon Glue machine learning transforms are not yet available in Amazon Glue 3.0. Some custom Spark connectors do not work with Amazon Glue 3.0 if they depend on Spark 2.4 and are not compatible with Spark 3.1.
Amazon Glue 2.0 (end of life on April 1, 2026)	Spark 2.4.3, Python 3.7	N/A	In addition to the features provided in Amazon Glue version 1.0, Amazon Glue version 2.0 provides: an upgraded infrastructure for running Apache Spark ETL jobs in Amazon Glue with reduced startup times; real-time default logging, with separate streams for drivers and executors, and for outputs and errors; support for specifying additional Python modules or different versions at the job level. Note: Amazon Glue version 2.0 differs from Amazon Glue version 1.0 for some dependencies and versions due to underlying architectural changes. Validate your Amazon Glue jobs before migrating across major Amazon Glue version releases.
Amazon Glue 1.0 (end of life on April 1, 2026)	Spark 2.4.3, Python 2.7, Python 3.6	N/A	You can maintain job bookmarks for Parquet and ORC formats in Amazon Glue ETL jobs (using Amazon Glue version 1.0). Previously, you could bookmark only common Amazon S3 source formats such as JSON, CSV, Apache Avro, and XML. When setting format options for ETL inputs and outputs, you can specify Apache Avro reader/writer format 1.8 to support Avro logical type reading and writing (using Amazon Glue version 1.0). Previously, only the version 1.7 Avro reader/writer format was supported. The DynamoDB connection type supports a writer option (using Amazon Glue version 1.0). Limitations: Amazon Glue versions 0.9 and 1.0 are not available in the Asia Pacific (Jakarta) (`ap-southeast-3`), Middle East (UAE) (`me-central-1`), or other new Regions going forward.
Amazon Glue 0.9 (end of life on April 1, 2026)	Spark 2.2.1, Python 2.7	N/A	Limitations: Amazon Glue versions 0.9 and 1.0 are not available in the Asia Pacific (Jakarta) (`ap-southeast-3`), Middle East (UAE) (`me-central-1`), or other new Regions going forward.
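
The Spark rows of the table can also be kept as a small lookup, for example to flag jobs that are pinned to a version with an announced end-of-life date. The following is a sketch only: the names `Runtime`, `SPARK_RUNTIMES`, and `needs_migration` are hypothetical, and the values are transcribed from the table above rather than fetched from any API.

```python
from typing import Dict, NamedTuple, Optional


class Runtime(NamedTuple):
    spark: str
    python: str
    java: Optional[str]   # None where the table lists N/A
    eol_announced: bool   # True where the table states an end-of-life date


# Spark job runtimes, keyed by Amazon Glue version (transcribed from the table).
SPARK_RUNTIMES: Dict[str, Runtime] = {
    "5.1": Runtime("3.5.6", "3.11", "17", False),
    "5.0": Runtime("3.5.4", "3.11", "17", False),
    "4.0": Runtime("3.3.0", "3.10", "8", False),
    "3.0": Runtime("3.1.1", "3.7", "8", False),
    "2.0": Runtime("2.4.3", "3.7", None, True),
    "1.0": Runtime("2.4.3", "2.7, 3.6", None, True),
    "0.9": Runtime("2.2.1", "2.7", None, True),
}


def needs_migration(glue_version: str) -> bool:
    """Return True if the version has an announced end-of-life date.

    Raises KeyError for versions that are not in the table.
    """
    return SPARK_RUNTIMES[glue_version].eol_announced
```

For example, `needs_migration("2.0")` is true because the table marks Amazon Glue 2.0 with an end-of-life date, while `needs_migration("5.1")` is false.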

Note

The following Glue versions support these versions of PythonShell:

PythonShell v3.6 is supported in Glue version 1.0.
PythonShell v3.9 is supported in Glue version 3.0.

Additionally, dev endpoints are supported only in Glue versions 0.9 and 1.0.
