Limitations
Consider the following limitations before you use data lake frameworks with Amazon Glue.
- The following Amazon Glue GlueContext methods for DynamicFrame don't support reading and writing data lake framework tables. Use the GlueContext methods for DataFrame or the Spark DataFrame API instead.
  - create_dynamic_frame.from_catalog
  - write_dynamic_frame.from_catalog
  - getDynamicFrame
  - writeDynamicFrame
- The following GlueContext methods for DataFrame are supported with Lake Formation permission control:
  - create_data_frame.from_catalog
  - write_data_frame.from_catalog
  - getDataFrame
  - writeDataFrame
- Grouping small files is not supported.
- Job bookmarks are not supported.
- Apache Hudi 0.10.1 for Amazon Glue 3.0 doesn't support Hudi Merge on Read (MoR) tables.
- ALTER TABLE … RENAME TO is not available for Apache Iceberg 0.13.1 for Amazon Glue 3.0.
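As an illustration of the first two limitations, the sketch below reads a catalog table and writes it to another table using the DataFrame methods named above rather than the DynamicFrame ones. The function name `copy_catalog_table` and its parameters are hypothetical. The sketch assumes that `glue_context` is a GlueContext created in the job script (for example, `GlueContext(SparkContext.getOrCreate())`), and that any format-specific settings are passed in `write_options`.

```python
def copy_catalog_table(glue_context, database, source_table, target_table,
                       write_options=None):
    """Copy a catalog table using the DataFrame-based GlueContext methods.

    glue_context is assumed to be an awsglue GlueContext; it is passed in
    so that this helper does not create a Spark session itself.
    """
    # DataFrame-based read, used in place of create_dynamic_frame.from_catalog.
    frame = glue_context.create_data_frame.from_catalog(
        database=database,
        table_name=source_table,
    )

    # DataFrame-based write, used in place of write_dynamic_frame.from_catalog.
    glue_context.write_data_frame.from_catalog(
        frame=frame,
        database=database,
        table_name=target_table,
        additional_options=dict(write_options or {}),
    )
    return frame
```

Passing the context in as an argument keeps the helper free of job setup code, so it can be exercised with a stand-in object outside of Amazon Glue.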
Limitations for data lake format tables managed by Lake Formation permissions
The data lake formats are integrated with Amazon Glue ETL via Lake Formation permissions. Creating a DynamicFrame using create_dynamic_frame is not supported. For more information, see the following examples:
Note
The integration with Amazon Glue ETL via Lake Formation permissions for Apache Hudi, Apache Iceberg, and Delta Lake is supported only in Amazon Glue version 4.0.
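One way to catch unsupported DynamicFrame calls before a job runs is a plain text scan of the job script. The following is a minimal sketch, not part of Amazon Glue: the names `UNSUPPORTED_CALLS` and `find_unsupported_calls` are hypothetical, and the method list is taken from the limitations listed earlier in this section. Because it matches text only, it also flags calls that appear inside comments or strings.

```python
import re

# DynamicFrame methods listed in this section as unsupported for
# data lake framework tables.
UNSUPPORTED_CALLS = (
    "create_dynamic_frame.from_catalog",
    "write_dynamic_frame.from_catalog",
    "getDynamicFrame",
    "writeDynamicFrame",
)


def find_unsupported_calls(script_text):
    """Return (line_number, method_name) pairs for each unsupported call."""
    hits = []
    for line_no, line in enumerate(script_text.splitlines(), start=1):
        for name in UNSUPPORTED_CALLS:
            # Match the method name followed by an opening parenthesis.
            if re.search(r"\b" + re.escape(name) + r"\s*\(", line):
                hits.append((line_no, name))
    return hits
```

For example, calling `find_unsupported_calls(open("job.py").read())` on a hypothetical script `job.py` returns an empty list when the script uses only the DataFrame methods.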
Apache Iceberg has the best integration with Amazon Glue ETL via Lake Formation permissions. It supports almost all operations and includes SQL support.
Hudi supports most basic operations, with the exception of administrative operations. This is because such operations are generally performed by writing DataFrames, with options specified via additional_options. You need to use Amazon Glue APIs to create DataFrames for your operations, because Spark SQL is not supported.
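The sketch below shows how Hudi settings might be passed through additional_options when writing a DataFrame. The helper names `build_hudi_write_options` and `write_hudi_table` are hypothetical, and the default `upsert` operation is an assumption chosen for illustration; this section does not confirm which write operations are permitted under Lake Formation permissions. The option keys are standard Hudi write configuration keys.

```python
def build_hudi_write_options(table_name, record_key, precombine_field,
                             operation="upsert"):
    """Build a dict of Hudi write settings for additional_options."""
    return {
        "hoodie.table.name": table_name,
        # Assumption: "upsert" is used here only as an example value.
        "hoodie.datasource.write.operation": operation,
        "hoodie.datasource.write.recordkey.field": record_key,
        "hoodie.datasource.write.precombine.field": precombine_field,
    }


def write_hudi_table(glue_context, frame, database, table_name, hudi_options):
    """Write a Spark DataFrame to a cataloged Hudi table.

    glue_context is assumed to be an awsglue GlueContext created by the job.
    """
    glue_context.write_data_frame.from_catalog(
        frame=frame,
        database=database,
        table_name=table_name,
        additional_options=hudi_options,
    )
```

Keeping the option dictionary in its own function makes the Hudi settings easy to review separately from the write call.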
Delta Lake supports only reading, appending, and overwriting table data. Other tasks, such as updates, require Delta Lake's own libraries.
The following features are not available for Iceberg tables managed by Lake Formation permissions:
- Compaction using Amazon Glue ETL
- Spark SQL support via Amazon Glue ETL
The following are limitations of Hudi tables managed by Lake Formation permissions:
- Removal of orphaned files
The following are limitations of Delta Lake tables managed by Lake Formation permissions:
- All features other than inserting and reading from Delta Lake tables.