Query Apache Iceberg tables
You can use Athena to perform read, time travel, write, and DDL queries on Apache Iceberg tables. The Iceberg tables must use the Apache Parquet format for data and the Amazon Glue catalog for their metastore.
Apache Iceberg
For more information about Apache Iceberg, see https://iceberg.apache.org/
Considerations and limitations
Athena support for Iceberg tables has the following considerations and limitations:
-
Iceberg version support – Athena supports Apache Iceberg version 1.4.2.
-
Tables with Amazon Glue catalog only – Athena supports only Iceberg tables created against the Amazon Glue catalog according to the specifications defined by the open source Glue catalog implementation.
-
Table locking support by Amazon Glue only – Unlike the open source Glue catalog implementation, which supports plug-in custom locking, Athena supports Amazon Glue optimistic locking only. Using Athena to modify an Iceberg table with any other lock implementation can cause data loss and break transactions.
-
Supported file formats – Iceberg file format support in Athena depends on the Athena engine version, as shown in the following table.

| Athena engine version | Parquet | ORC | Avro |
| --- | --- | --- | --- |
| 2 | Yes | No | No |
| 3 | Yes | Yes | Yes |
-
Iceberg restricted metadata – Lake Formation does not evaluate the Iceberg metadata tables. As a result, the Iceberg metadata tables are restricted if any Lake Formation row or cell filters are present on the base table, or if you do not have permission to view all columns in the base table. In these cases, queries on the $partitions, $files, $manifests, and $snapshots Iceberg metadata tables fail with an AccessDeniedException error. Additionally, the metadata column $path has the same Lake Formation restrictions and fails when selected by the query. All other metadata tables can be queried regardless of the Lake Formation filters. For more information, see Metadata tables.
-
Iceberg v2 tables – Athena only creates and operates on Iceberg v2 tables. For the difference between v1 and v2 tables, see Format version changes in the Apache Iceberg documentation.
-
Display of time types without time zone – The time and timestamp without time zone types are displayed in UTC. If the time zone is unspecified in a filter expression on a time column, UTC is used.
-
Timestamp-related data precision – Although Iceberg supports microsecond precision for the timestamp data type, Athena supports only millisecond precision for timestamps in both reads and writes. For data in time-related columns that is rewritten during manual compaction operations, Athena retains only millisecond precision.
-
Unsupported operations – The following Athena operations are not supported for Iceberg tables.
-
Views – Use CREATE VIEW to create Athena views as described in Work with views. If you are interested in using the Iceberg view specification to create views, contact athena-feedback@amazon.com.
-
TTF management commands not supported in Amazon Lake Formation – Although you can use Lake Formation to manage read access permissions for transactional table formats (TTFs) like Apache Iceberg, Apache Hudi, and Linux Foundation Delta Lake, you cannot use Lake Formation to manage permissions for operations like VACUUM, MERGE, UPDATE, or OPTIMIZE with these table formats. For more information about Lake Formation integration with Athena, see Using Amazon Lake Formation with Amazon Athena in the Amazon Lake Formation Developer Guide.
-
Partitioning by nested fields – Partitioning by nested fields is not supported. Attempting to do so produces the message NOT_SUPPORTED: Partitioning by nested field is unsupported: column_name.nested_field_name.
-
Skipping S3 Glacier objects not supported – If objects in the Apache Iceberg table are in an Amazon S3 Glacier storage class, setting the read_restored_glacier_objects table property to false has no effect.
For example, suppose you issue the following command:
ALTER TABLE table_name SET TBLPROPERTIES ('read_restored_glacier_objects' = 'false')
For Iceberg and Delta Lake tables, the command produces the error Unsupported table property key: read_restored_glacier_objects. For Hudi tables, the ALTER TABLE command does not produce an error, but Amazon S3 Glacier objects are still not skipped. Running SELECT queries after the ALTER TABLE command continues to return all objects.
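Two of the limitations in this list, file format support by engine version and the Lake Formation restrictions on metadata tables, can be encoded as a simple pre-flight check before you submit a query. The following Python sketch is illustrative only: the function and constant names are hypothetical and are not part of any Athena API, and the $history metadata table in the usage lines is an assumed example of a metadata table outside the restricted set.

```python
# Illustrative sketch only: these names are hypothetical, not an Athena API.
# The data mirrors the limitations described in this section.

# Iceberg file formats supported by each Athena engine version.
SUPPORTED_FORMATS = {
    2: {"PARQUET"},
    3: {"PARQUET", "ORC", "AVRO"},
}

# Metadata tables that fail with AccessDeniedException when Lake Formation
# row or cell filters, or column restrictions, apply to the base table.
RESTRICTED_METADATA_TABLES = {"partitions", "files", "manifests", "snapshots"}


def is_format_supported(engine_version: int, file_format: str) -> bool:
    """Return True if the engine version supports the Iceberg file format."""
    return file_format.upper() in SUPPORTED_FORMATS.get(engine_version, set())


def metadata_query_allowed(metadata_table: str, has_lf_restrictions: bool) -> bool:
    """Return False for metadata tables blocked under Lake Formation restrictions."""
    name = metadata_table.lstrip("$").lower()
    return not (has_lf_restrictions and name in RESTRICTED_METADATA_TABLES)


if __name__ == "__main__":
    print(is_format_supported(2, "orc"))   # engine version 2 supports Parquet only
    print(is_format_supported(3, "orc"))
    print(metadata_query_allowed("$files", has_lf_restrictions=True))
    # "$history" is an assumed example of a metadata table outside the restricted set.
    print(metadata_query_allowed("$history", has_lf_restrictions=True))
```

A check like this only reflects the rules as stated in this section. Athena itself remains the source of truth and returns the actual error at query time.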
If you would like Athena to support a particular feature, send feedback to athena-feedback@amazon.com.
Topics
- Create Iceberg tables
- Query Iceberg table data
- Perform time travel and version travel queries
- Update Iceberg table data
- Manage Iceberg tables
- Evolve Iceberg table schema
- Perform other DDL operations on Iceberg tables
- Optimize Iceberg tables
- Supported data types for Iceberg tables in Athena
- Additional resources