Synchronize Delta Lake metadata

If you use Athena to create your Delta Lake table, Athena synchronizes table metadata, including schema, partition columns, and table properties, to Amazon Glue. Over time, this metadata can fall out of sync with the underlying table metadata in the transaction log. To keep your table up to date, you can choose one of the following options:

Use the Amazon Glue crawler for Delta Lake tables. For more information, see Introducing native Delta Lake table support with Amazon Glue crawlers in the Amazon Big Data Blog and Scheduling an Amazon Glue crawler in the Amazon Glue Developer Guide.
Drop and recreate the table in Athena.
Use the SDK, CLI, or Amazon Glue console to manually update the schema in Amazon Glue.
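
As an illustration of the SDK option, the following is a minimal sketch in Python with boto3, not an official procedure. It reads the current table definition with `get_table`, swaps in a new column list, and writes it back with `update_table`. The helper names `to_table_input` and `update_glue_columns` are hypothetical, and the list of keys copied into the payload is an assumption that should cover a simple external Delta Lake table; check the Amazon Glue API reference if your table uses other fields.

```python
import copy

# Keys copied from the get_table result into the update_table payload.
# get_table also returns read-only fields (for example DatabaseName and
# CreateTime) that update_table does not accept.
# Assumption: this subset is enough for a simple external Delta Lake table.
TABLE_INPUT_KEYS = (
    "Name", "Description", "Owner", "Retention",
    "StorageDescriptor", "PartitionKeys", "TableType", "Parameters",
)


def to_table_input(table, new_columns):
    """Build an update_table payload from a get_table result and new columns."""
    table_input = {
        key: copy.deepcopy(table[key]) for key in TABLE_INPUT_KEYS if key in table
    }
    table_input["StorageDescriptor"]["Columns"] = list(new_columns)
    return table_input


def update_glue_columns(database, name, new_columns):
    """Replace the column list of an existing Glue table (requires boto3)."""
    import boto3  # imported lazily so to_table_input works without the SDK

    glue = boto3.client("glue")
    table = glue.get_table(DatabaseName=database, Name=name)["Table"]
    glue.update_table(
        DatabaseName=database,
        TableInput=to_table_input(table, new_columns),
    )
```

A call such as `update_glue_columns("dbname", "tablename", [{"Name": "id", "Type": "bigint"}])` would replace the columns of the table; the database, table, and column names here are placeholders.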

Note that the following features require the schema in Amazon Glue to always match the schema in the transaction log:

Lake Formation
Views
Row and column filters
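
Because these features depend on the two schemas matching, it can help to check for drift before relying on them. The following is a rough sketch in Python, not an Athena or Amazon Glue feature. It assumes you have already read the lines of the JSON commit files under the table's `_delta_log/` folder, in commit order, and that the latest `metaData` action appears in those files rather than only in a Parquet checkpoint. The function names are hypothetical, and the comparison covers column names only, not types.

```python
import json


def delta_column_names(log_lines):
    """Return column names from the last metaData action in Delta log lines.

    Returns None if none of the lines contains a metaData action.
    """
    names = None
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        action = json.loads(line)
        if "metaData" in action:
            # schemaString is itself a JSON document describing a struct type.
            schema = json.loads(action["metaData"]["schemaString"])
            names = [field["name"] for field in schema["fields"]]
    return names


def schema_drift(glue_columns, delta_names):
    """Compare Glue column names with Delta column names, ignoring case.

    glue_columns is a list of {"Name": ..., "Type": ...} dicts. Assumption:
    the caller passes regular columns and partition keys together if the
    table is partitioned.
    """
    glue_lower = {col["Name"].lower() for col in glue_columns}
    delta_lower = {name.lower() for name in delta_names}
    return {
        "missing_in_glue": [n for n in delta_names if n.lower() not in glue_lower],
        "extra_in_glue": [
            col["Name"] for col in glue_columns
            if col["Name"].lower() not in delta_lower
        ],
    }
```

If both lists in the result are empty, the column names agree; any entry suggests that one of the synchronization options above is needed.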

If your workflow does not require Lake Formation, views, or row and column filters, and you prefer not to maintain this compatibility, you can use CREATE TABLE DDL in Athena and then add the Amazon S3 path as a SerDe parameter in Amazon Glue.

You can use the following procedure to create a Delta Lake table with the Athena and Amazon Glue consoles.

To create a Delta Lake table using the Athena and Amazon Glue consoles

Open the Athena console at https://console.amazonaws.cn/athena/.
In the Athena query editor, use the following DDL to create your Delta Lake table. Note that when using this method, the value for TBLPROPERTIES must be 'spark.sql.sources.provider' = 'delta' and not 'table_type' = 'delta'.

Note that this same schema (with a single column named col of type array<string>) is inserted when you use Apache Spark (Athena for Apache Spark) or most other engines to create your table.
```
CREATE EXTERNAL TABLE
   [db_name.]table_name(col array<string>)
   LOCATION 's3://amzn-s3-demo-bucket/your-folder/'
   TBLPROPERTIES ('spark.sql.sources.provider' = 'delta')
```
Open the Amazon Glue console at https://console.amazonaws.cn/glue/.
In the navigation pane, choose Data Catalog, Tables.
In the list of tables, choose the link for your table.
On the page for the table, choose Actions, Edit table.
In the Serde parameters section, add the key path with the value s3://amzn-s3-demo-bucket/your-folder/.
Choose Save.

To create a Delta Lake table using the Amazon CLI, enter a command like the following.


```
aws glue create-table --database-name dbname \
    --table-input '{"Name" : "tablename", "StorageDescriptor":{
            "Columns" : [
                {
                    "Name": "col",
                    "Type": "array<string>"
                }
            ],
            "Location" : "s3://amzn-s3-demo-bucket/<prefix>/",
            "SerdeInfo" : {
                "Parameters" : {
                    "serialization.format" : "1",
                    "path" : "s3://amzn-s3-demo-bucket/<prefix>/"
                }
            }
        },
        "PartitionKeys": [],
        "TableType": "EXTERNAL_TABLE",
        "Parameters": {
            "EXTERNAL": "TRUE",
            "spark.sql.sources.provider": "delta"
        }
    }'
```
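
If you prefer the SDK to the CLI, the same table definition can be sketched in Python with boto3. This is an illustrative translation of the command above, not an additional documented procedure; the helper names `delta_table_input` and `create_delta_table` are hypothetical, and the database name, table name, and S3 location are placeholders that you supply.

```python
def delta_table_input(table_name, s3_location):
    """Build the same TableInput structure that the CLI command above passes."""
    return {
        "Name": table_name,
        "StorageDescriptor": {
            # Placeholder schema: a single column named col of type array<string>.
            "Columns": [{"Name": "col", "Type": "array<string>"}],
            "Location": s3_location,
            "SerdeInfo": {
                "Parameters": {
                    "serialization.format": "1",
                    # The SerDe "path" parameter repeats the table location.
                    "path": s3_location,
                }
            },
        },
        "PartitionKeys": [],
        "TableType": "EXTERNAL_TABLE",
        "Parameters": {
            "EXTERNAL": "TRUE",
            "spark.sql.sources.provider": "delta",
        },
    }


def create_delta_table(database, table_name, s3_location):
    """Create the table in the Glue Data Catalog (requires boto3)."""
    import boto3  # imported lazily so delta_table_input works without the SDK

    boto3.client("glue").create_table(
        DatabaseName=database,
        TableInput=delta_table_input(table_name, s3_location),
    )
```

Building the payload in a separate function keeps the location and the `path` SerDe parameter in step, since both come from the same argument.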

