Enabling compaction
You can use the Lake Formation console, the Amazon Glue console, the Amazon CLI, or the Amazon API to enable compaction for your Apache Iceberg tables in the Data Catalog.
For new tables, you can choose Apache Iceberg as the table format and enable compaction when you create the table.
Compaction is disabled by default for new tables.
- Console
-
To enable compaction
-
Open the Lake Formation console at https://console.amazonaws.cn/lakeformation/ and sign in as a data lake administrator, the table creator, or a user who has been granted the glue:UpdateTable and lakeformation:GetDataAccess permissions on the table.
-
In the navigation pane, under Data Catalog, choose Tables.
On the Tables page, choose the table in an open table format for which you want to enable compaction. Then, from the Actions menu, choose Enable compaction.
-
You can also enable compaction by selecting the table and opening the Table details page. Choose the Table optimization tab in the lower section of the page, and then choose Enable compaction.
-
Next, select an existing IAM role from the drop-down list. The role must have the permissions shown in the Table optimization prerequisites section.
If you choose the Create a new IAM role option instead, the service creates a custom role with the required permissions to run compaction.
Follow the steps below to update an existing IAM role:
-
To update the permissions policy for the IAM role, open the IAM console and go to the IAM role that is used to run compaction.
-
In the Add permissions section, choose Create policy. In the newly opened browser window, create a new policy to use with your role.
On the Create policy page, choose the JSON tab. Copy the JSON code shown in the Table optimization prerequisites section into the policy editor field.
- Amazon CLI
-
The following example shows how to enable compaction. Replace the account ID with a valid Amazon account ID.
Replace the database name and table name with the names of your database and Iceberg table. Replace the roleArn value with the Amazon Resource Name (ARN) of the IAM role that has the required permissions to run compaction.
aws glue create-table-optimizer \
  --catalog-id 123456789012 \
  --database-name iceberg_db \
  --table-name iceberg_table \
  --table-optimizer-configuration '{"roleArn":"arn:aws:iam::123456789012:role/compaction_role", "enabled":true}' \
  --type compaction
- Amazon API
-
Call the CreateTableOptimizer operation to enable compaction for a table.
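As an illustration of the API route, the following is a minimal Python sketch that mirrors the CLI example. It assumes a boto3 Glue client whose `create_table_optimizer` method accepts the parameters shown. The helper names `build_compaction_request` and `enable_compaction` are hypothetical and exist only for this example, and the account ID, database, table, and role values are the same placeholders used in the CLI example.

```python
from typing import Any, Dict


def build_compaction_request(
    catalog_id: str, database: str, table: str, role_arn: str
) -> Dict[str, Any]:
    """Build the keyword arguments for a CreateTableOptimizer request."""
    return {
        "CatalogId": catalog_id,
        "DatabaseName": database,
        "TableName": table,
        "Type": "compaction",
        # Same configuration document that the CLI example passes as JSON.
        "TableOptimizerConfiguration": {"roleArn": role_arn, "enabled": True},
    }


def enable_compaction(
    glue_client: Any, catalog_id: str, database: str, table: str, role_arn: str
) -> Dict[str, Any]:
    """Send the request through a Glue client (assumed: boto3.client("glue"))."""
    request = build_compaction_request(catalog_id, database, table, role_arn)
    return glue_client.create_table_optimizer(**request)
```

With boto3 installed and credentials configured, a call might look like `enable_compaction(boto3.client("glue"), "123456789012", "iceberg_db", "iceberg_table", "arn:aws:iam::123456789012:role/compaction_role")`. Passing the client as an argument keeps the request-building logic easy to check without network access.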
After you enable compaction, the Table optimization tab shows the following compaction details after approximately 15 to 20 minutes:
- Start time
-
The time at which the compaction process started within Lake Formation. The value is a timestamp in UTC time.
- End time
-
The time at which the compaction process ended in Data Catalog. The value is a timestamp in UTC time.
- Status
-
The status of the compaction run. Values are success or fail.
- Files compacted
-
Total number of files compacted.
- Bytes compacted
-
Total number of bytes compacted.
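The same details can also be read programmatically. The sketch below assumes a boto3 Glue client with a `get_table_optimizer` method and a response whose `TableOptimizer.lastRun` entry carries `startTimestamp`, `endTimestamp`, `eventType`, and a `metrics` map with `NumberOfFilesCompacted` and `NumberOfBytesCompacted`. These field names are assumptions to verify against your SDK version, and the helper names are hypothetical. The status strings returned by the API may also differ from the labels shown in the console.

```python
from typing import Any, Dict


def summarize_last_run(response: Dict[str, Any]) -> Dict[str, Any]:
    """Map a GetTableOptimizer-style response onto the fields listed above."""
    last_run = response.get("TableOptimizer", {}).get("lastRun", {})
    metrics = last_run.get("metrics", {})
    return {
        "start_time": last_run.get("startTimestamp"),
        "end_time": last_run.get("endTimestamp"),
        "status": last_run.get("eventType"),
        # Assumed metric keys; missing keys yield None rather than an error.
        "files_compacted": metrics.get("NumberOfFilesCompacted"),
        "bytes_compacted": metrics.get("NumberOfBytesCompacted"),
    }


def get_compaction_details(
    glue_client: Any, catalog_id: str, database: str, table: str
) -> Dict[str, Any]:
    """Fetch the optimizer state and summarize its most recent run."""
    response = glue_client.get_table_optimizer(
        CatalogId=catalog_id,
        DatabaseName=database,
        TableName=table,
        Type="compaction",
    )
    return summarize_last_run(response)
```

If no run has completed yet, every field in the summary is `None`, which is consistent with the delay before details appear on the Table optimization tab.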