Deleting orphan files

The Amazon Glue Data Catalog lets you remove orphan files from your Iceberg tables. Orphan files are data or metadata files that the Iceberg table metadata no longer tracks but that still exist in the table's Amazon S3 location. They can accumulate over time as a result of operations such as compaction, partition drops, or table rewrites, and they consume storage space unnecessarily.

The orphan file deletion optimizer in Amazon Glue scans the table metadata and the files actually stored for the table, identifies files that the metadata does not reference, and deletes them to reclaim storage space.
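
The identification step described above amounts to a set difference between the files found in storage and the files referenced by the table metadata. The following is a minimal sketch of that idea only, not the optimizer's actual implementation; the function name `find_orphan_files`, the bucket name, and the file paths are hypothetical and chosen for illustration.

```python
def find_orphan_files(tracked_paths, stored_paths):
    """Return the stored paths that the table metadata does not reference.

    tracked_paths: paths referenced by the Iceberg table metadata.
    stored_paths:  paths actually present in the table's storage location.
    """
    tracked = set(tracked_paths)
    # A file is an orphan when it exists in storage but is not tracked.
    return sorted(p for p in set(stored_paths) if p not in tracked)


# Hypothetical example data (not taken from a real table).
tracked = [
    "s3://example-bucket/tbl/data/a.parquet",
    "s3://example-bucket/tbl/metadata/v2.metadata.json",
]
stored = tracked + ["s3://example-bucket/tbl/data/stale.parquet"]

orphans = find_orphan_files(tracked, stored)
print(orphans)
```

In this sketch only `stale.parquet` is reported, because it is the one stored file that the tracked list does not contain.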

To start orphan file deletion, create an orphan file deletion table optimizer for the table in the Data Catalog.
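
As one possible illustration, an optimizer of this kind might be created programmatically with the AWS SDK for Python (boto3) through the Glue `create_table_optimizer` operation. This is a sketch under assumptions, not a definitive procedure: the account ID, database name, table name, role ARN, and Region below are placeholders, the helper function names are hypothetical, and the parameter shape should be checked against the current API reference before use.

```python
def build_orphan_optimizer_request(catalog_id, database, table, role_arn):
    """Build the request parameters for an orphan file deletion optimizer."""
    return {
        "CatalogId": catalog_id,          # account ID that owns the catalog
        "DatabaseName": database,
        "TableName": table,
        "Type": "orphan_file_deletion",   # optimizer type for orphan file deletion
        "TableOptimizerConfiguration": {
            "roleArn": role_arn,          # IAM role the optimizer assumes
            "enabled": True,
        },
    }


def create_orphan_optimizer(request, region_name):
    """Send the request to Glue. Requires boto3 and valid credentials."""
    import boto3  # imported here so building the request needs no SDK

    glue = boto3.client("glue", region_name=region_name)
    return glue.create_table_optimizer(**request)


# Placeholder values (assumptions, replace with your own).
request = build_orphan_optimizer_request(
    catalog_id="111122223333",
    database="example_db",
    table="example_iceberg_table",
    role_arn="arn:aws-cn:iam::111122223333:role/ExampleGlueOptimizerRole",
)

# Uncomment to create the optimizer in a real account:
# create_orphan_optimizer(request, region_name="cn-north-1")
```

Separating request construction from the API call keeps the parameters easy to inspect and review before anything is changed in the Data Catalog.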