Schedule a crawler to keep the Amazon Glue Data Catalog and Amazon S3 in sync
Amazon Glue crawlers can be set up to run on a schedule or on demand. For more information, see Time-based schedules for jobs and crawlers in the Amazon Glue Developer Guide.
If data for a partitioned table arrives at a fixed time, you can set up an Amazon Glue crawler to run on a schedule to detect new partitions and update the table. This can eliminate the need to run a potentially long and expensive MSCK REPAIR TABLE command or to manually run an ALTER TABLE ADD PARTITION command. For more information, see Table partitions in the Amazon Glue Developer Guide.
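
As an illustration of the scheduled approach, the following is a minimal sketch that builds a daily schedule expression and attaches it to an existing crawler through the AWS SDK for Python (boto3). The helper names glue_daily_cron and schedule_crawler, the crawler name, and the 02:30 UTC run time are assumptions made for this example and do not come from this guide. Choose a time shortly after your data normally lands, and confirm the cron syntax against the Amazon Glue Developer Guide before relying on it.

```python
def glue_daily_cron(hour_utc: int, minute: int = 0) -> str:
    """Build a Glue schedule expression that fires once a day (times are UTC)."""
    if not 0 <= hour_utc <= 23:
        raise ValueError("hour_utc must be in 0..23")
    if not 0 <= minute <= 59:
        raise ValueError("minute must be in 0..59")
    # Field order: minutes, hours, day-of-month, month, day-of-week, year.
    # One of day-of-month or day-of-week must be '?'.
    return f"cron({minute} {hour_utc} * * ? *)"


def schedule_crawler(crawler_name: str, hour_utc: int, minute: int = 0) -> str:
    """Attach a daily schedule to an existing crawler; return the expression used."""
    # Imported lazily so glue_daily_cron can be used without the SDK installed.
    import boto3

    expression = glue_daily_cron(hour_utc, minute)
    glue = boto3.client("glue")
    # Assumes the crawler already exists and credentials/region are configured.
    glue.update_crawler_schedule(CrawlerName=crawler_name, Schedule=expression)
    return expression


if __name__ == "__main__":
    # Prints: cron(30 2 * * ? *)
    print(glue_daily_cron(2, 30))
```

Calling schedule_crawler("my-partition-crawler", 2, 30) with a hypothetical crawler name would then run that crawler every day at 02:30 UTC. Building the expression separately from the API call keeps the schedule logic easy to check without AWS access.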