Crawler scheduler API

The Crawler scheduler API describes the Amazon Glue crawler schedule data type, along with the API for updating, starting, and stopping crawler schedules.

Data types

Schedule structure

A scheduling object using a cron statement to schedule an event.

Fields

ScheduleExpression – UTF-8 string.

A cron expression used to specify the schedule (see Time-Based Schedules for Jobs and Crawlers). For example, to run something every day at 12:15 UTC, specify: cron(15 12 * * ? *).
State – UTF-8 string (valid values: SCHEDULED | NOT_SCHEDULED | TRANSITIONING).

The state of the schedule.
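
As an illustration of the ScheduleExpression format, the following minimal sketch builds a daily cron expression like the one in the example above. The helper name daily_cron is hypothetical and not part of the Glue API; it only formats a string.

```python
def daily_cron(hour: int, minute: int) -> str:
    """Build a cron expression that fires once a day at hour:minute UTC.

    daily_cron is a hypothetical helper for illustration, not a Glue API call.
    """
    if not 0 <= hour <= 23:
        raise ValueError("hour must be in 0..23")
    if not 0 <= minute <= 59:
        raise ValueError("minute must be in 0..59")
    # Field order: minutes, hours, day-of-month, month, day-of-week, year.
    return f"cron({minute} {hour} * * ? *)"


print(daily_cron(12, 15))  # cron(15 12 * * ? *)
```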

Operations

UpdateCrawlerSchedule action (Python: update_crawler_schedule)
StartCrawlerSchedule action (Python: start_crawler_schedule)
StopCrawlerSchedule action (Python: stop_crawler_schedule)

UpdateCrawlerSchedule action (Python: update_crawler_schedule)

Updates the schedule of a crawler using a cron expression.

Request

CrawlerName – Required: UTF-8 string, not less than 1 or more than 255 bytes long, matching the Single-line string pattern.

The name of the crawler whose schedule to update.
Schedule – UTF-8 string.

The updated cron expression used to specify the schedule (see Time-Based Schedules for Jobs and Crawlers). For example, to run something every day at 12:15 UTC, specify: cron(15 12 * * ? *).

Response

No Response parameters.

Errors

EntityNotFoundException
InvalidInputException
VersionMismatchException
SchedulerTransitioningException
OperationTimeoutException
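
A minimal sketch of calling this action from Python might look like the following. It assumes glue_client is a Glue client such as the one returned by boto3.client("glue"); the wrapper name set_crawler_schedule and the local length check are illustrative assumptions, not part of the API.

```python
from typing import Any


def set_crawler_schedule(glue_client: Any, crawler_name: str, cron_expression: str) -> None:
    """Point an existing crawler at a new cron schedule.

    glue_client is assumed to be a Glue client, e.g. boto3.client("glue").
    """
    # Mirror the documented CrawlerName constraint (1 to 255 bytes) before calling out.
    if not 1 <= len(crawler_name.encode("utf-8")) <= 255:
        raise ValueError("CrawlerName must be 1 to 255 bytes long")
    # The action has no response parameters, so nothing is returned here.
    glue_client.update_crawler_schedule(
        CrawlerName=crawler_name,
        Schedule=cron_expression,
    )


# Example call (requires boto3 and AWS credentials; the crawler name is a placeholder):
#   import boto3
#   set_crawler_schedule(boto3.client("glue"), "my-crawler", "cron(15 12 * * ? *)")
```

Passing the client in as a parameter keeps the wrapper easy to exercise with a stub; errors such as EntityNotFoundException are left to propagate to the caller.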

StartCrawlerSchedule action (Python: start_crawler_schedule)

Changes the schedule state of the specified crawler to SCHEDULED, unless the crawler is already running or the schedule state is already SCHEDULED.

Request

CrawlerName – Required: UTF-8 string, not less than 1 or more than 255 bytes long, matching the Single-line string pattern.

Name of the crawler to schedule.

Response

No Response parameters.

Errors

EntityNotFoundException
SchedulerRunningException
SchedulerTransitioningException
NoScheduleException
OperationTimeoutException
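
The following sketch shows one way to start a schedule while tolerating the case where it is already active. It assumes glue_client is a Glue client such as boto3.client("glue") and that SchedulerRunningException is reachable through the client's exceptions attribute; the function name ensure_schedule_started is hypothetical, and treating that error as a no-op is a design choice of this example rather than documented behavior.

```python
from typing import Any


def ensure_schedule_started(glue_client: Any, crawler_name: str) -> bool:
    """Return True if the schedule was started, False if it was already running.

    glue_client is assumed to be a Glue client, e.g. boto3.client("glue").
    """
    try:
        glue_client.start_crawler_schedule(CrawlerName=crawler_name)
        return True
    except glue_client.exceptions.SchedulerRunningException:
        # Assumption: this error means the schedule is already active,
        # so the caller's goal is already met.
        return False
```

Other listed errors, such as NoScheduleException or SchedulerTransitioningException, are deliberately not caught and propagate to the caller.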

StopCrawlerSchedule action (Python: stop_crawler_schedule)

Sets the schedule state of the specified crawler to NOT_SCHEDULED, but does not stop the crawler if it is already running.

Request

CrawlerName – Required: UTF-8 string, not less than 1 or more than 255 bytes long, matching the Single-line string pattern.

Name of the crawler whose schedule state to set.

Response

No Response parameters.

Errors

EntityNotFoundException
SchedulerNotRunningException
SchedulerTransitioningException
OperationTimeoutException
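
A corresponding sketch for stopping a schedule is shown below. As before, glue_client is assumed to be a Glue client such as boto3.client("glue"), SchedulerNotRunningException is assumed to be reachable through the client's exceptions attribute, and the function name ensure_schedule_stopped is hypothetical.

```python
from typing import Any


def ensure_schedule_stopped(glue_client: Any, crawler_name: str) -> bool:
    """Return True if the schedule was stopped, False if it was not running.

    glue_client is assumed to be a Glue client, e.g. boto3.client("glue").
    """
    try:
        glue_client.stop_crawler_schedule(CrawlerName=crawler_name)
        return True
    except glue_client.exceptions.SchedulerNotRunningException:
        # Assumption: this error means the schedule was not active,
        # so there is nothing to stop.
        return False
```

As noted above, this only changes the schedule state; a crawler run that is already in progress is not stopped by this call.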
