Automating Amazon Glue with EventBridge - Amazon Glue
Automating Amazon Glue with EventBridge

You can use Amazon EventBridge to automate your Amazon services and respond automatically to system events such as application availability issues or resource changes. Events from Amazon services are delivered to EventBridge in near real time. You can write simple rules to indicate which events are of interest to you, and what automated actions to take when an event matches a rule. The actions that can be automatically triggered include the following:

  • Invoking an Amazon Lambda function

  • Invoking Amazon EC2 Run Command

  • Relaying the event to Amazon Kinesis Data Streams

  • Activating an Amazon Step Functions state machine

  • Notifying an Amazon SNS topic or an Amazon SQS queue

Some examples of using EventBridge with Amazon Glue include the following:

  • Activating a Lambda function when an ETL job succeeds

  • Notifying an Amazon SNS topic when an ETL job fails
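As an illustration of the second example, the following Python sketch builds an event pattern that a rule could use to select failed job runs. The "source" and "detail-type" values come from the events described in this topic. The detail field names state and jobName are assumptions that are not defined here, the job name nightly-etl is hypothetical, and the matches helper is only a simplified local stand-in for exact-value matching, not the EventBridge matching engine.

```python
import json


def glue_job_failure_pattern(job_names=None, states=("FAILED",)):
    """Build an event pattern that selects Glue job runs in the given states.

    Assumption: the event detail carries "state" and "jobName" fields.
    """
    detail = {"state": list(states)}
    if job_names:
        detail["jobName"] = list(job_names)
    return {
        "source": ["aws.glue"],
        "detail-type": ["Glue Job State Change"],
        "detail": detail,
    }


def matches(pattern, event):
    """Simplified local check: every pattern key must be present in the event,
    and the event value must be one of the listed values (exact match only)."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested pattern (for example "detail"): recurse into the event.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


# Hypothetical job name, used for illustration only.
pattern = glue_job_failure_pattern(job_names=["nightly-etl"])
print(json.dumps(pattern, indent=2))
```

The printed JSON is the kind of document you would supply as the event pattern of a rule, with the Amazon SNS topic attached as the rule target.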

The following EventBridge events are generated by Amazon Glue.

  • Events for "detail-type":"Glue Job State Change" are generated for SUCCEEDED, FAILED, TIMEOUT, and STOPPED.

  • Events for "detail-type":"Glue Job Run Status" are generated for RUNNING, STARTING, and STOPPING job runs when they exceed the job delay notification threshold. You must set the job delay notification threshold property to receive these events.

    Only one event is generated per job run status when the job delay notification threshold is exceeded.

  • Events for "detail-type":"Glue Crawler State Change" are generated for Started, Succeeded, and Failed.

  • Events for "detail-type":"Glue Scheduled Crawler Invocation Failure" are generated when the scheduled crawler fails to start. In the details of the notification:

    • customerId contains the account ID of the customer.

    • crawlerName contains the name of the crawler that failed to start.

    • errorMessage contains the exception message of the invocation failure.

  • Events for "detail-type":"Glue Auto Statistics Invocation Failure" are generated when the auto-managed column statistics task run fails to start. In the details of the notification:

    • catalogId contains the ID associated with a catalog.

    • databaseName contains the name of the affected database.

    • tableName contains the name of the affected table.

    • errorMessage contains the exception message of the invocation failure.

  • Events for "detail-type":"Glue Scheduled Statistics Invocation Failure" are generated when the (cron) scheduled column statistics task run fails to start. In the details of the notification:

    • catalogId contains the ID associated with a catalog.

    • databaseName contains the name of the affected database.

    • tableName contains the name of the affected table.

    • errorMessage contains the exception message of the invocation failure.

  • Events for "detail-type":"Glue Statistics Task Started" are generated when the column statistics task run starts.

  • Events for "detail-type":"Glue Statistics Task Succeeded" are generated when the column statistics task run succeeds.

  • Events for "detail-type":"Glue Statistics Task Failed" are generated when the column statistics task run fails.

  • Events for "detail-type":"Glue Data Catalog Database State Change" are generated for CreateDatabase, DeleteDatabase, CreateTable, DeleteTable, and BatchDeleteTable. For example, if a table is created or deleted, a notification is sent to EventBridge. Events are emitted on a best-effort basis, so they might arrive out of sequence or be missing. Do not write programs that depend on the order or existence of notification events. In the details of the notification:

    • The typeOfChange contains the name of the API operation.

    • The databaseName contains the name of the affected database.

    • The changedTables contains up to 100 names of affected tables per notification. When table names are long, multiple notifications might be created.

  • Events for "detail-type":"Glue Data Catalog Table State Change" are generated for UpdateTable, CreatePartition, BatchCreatePartition, UpdatePartition, DeletePartition, BatchUpdatePartition, and BatchDeletePartition. For example, if a table or partition is updated, a notification is sent to EventBridge. Events are emitted on a best-effort basis, so they might arrive out of sequence or be missing. Do not write programs that depend on the order or existence of notification events. In the details of the notification:

    • The typeOfChange contains the name of the API operation.

    • The databaseName contains the name of the database that contains the affected resources.

    • The tableName contains the name of the affected table.

    • The changedPartitions specifies up to 100 affected partitions in one notification. When partition names are long, multiple notifications might be created.

      For example, if there are two partition keys, Year and Month, then "2018,01" and "2018,02" identify the partition where "Year=2018" and "Month=01" and the partition where "Year=2018" and "Month=02".

      {
          "version": "0",
          "id": "abcdef00-1234-5678-9abc-def012345678",
          "detail-type": "Glue Data Catalog Table State Change",
          "source": "aws.glue",
          "account": "123456789012",
          "time": "2017-09-07T18:57:21Z",
          "region": "us-west-2",
          "resources": ["arn:aws:glue:us-west-2:123456789012:database/default/foo"],
          "detail": {
              "changedPartitions": [
                  "2018,01",
                  "2018,02"
              ],
              "databaseName": "default",
              "tableName": "foo",
              "typeOfChange": "BatchCreatePartition"
          }
      }
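A target such as a Lambda function typically needs to turn the changedPartitions strings back into partition key/value pairs. The following Python sketch shows one way to do that for the sample event above. The function name decode_partitions is hypothetical, the partition key names must be supplied by the caller because the event carries only values, and splitting on commas assumes that no partition value itself contains a comma.

```python
def decode_partitions(event, partition_keys):
    """Map each changedPartitions entry to a {key: value} dictionary.

    Assumptions: partition_keys is the table's partition key order, supplied
    by the caller, and partition values do not contain commas.
    """
    detail = event.get("detail", {})
    decoded = []
    for entry in detail.get("changedPartitions", []):
        values = entry.split(",")
        if len(values) != len(partition_keys):
            raise ValueError(
                f"expected {len(partition_keys)} values, got {entry!r}"
            )
        decoded.append(dict(zip(partition_keys, values)))
    return decoded


# Trimmed copy of the sample event shown above.
sample_event = {
    "detail-type": "Glue Data Catalog Table State Change",
    "source": "aws.glue",
    "detail": {
        "changedPartitions": ["2018,01", "2018,02"],
        "databaseName": "default",
        "tableName": "foo",
        "typeOfChange": "BatchCreatePartition",
    },
}

for partition in decode_partitions(sample_event, ["Year", "Month"]):
    print(partition)
# prints {'Year': '2018', 'Month': '01'}
# prints {'Year': '2018', 'Month': '02'}
```

Because events can be missing or out of sequence, a handler built on this sketch should treat each notification as a hint to re-read the current state of the table rather than as a complete change log.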

For more information, see the Amazon CloudWatch Events User Guide. For events specific to Amazon Glue, see Amazon Glue Events.