Amazon CloudFormation template for defining a Route 53 health check CloudWatch alarm properties Route 53 health check properties

Setting up the Route 53 health check for EventBridge global endpoints

When using global endpoints you have to have a Route 53 health check to monitor the status of your Regions. The following template defines a Amazon CloudWatch alarm and uses it to define a Route 53 health check.

Topics

Amazon CloudFormation template for defining a Route 53 health check
CloudWatch alarm template properties
Route 53 health check template properties

Amazon CloudFormation template for defining a Route 53 health check

Use the following template to define your Route 53 health check.


Description: |-
  Global endpoints health check that will fail when the average Amazon EventBridge 
  latency is above 30 seconds for a duration of 5 minutes. Note, missing data will 
  cause the health check to fail, so if you only send events intermittently, consider 
  changing the heath check to use a longer evaluation period or instead treat missing 
  data as 'missing' instead of 'breaching'.

Metadata:
  AWS::CloudFormation::Interface:
    ParameterGroups: 
      - Label: 
          default: "Global endpoint health check alarm configuration"
        Parameters:
          - HealthCheckName
          - HighLatencyAlarmPeriod
          - MinimumEvaluationPeriod
          - MinimumThreshold
          - TreatMissingDataAs
    ParameterLabels:
      HealthCheckName:
        default: Health check name
      HighLatencyAlarmPeriod:
        default: High latency alarm period
      MinimumEvaluationPeriod:
        default: Minimum evaluation period
      MinimumThreshold:
        default: Minimum threshold
      TreatMissingDataAs:
        default: Treat missing data as

Parameters:
  HealthCheckName:
    Description: Name of the health check
    Type: String
    Default: LatencyFailuresHealthCheck
  HighLatencyAlarmPeriod:
    Description: The period, in seconds, over which the statistic is applied. Valid values are 10, 30, 60, and any multiple of 60.
    MinValue: 10
    Type: Number
    Default: 60
  MinimumEvaluationPeriod:
    Description: The number of periods over which data is compared to the specified threshold. You must have at least one evaluation period.
    MinValue: 1
    Type: Number
    Default: 5
  MinimumThreshold:
    Description: The value to compare with the specified statistic.
    Type: Number
    Default: 30000
  TreatMissingDataAs:
    Description: Sets how this alarm is to handle missing data points.
    Type: String
    AllowedValues:
      - breaching
      - notBreaching
      - ignore
      - missing
    Default: breaching  

Mappings:
  "InsufficientDataMap":
    "missing":
      "HCConfig": "LastKnownStatus"
    "breaching":
      "HCConfig": "Unhealthy"  

Resources:
  HighLatencyAlarm:
      Type: AWS::CloudWatch::Alarm
      Properties:
        AlarmDescription: High Latency in Amazon EventBridge
        MetricName: IngestionToInvocationStartLatency
        Namespace: AWS/Events
        Statistic: Average
        Period: !Ref HighLatencyAlarmPeriod
        EvaluationPeriods: !Ref MinimumEvaluationPeriod
        Threshold: !Ref MinimumThreshold
        ComparisonOperator: GreaterThanThreshold
        TreatMissingData: !Ref TreatMissingDataAs

  LatencyHealthCheck:
      Type: AWS::Route53::HealthCheck
      Properties:
        HealthCheckTags:
          - Key: Name
            Value: !Ref HealthCheckName
        HealthCheckConfig:
          Type: CLOUDWATCH_METRIC
          AlarmIdentifier:
            Name:
              Ref: HighLatencyAlarm
            Region: !Ref AWS::Region
          InsufficientDataHealthStatus: !FindInMap [InsufficientDataMap, !Ref TreatMissingDataAs, HCConfig]

Outputs:
  HealthCheckId:
    Description: The identifier that Amazon Route 53 assigned to the health check when you created it.
    Value: !GetAtt LatencyHealthCheck.HealthCheckId

Event IDs can change across API calls so correlating events across Regions requires you to have an immutable, unique identifier. Consumers should also be designed with idempotency in mind. That way, if you're replicating events, or replaying them from archives, there are no side effects from the events being processed in both Regions.

CloudWatch alarm template properties

Note

For all editable fields, consider your throughput per second. If you only send events intermittently, consider changing the heath check to use a longer evaluation period or instead treat missing data as missing instead of breaching.

The following properties are used in th CloudWatch alarm section of the template:

Metric	Description
`AlarmDescription`	The description of the alarm. Default: `High Latency in Amazon EventBridge`
`MetricName`	The name of the metric associated with the alarm. This is required for an alarm based on a metric. For an alarm based on a math expression, you use `Metrics` instead and you can't specify `MetricName`. Default: IngestionToInvocationStartLatency
`Namespace`	The namespace of the metric associated with the alarm. This is required for an alarm based on a metric. For an alarm based on a math expression, you can't specify `Namespace` and you use `Metrics` instead. Default: `AWS/Events`
`Statistic`	The statistic for the metric associated with the alarm, other than percentile. Default: Average
`Period`	The period, in seconds, over which the statistic is applied. This is required for an alarm based on a metric. Valid values are 10, 30, 60, and any multiple of 60. Default: `60`
`EvaluationPeriods`	The number of periods over which data is compared to the specified threshold. If you are setting an alarm that requires that a number of consecutive data points be breaching to trigger the alarm, this value specifies that number. If you are setting an "M out of N" alarm, this value is the N, and `DatapointsToAlarm` is the M. Default: `5`
`Threshold`	The value to compare with the specified statistic. Default: `30,000`
`ComparisonOperator`	The arithmetic operation to use when comparing the specified statistic and threshold. The specified statistic value is used as the first operand. Default: `GreaterThanThreshold`
`TreatMissingData`	Sets how this alarm is to handle missing data points. Valid values: `breaching`, `notBreaching`, `ignore`, and `missing` Default: `breaching`

Route 53 health check template properties

Note

The following properties are used in th Route 53 health check section of the template:

Metric Description

Metric	Description
`HealthCheckName`	The name of the health check. Default: `LatencyFailuresHealthCheck`
`InsufficientDataHealthStatus`	When CloudWatch has insufficient data about the metric to determine the alarm state, the status that you want Amazon Route 53 to assign to the health check Valid values: `Healthy`: Route 53 considers the health check to be healthy. `Unhealthy`: Route 53 considers the health check to be unhealthy. `LastKnownStatus`: Route 53 uses the status of the health check from the last time that CloudWatch had sufficient data to determine the alarm state. For new health checks that have no last known status, the default status for the health check is healthy. Default: Unhealthy Note This field is updated based on the input to the `TreatMissingData` field. If `TreatingMissingData` is set to `Missing`, it will be updated to `LastKnownStatus`.If `TreatingMissingData` is set to `Breaching`, it will be updated to `Unhealthy`.

HealthCheckName

The name of the health check.

Default: LatencyFailuresHealthCheck

InsufficientDataHealthStatus

When CloudWatch has insufficient data about the metric to determine the alarm state, the status that you want Amazon Route 53 to assign to the health check

Valid values:

Healthy: Route 53 considers the health check to be healthy.
Unhealthy: Route 53 considers the health check to be unhealthy.
LastKnownStatus: Route 53 uses the status of the health check from the last time that CloudWatch had sufficient data to determine the alarm state. For new health checks that have no last known status, the default status for the health check is healthy.

Default: Unhealthy

Note

This field is updated based on the input to the TreatMissingData field. If TreatingMissingData is set to Missing, it will be updated to LastKnownStatus.If TreatingMissingData is set to Breaching, it will be updated to Unhealthy.

Javascript is disabled or is unavailable in your browser.

To use the Amazon Web Services Documentation, Javascript must be enabled. Please refer to your browser's Help pages for instructions.

Best practices

Logging event buses