Tracking real-time performance and availability in Amazon CloudWatch Internet Monitor (Overview tab) - Amazon CloudWatch
Services or capabilities described in Amazon Web Services documentation might vary by Region. To see the differences applicable to the China Regions, see Getting Started with Amazon Web Services in China (PDF).

Tracking real-time performance and availability in Amazon CloudWatch Internet Monitor (Overview tab)

Use the Overview tab in the CloudWatch console, under Internet Monitor, to get a high-level view of performance and availability for the traffic that your monitor tracks. The tab also displays an internet traffic overview map, with traffic clusters that can help you visualize your application's global traffic, and the location and impact of health events.

Health scores

The Health scores graph shows you performance and availability information for your global traffic. Amazon has substantial historical data about internet performance and availability for network traffic between geographic locations for different ASNs and Amazon services. Internet Monitor uses the connectivity data that Amazon has captured from its global networking footprint to calculate a baseline of performance and availability for internet traffic. This is the same data that we use at Amazon to monitor our own internet uptime and availability.

With those measurements as a baseline, Internet Monitor can detect when the performance and availability for your application has dropped, compared to the baseline. To make it easier to see those drops, we report that information to you as a performance score and an availability score. For more information, see Exploring your data with CloudWatch tools and the Internet Monitor query interface.

The Health scores graph includes health events that occurred during a time period that you choose. When there's a health event, you see a drop in the performance or availability line on the graph. If you select the event, more details are displayed, and bands appear on the graph with date and time information that shows how long the event lasted.

You can also look at these metrics by accessing the log files directly for each data point. In the Actions menu, choose View CloudWatch Logs.

Internet traffic overview

The Internet traffic overview map shows you the internet traffic and health events that are specific to the locations and ASNs where your users access your application from. The countries that are gray on the map are those that include traffic for your application.

Each circle on the map indicates a health event in an area, for a time period that you select. Internet Monitor creates health events when it detects a problem, at a specific threshold, with connectivity between one of your resources hosted in Amazon and a city-network where a user is accessing your application. Choosing a circle on the map displays more details about the health event for that location. In addition, for clusters that have health events, you can see detailed information in the Health events table below the map.

Note that Internet Monitor creates health events in a monitor when it determines that an event has significant global impact on your application. If there aren't any health events that exceed the threshold for impact on traffic for client locations in the time period that you've selected, the map is blank. For more information, see When Internet Monitor creates and resolves health events.

Change health event thresholds

You can configure several options around how and when Internet Monitor creates health events for your application. Choose Update thresholds to make changes.

You can change the overall threshold that triggers Internet Monitor to create a health event. The default health event threshold, for both performance scores and availability scores, is 95%. That is, when the overall performance or availability score for your application falls to 95% or below, Internet Monitor creates a health event. For the overall threshold, the health event can be triggered by a single large issue, or by the combination of multiple smaller issues.
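As a minimal sketch of the overall threshold rule described above, the comparison can be written as follows. The function and constant names are hypothetical, and the check only mirrors the rule as stated here (a score at or below the threshold triggers an event). It is not Internet Monitor's actual implementation.

```python
# Default overall health event threshold from the text above (percent).
DEFAULT_OVERALL_THRESHOLD = 95.0


def crosses_overall_threshold(score: float,
                              threshold: float = DEFAULT_OVERALL_THRESHOLD) -> bool:
    """Return True when an overall score is at or below the threshold.

    Hypothetical helper: `score` is an overall performance or availability
    score, expressed as a percentage between 0 and 100.
    """
    return score <= threshold


print(crosses_overall_threshold(94.2))  # score is at or below 95
print(crosses_overall_threshold(97.5))  # score is above 95
```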

You can also change the local threshold (that is, the city-network threshold), which works together with a percentage of overall traffic impact to trigger a health event. By setting a threshold that creates a health event when a score drops below the threshold for one or more city-networks (locations and ASNs, typically ISPs), you can, for example, get insight into issues in locations with lower traffic.

An additional local threshold option works together with the local threshold for availability or performance scores. The second factor is the percentage of your overall traffic that must be impacted before Internet Monitor creates a health event based on the local threshold.

By configuring the threshold options for overall traffic and local traffic, you can fine-tune how frequently health events are created, to align with your application usage and your needs. Be aware that when you set the local threshold to be lower, typically more health events are created, depending on your application and the other threshold configuration values that you set.
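The way the two local settings work together can be sketched as below. This is an illustration only: the names are hypothetical, the example values are made up rather than service defaults, and the exact comparison operators that Internet Monitor uses are an assumption.

```python
from dataclasses import dataclass


@dataclass
class LocalThresholdConfig:
    """Hypothetical container for the local (city-network) settings."""
    enabled: bool
    score_threshold: float      # local score threshold, in percent
    min_traffic_impact: float   # share of overall traffic affected, in percent


def crosses_local_threshold(local_score: float,
                            traffic_impact: float,
                            config: LocalThresholdConfig) -> bool:
    """Return True when both local conditions are met.

    Assumption: an event needs the city-network score to fall below the
    local threshold AND the affected share of overall traffic to reach
    the configured minimum.
    """
    if not config.enabled:
        return False
    return (local_score < config.score_threshold
            and traffic_impact >= config.min_traffic_impact)


# Illustrative values only, not service defaults.
config = LocalThresholdConfig(enabled=True, score_threshold=60.0,
                              min_traffic_impact=0.5)
print(crosses_local_threshold(55.0, 1.2, config))  # both conditions met
print(crosses_local_threshold(55.0, 0.1, config))  # too little traffic affected
```

Keeping the two conditions separate makes it easier to see which setting suppressed an event when you tune the values.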

In summary, you can configure health event thresholds—for performance scores, availability scores, or both—in the following ways:

  • Choose different global thresholds for triggering a health event.

  • Choose different local thresholds for triggering a health event. With this option, you can also change the percentage of impact on your overall application that must be exceeded before Internet Monitor creates an event.

  • Choose to turn off triggering a health event based on local thresholds, or enable local threshold options.

You can also configure options for performance scores, availability scores, or both. You can configure a combination of the options, or just one of them.

To update thresholds and other configuration options for performance scores, availability scores, or both, do the following:

To change threshold configuration options
  1. In the Amazon Web Services Management Console, navigate to CloudWatch, and then, in the left navigation pane, choose Internet Monitor.

  2. On the Overview tab, in the Health events timeline section, choose Update thresholds.

  3. On the dialog page that opens, choose the new values and options that you want for thresholds and other options that trigger Internet Monitor to create a health event. You can do any of the following:

    • Choose a new value for Availability score threshold, Performance score threshold, or both.

      The graphs in the sections for each setting display the current threshold setting and the actual recent health event scores, for availability or performance, for your application. By viewing the typical values, you can get an idea of a suitable new value for a threshold.

      Tip: To view a larger graph and change the timeframe, choose the expander in the upper right corner of the graph.

    • Choose to turn on or off a local threshold for availability or performance, or both. When an option is enabled, you can set the threshold and impact level for when you want Internet Monitor to create a health event.

  4. After you configure threshold options, save your updates by choosing Update health event thresholds.
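The same settings can also be changed programmatically. The following is a sketch under stated assumptions, not a definitive implementation: it assumes the AWS SDK for Python (boto3) `internetmonitor` client and the `HealthEventsConfig` structure of the `UpdateMonitor` API. The helper names, the monitor name, and the threshold values are placeholders. Check the field names against the current API reference before relying on it.

```python
def build_health_events_config(availability_threshold=None,
                               performance_threshold=None,
                               availability_local=None,
                               performance_local=None):
    """Build a HealthEventsConfig dict that includes only the options given.

    Each *_local argument is either None (omit the option) or a dict such as
    {"Status": "ENABLED", "HealthScoreThreshold": 60.0, "MinTrafficImpact": 0.5}.
    Field names are assumed to match the UpdateMonitor API.
    """
    config = {}
    if availability_threshold is not None:
        config["AvailabilityScoreThreshold"] = float(availability_threshold)
    if performance_threshold is not None:
        config["PerformanceScoreThreshold"] = float(performance_threshold)
    if availability_local is not None:
        config["AvailabilityLocalHealthEventsConfig"] = dict(availability_local)
    if performance_local is not None:
        config["PerformanceLocalHealthEventsConfig"] = dict(performance_local)
    return config


def update_thresholds(monitor_name, config):
    """Send the configuration to Internet Monitor (requires AWS credentials)."""
    import boto3  # imported here so the builder works without boto3 installed

    client = boto3.client("internetmonitor")
    return client.update_monitor(MonitorName=monitor_name,
                                 HealthEventsConfig=config)


# Lower the overall availability threshold and turn off the local
# performance threshold. The values are illustrative.
config = build_health_events_config(
    availability_threshold=90,
    performance_local={"Status": "DISABLED"},
)
print(config)
# update_thresholds("my-monitor", config)  # "my-monitor" is a placeholder name
```

Separating the payload builder from the API call lets you review or log the intended change before applying it.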

To learn more about how health events work, see When Internet Monitor creates and resolves health events.

Health events table

The Health events table lists client locations that have been affected by health events, along with information about the events. The following columns are included in the table.

Client location

The location of the end users who were affected by the event, that is, who experienced increased latency or reduced availability.

To learn more about client location accuracy in Internet Monitor, see Geolocation information and accuracy in Internet Monitor.

Traffic impact

How much impact the event caused, in increased latency or reduced availability. For latency, this is the percentage by which latency increased during the event, compared to typical performance for traffic from this client location to this Amazon location over this client network.

Client network

The network that the traffic traveled over. Typically, this is the internet service provider (ISP) or Autonomous System Number (ASN) for the network traffic.

Amazon location

The Amazon location for the network traffic, which can be an Amazon Web Services Region or an internet edge location.

Impact type

The type of impact for the health event. Health events are typically caused by latency increases (performance issues) or reachability problems (availability issues).

You might also be able to choose the impact type to see the cause of the impairment. When possible, Internet Monitor analyzes the origin of a health event to determine whether it was caused by Amazon or by an ASN (internet service provider).

Note that this analysis continues after the event is resolved. Internet Monitor can update events with new information for up to an hour.

If you choose one of the client locations in the Health events table, you can see more details about the health event at that location. For example, you can see when the event started, when it ended, and the local traffic impact.
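The latency form of Traffic impact described in the table above is a percentage increase over typical latency. A minimal sketch of that arithmetic follows. The function name and the millisecond values are hypothetical, and the way Internet Monitor derives "typical" latency is not shown here.

```python
def latency_increase_percent(typical_ms: float, event_ms: float) -> float:
    """Return how much latency increased during an event, as a percentage
    of the typical latency for the same client location, client network,
    and Amazon location."""
    if typical_ms <= 0:
        raise ValueError("typical_ms must be positive")
    return (event_ms - typical_ms) / typical_ms * 100.0


# Typical round-trip latency of 80 ms rising to 120 ms during the event.
print(latency_increase_percent(80.0, 120.0))  # 50.0
```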

Network path visualization

When impairment analysis is complete, the full network path appears under Network path visualization. The full path shows each node along the network path for your application during the health event, between the Amazon location and the client, for a client-location pair.

If Internet Monitor determines the cause of an impairment, it's marked with a dashed red circle. Impairments can be caused by ASNs, typically internet service providers (ISPs), or the cause can be Amazon. If there were multiple causes for an impairment, multiple nodes are circled.