Monitoring concurrency - Amazon Lambda

Monitoring concurrency

Lambda emits Amazon CloudWatch metrics to help you monitor concurrency for your functions. This topic explains these metrics and how to interpret them.

General concurrency metrics

Use the following metrics to monitor concurrency for your Lambda functions. The granularity for each metric is 1 minute.

  • ConcurrentExecutions – The number of active concurrent invocations at a given point in time. Lambda emits this metric for all functions, versions, and aliases. For any function in the Lambda console, Lambda displays the graph for ConcurrentExecutions natively in the Monitoring tab, under Metrics. View this metric using MAX.

  • UnreservedConcurrentExecutions – The number of active concurrent invocations that are using unreserved concurrency. Lambda emits this metric across all functions in a Region. View this metric using MAX.
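As a minimal sketch, the following shows how a query for these metrics might be put together at 1-minute granularity with the MAX statistic. The helper name build_concurrency_query and the function name my-function are hypothetical. The parameter names follow the CloudWatch GetMetricStatistics API, and the AWS/Lambda namespace and FunctionName dimension are the ones Lambda metrics are normally published under. The sketch only builds the request; sending it assumes boto3 and AWS credentials are available.

```python
from datetime import datetime, timedelta, timezone


def build_concurrency_query(metric_name, function_name=None, minutes=60, now=None):
    """Build GetMetricStatistics parameters for a Lambda concurrency metric.

    function_name is optional: omit it to query across all functions in the
    Region (for example, for UnreservedConcurrentExecutions).
    """
    end = now or datetime.now(timezone.utc)
    params = {
        "Namespace": "AWS/Lambda",
        "MetricName": metric_name,
        "StartTime": end - timedelta(minutes=minutes),
        "EndTime": end,
        "Period": 60,               # 1-minute granularity
        "Statistics": ["Maximum"],  # view concurrency metrics using MAX
    }
    if function_name is not None:
        params["Dimensions"] = [{"Name": "FunctionName", "Value": function_name}]
    return params


# "my-function" is a placeholder function name.
params = build_concurrency_query("ConcurrentExecutions", "my-function")
# With boto3 installed and credentials configured (an assumption), this could be sent as:
#   boto3.client("cloudwatch").get_metric_statistics(**params)
print(params["MetricName"], params["Period"], params["Statistics"])
```

Keeping the request construction separate from the API call makes it easy to reuse the same helper for each metric in this topic by changing only the metric name.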

Provisioned concurrency metrics

Use the following metrics to monitor Lambda functions using provisioned concurrency. The granularity for each metric is 1 minute.

  • ProvisionedConcurrentExecutions – The number of execution environment instances that are actively processing an invocation on provisioned concurrency. Lambda emits this metric for each function version and alias with provisioned concurrency configured. View this metric using MAX.

ProvisionedConcurrentExecutions is not the same as the total amount of provisioned concurrency that you allocate. For example, suppose you allocate 100 units of provisioned concurrency to a function version. During any given minute, if at most 50 out of those 100 execution environments were handling invocations simultaneously, then the value of MAX(ProvisionedConcurrentExecutions) is 50.

  • ProvisionedConcurrentInvocations – The number of times Lambda invokes your function code using provisioned concurrency. Lambda emits this metric for each function version and alias with provisioned concurrency configured. View this metric using SUM.

ProvisionedConcurrentInvocations differs from ProvisionedConcurrentExecutions in that ProvisionedConcurrentInvocations counts the total number of invocations, while ProvisionedConcurrentExecutions counts the number of active environments. To understand this distinction, consider the following scenario:


[Figure: A graphic distinguishing between ProvisionedConcurrentInvocations and ProvisionedConcurrentExecutions.]

In this example, suppose that you receive 1 invocation per minute, and each invocation takes 2 minutes to complete. Each orange horizontal bar represents a single request. Suppose that you allocate 10 units of provisioned concurrency to this function, such that each request runs on provisioned concurrency.

In between minutes 0 and 1, Request 1 comes in. At minute 1, the value for MAX(ProvisionedConcurrentExecutions) is 1, since at most 1 execution environment was active during the past minute. The value for SUM(ProvisionedConcurrentInvocations) is also 1, since 1 new request came in during the past minute.

In between minutes 1 and 2, Request 2 comes in, and Request 1 continues to run. At minute 2, the value for MAX(ProvisionedConcurrentExecutions) is 2, since at most 2 execution environments were active during the past minute. However, the value for SUM(ProvisionedConcurrentInvocations) is 1, since only 1 new request came in during the past minute. This metric behavior continues until the end of the example.
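The scenario above can be reproduced with a small simulation. This is an illustrative sketch, not how Lambda computes the metrics: the function simulate_minutes is hypothetical, and it assumes each request arrives at the midpoint of its minute and runs for exactly 2 minutes.

```python
def simulate_minutes(start_times, duration, total_minutes):
    """Return one (max_executions, sum_invocations) tuple per minute.

    start_times: minute offsets at which requests arrive.
    duration: how long each request runs, in minutes.
    Entry i covers the window [i, i + 1), i.e. the value reported at minute i + 1.
    """
    results = []
    for minute in range(total_minutes):
        window_start, window_end = minute, minute + 1
        # SUM(ProvisionedConcurrentInvocations): requests that started in this window.
        invocations = sum(window_start <= s < window_end for s in start_times)
        # MAX(ProvisionedConcurrentExecutions): peak number of overlapping requests.
        # Concurrency only rises when a request starts, so sampling at the window
        # start and at each arrival inside the window finds the peak.
        sample_points = [window_start] + [
            s for s in start_times if window_start <= s < window_end
        ]
        peak = max(
            sum(s <= t < s + duration for s in start_times) for t in sample_points
        )
        results.append((peak, invocations))
    return results


# Assumed arrival times: one request in the middle of each minute.
starts = [0.5, 1.5, 2.5, 3.5]
for minute, (peak, invocations) in enumerate(simulate_minutes(starts, 2, 4), start=1):
    print(f"minute {minute}: MAX executions={peak}, SUM invocations={invocations}")
```

Under these assumptions, minute 1 reports 1 execution and 1 invocation, and every later minute reports 2 executions and 1 invocation, matching the values described above.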

  • ProvisionedConcurrencySpilloverInvocations – The number of times Lambda invokes your function on standard (reserved or unreserved) concurrency when all provisioned concurrency is in use. Lambda emits this metric for each function version and alias with provisioned concurrency configured. View this metric using SUM. The value of ProvisionedConcurrentInvocations + ProvisionedConcurrencySpilloverInvocations should be equal to the total number of function invocations (i.e. the Invocations metric).

  • ProvisionedConcurrencyUtilization – The percentage of provisioned concurrency in use (i.e. the value of ProvisionedConcurrentExecutions divided by the total amount of provisioned concurrency allocated). Lambda emits this metric for each function version and alias with provisioned concurrency configured. View this metric using MAX.

For example, suppose you allocate 100 units of provisioned concurrency to a function version. During any given minute, if at most 60 out of those 100 execution environments were handling invocations simultaneously, then the value of MAX(ProvisionedConcurrentExecutions) is 60, and the value of MAX(ProvisionedConcurrencyUtilization) is 0.6 (that is, 60%).
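The arithmetic in this example can be written out as a short sketch. The helper provisioned_utilization is a hypothetical name used only for illustration; it applies the division described above and returns the result as a fraction.

```python
def provisioned_utilization(max_executions, allocated):
    """Peak busy environments divided by allocated provisioned concurrency."""
    if allocated <= 0:
        raise ValueError("allocated provisioned concurrency must be positive")
    return max_executions / allocated


# 60 of 100 provisioned environments busy at the peak of the minute.
print(provisioned_utilization(60, 100))  # 0.6
```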

A high value for ProvisionedConcurrencySpilloverInvocations may indicate that you need to allocate additional provisioned concurrency for your function. Alternatively, you can configure Application Auto Scaling to scale provisioned concurrency automatically based on predefined thresholds.

Conversely, consistently low values for ProvisionedConcurrencyUtilization may indicate that you over-allocated provisioned concurrency for your function.
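One way to turn these two signals into a routine check is sketched below. The function review_provisioned_concurrency and both thresholds (a 5% spillover share and a 0.3 utilization ceiling) are assumptions chosen for illustration, not values recommended by Lambda. The sketch relies on the relationship stated above that ProvisionedConcurrentInvocations plus ProvisionedConcurrencySpilloverInvocations equals total invocations.

```python
def review_provisioned_concurrency(
    provisioned_invocations,
    spillover_invocations,
    utilization_series,
    max_spillover_share=0.05,  # assumed threshold, tune for your workload
    low_utilization=0.3,       # assumed threshold, tune for your workload
):
    """Classify a provisioned concurrency allocation from metric values.

    provisioned_invocations: SUM(ProvisionedConcurrentInvocations) over the period.
    spillover_invocations: SUM(ProvisionedConcurrencySpilloverInvocations) over the period.
    utilization_series: per-minute MAX(ProvisionedConcurrencyUtilization) values.
    """
    total = provisioned_invocations + spillover_invocations
    spillover_share = spillover_invocations / total if total else 0.0
    if spillover_share > max_spillover_share:
        return "under-provisioned"
    # "Consistently low" is read here as every sampled minute being below the ceiling.
    if utilization_series and max(utilization_series) < low_utilization:
        return "over-provisioned"
    return "ok"


print(review_provisioned_concurrency(900, 100, [1.0, 0.95]))
print(review_provisioned_concurrency(1000, 0, [0.1, 0.2, 0.15]))
```

Checking spillover before utilization reflects a design choice in this sketch: requests spilling over to standard concurrency is treated as the more urgent signal.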