Amazon Athena CloudWatch Metrics connector
The Amazon Athena CloudWatch Metrics connector enables Amazon Athena to query CloudWatch Metrics data with SQL.
For information on publishing query metrics to CloudWatch from Athena itself, see Use CloudWatch and EventBridge to monitor queries and control costs.
Prerequisites
Deploy the connector to your Amazon Web Services account using the Athena console or the Amazon Serverless Application Repository. For more information, see Deploy a data source connector or Use the Amazon Serverless Application Repository to deploy a data source connector.
Parameters
Use the Lambda environment variables in this section to configure the CloudWatch Metrics connector.
-
spill_bucket – Specifies the Amazon S3 bucket for data that exceeds Lambda function limits.
-
spill_prefix – (Optional) Defaults to a subfolder in the specified spill_bucket called athena-federation-spill. We recommend that you configure an Amazon S3 storage lifecycle on this location to delete spills older than a predetermined number of days or hours.
-
spill_put_request_headers – (Optional) A JSON encoded map of request headers and values for the Amazon S3 putObject request that is used for spilling (for example, {"x-amz-server-side-encryption" : "AES256"}). For other possible headers, see PutObject in the Amazon Simple Storage Service API Reference.
-
kms_key_id – (Optional) By default, any data that is spilled to Amazon S3 is encrypted using the AES-GCM authenticated encryption mode and a randomly generated key. To have your Lambda function use stronger encryption keys generated by KMS, you can specify a KMS key ID (for example, a7e63k4b-8loc-40db-a2a1-4d0en2cd8331).
-
disable_spill_encryption – (Optional) When set to True, disables spill encryption. Defaults to False so that data that is spilled to S3 is encrypted using AES-GCM, either with a randomly generated key or with keys generated by KMS. Disabling spill encryption can improve performance, especially if your spill location uses server-side encryption.
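As an illustration of how the spill parameters fit together, the following Python sketch assembles them into a map of Lambda environment variables. The helper name build_spill_environment and the bucket name in the usage note are hypothetical; only the variable names and defaults come from the list above. Lambda environment variable values are strings, so the headers map is JSON encoded and the Boolean flag is rendered as text.

```python
import json


def build_spill_environment(
    spill_bucket,
    spill_prefix="athena-federation-spill",
    put_request_headers=None,
    kms_key_id=None,
    disable_spill_encryption=False,
):
    """Return a dict of environment variables for the connector (hypothetical helper)."""
    if not spill_bucket:
        # spill_bucket is the only parameter in the list that is not optional.
        raise ValueError("spill_bucket is required")

    env = {
        "spill_bucket": spill_bucket,
        "spill_prefix": spill_prefix,
        # Rendered as the strings "True" or "False".
        "disable_spill_encryption": str(bool(disable_spill_encryption)),
    }
    if put_request_headers:
        # The connector expects a JSON encoded map of header names to values.
        env["spill_put_request_headers"] = json.dumps(put_request_headers)
    if kms_key_id:
        env["kms_key_id"] = kms_key_id
    return env
```

For example, build_spill_environment("my-spill-bucket", put_request_headers={"x-amz-server-side-encryption": "AES256"}) yields a map that leaves kms_key_id unset and keeps spill encryption enabled.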
The connector also supports AIMD (additive increase, multiplicative decrease) congestion control through the ThrottlingInvoker construct. You can tweak the default throttling behavior by setting any of the following optional environment variables:
-
throttle_initial_delay_ms – The initial call delay applied after the first congestion event. The default is 10 milliseconds.
-
throttle_max_delay_ms – The maximum delay between calls. You can derive the corresponding calls per second (TPS) by dividing 1000 ms by this value. The default is 1000 milliseconds.
-
throttle_decrease_factor – The factor by which Athena reduces the call rate. The default is 0.5.
-
throttle_increase_ms – The rate at which Athena decreases the call delay. The default is 10 milliseconds.
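To make the interaction of the four throttling variables concrete, the following Python sketch models one plausible reading of them. It is not the ThrottlingInvoker implementation: the class name AimdDelay and its methods are hypothetical, and the exact update rules are an assumption. In this reading, a congestion event divides the call rate by scaling the delay (delay / decrease factor), and each successful call shortens the delay by a fixed number of milliseconds.

```python
import math
from dataclasses import dataclass


@dataclass
class AimdDelay:
    """Hypothetical model of the delay between calls, using the documented defaults."""

    initial_delay_ms: int = 10    # throttle_initial_delay_ms
    max_delay_ms: int = 1000      # throttle_max_delay_ms
    decrease_factor: float = 0.5  # throttle_decrease_factor
    increase_ms: int = 10         # throttle_increase_ms
    delay_ms: int = 0             # no delay until the first congestion event

    def on_throttle(self) -> int:
        """Multiplicative decrease of the call rate (assumed rule)."""
        if self.delay_ms == 0:
            self.delay_ms = self.initial_delay_ms
        else:
            scaled = math.ceil(self.delay_ms / self.decrease_factor)
            self.delay_ms = min(self.max_delay_ms, scaled)
        return self.delay_ms

    def on_success(self) -> int:
        """Additive increase of the call rate: shorten the delay by a fixed step."""
        self.delay_ms = max(0, self.delay_ms - self.increase_ms)
        return self.delay_ms

    def calls_per_second(self):
        """TPS implied by the current delay, or None when calls are not delayed."""
        return None if self.delay_ms == 0 else 1000 / self.delay_ms
```

Under these assumptions, repeated congestion events with the default settings raise the delay until it reaches the 1000 ms cap, which corresponds to one call per second.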
Databases and tables
The Athena CloudWatch Metrics connector maps your namespaces, dimensions, metrics, and metric values into two tables in a single schema called default.
The metrics table
The metrics table contains the available metrics as uniquely defined by a combination of namespace, set, and name. The metrics table contains the following columns.
-
namespace – A VARCHAR containing the namespace.
-
metric_name – A VARCHAR containing the metric name.
-
dimensions – A LIST of STRUCT objects composed of dim_name (VARCHAR) and dim_value (VARCHAR).
-
statistic – A LIST of VARCHAR statistics (for example, p90, AVERAGE, ...) available for the metric.
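The nested shape of the dimensions column is easier to see with a concrete row. The following Python sketch represents one row of the metrics table as a dictionary and checks whether it carries a given dimension. The sample values (AWS/Lambda, Invocations, my-function) and the helper has_dimension are illustrative assumptions; only the column names and types come from the list above.

```python
# A hypothetical row from the metrics table, following the documented columns.
example_row = {
    "namespace": "AWS/Lambda",                   # VARCHAR
    "metric_name": "Invocations",                # VARCHAR
    "dimensions": [                              # LIST of STRUCT
        {"dim_name": "FunctionName", "dim_value": "my-function"},
    ],
    "statistic": ["AVERAGE", "p90"],             # LIST of VARCHAR
}


def has_dimension(metric_row, dim_name, dim_value=None):
    """Return True if the row has the named dimension (and value, when given)."""
    for dim in metric_row["dimensions"]:
        name_matches = dim["dim_name"] == dim_name
        value_matches = dim_value is None or dim["dim_value"] == dim_value
        if name_matches and value_matches:
            return True
    return False
```

A caller could use has_dimension(example_row, "FunctionName") to select metrics that are reported per function, regardless of the function name.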
The metric_samples table
The metric_samples table contains the available metric samples for each metric in the metrics table. The metric_samples table contains the following columns.
-
namespace – A VARCHAR that contains the namespace.
-
metric_name – A VARCHAR that contains the metric name.
-
dimensions – A LIST of STRUCT objects composed of dim_name (VARCHAR) and dim_value (VARCHAR).
-
dim_name – A VARCHAR convenience field that you can use to easily filter on a single dimension name.
-
dim_value – A VARCHAR convenience field that you can use to easily filter on a single dimension value.
-
period – An INT field that represents the "period" of the metric in seconds (for example, a 60 second metric).
-
timestamp – A BIGINT field that represents the epoch time in seconds that the metric sample is for.
-
value – A FLOAT8 field that contains the value of the sample.
-
statistic – A VARCHAR that contains the statistic type of the sample (for example, AVERAGE or p90).
Required Permissions
For full details on the IAM policies that this connector requires, review the Policies section of the athena-cloudwatch-metrics.yaml file. The following list summarizes the required permissions.
-
Amazon S3 write access – The connector requires write access to a location in Amazon S3 in order to spill results from large queries.
-
Athena GetQueryExecution – The connector uses this permission to fast-fail when the upstream Athena query has terminated.
-
CloudWatch Metrics ReadOnly – The connector uses this permission to query your metrics data.
-
CloudWatch Logs Write – The connector uses this access to write its diagnostic logs.
Performance
The Athena CloudWatch Metrics connector attempts to optimize queries against CloudWatch Metrics by parallelizing scans of the metrics required for your query. For certain time period, metric, namespace, and dimension filters, predicate pushdown is performed both within the Lambda function and within CloudWatch Metrics.
License information
The Amazon Athena CloudWatch Metrics connector project is licensed under the Apache-2.0 License.
Additional resources
For additional information about this connector, visit the corresponding site.