For capabilities similar to Amazon Timestream for LiveAnalytics, consider Amazon Timestream for InfluxDB. It offers simplified data ingestion and single-digit millisecond query response times for real-time analytics.
Time series views
Timestream for LiveAnalytics supports the following functions for transforming your data into the timeseries data type:
CREATE_TIME_SERIES
CREATE_TIME_SERIES is an aggregation function that takes all the raw measurements of a time series (time and measure values) and returns a timeseries data type. The syntax of this function is:
CREATE_TIME_SERIES(time, measure_value::<data_type>)
where <data_type> is the data type of the measure value and can be one of bigint, boolean, double, or varchar. The second parameter cannot be null.
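If a table can also contain rows where the double measure value is null (for example, rows that store measures of a different data type), one way to respect this constraint is to filter those rows out before aggregating. The following is a minimal sketch against the metrics table used in the example below; the cpu_series alias is illustrative.
-- Build one series per instance, excluding rows whose double measure value is null,
-- since the second argument of CREATE_TIME_SERIES cannot be null.
SELECT instance_id,
    CREATE_TIME_SERIES(time, measure_value::double) AS cpu_series
FROM metrics
WHERE measure_name = 'cpu_utilization'
    AND measure_value::double IS NOT NULL
GROUP BY instance_id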
Consider the CPU utilization of EC2 instances stored in a table named metrics, as shown below:
| time | region | az | vpc | instance_id | measure_name | measure_value::double |
|---|---|---|---|---|---|---|
| 2019-12-04 19:00:00.000000000 | us-east-1 | us-east-1d | vpc-1a2b3c4d | i-1234567890abcdef0 | cpu_utilization | 35.0 |
| 2019-12-04 19:00:01.000000000 | us-east-1 | us-east-1d | vpc-1a2b3c4d | i-1234567890abcdef0 | cpu_utilization | 38.2 |
| 2019-12-04 19:00:02.000000000 | us-east-1 | us-east-1d | vpc-1a2b3c4d | i-1234567890abcdef0 | cpu_utilization | 45.3 |
| 2019-12-04 19:00:00.000000000 | us-east-1 | us-east-1d | vpc-1a2b3c4d | i-1234567890abcdef1 | cpu_utilization | 54.1 |
| 2019-12-04 19:00:01.000000000 | us-east-1 | us-east-1d | vpc-1a2b3c4d | i-1234567890abcdef1 | cpu_utilization | 42.5 |
| 2019-12-04 19:00:02.000000000 | us-east-1 | us-east-1d | vpc-1a2b3c4d | i-1234567890abcdef1 | cpu_utilization | 33.7 |
Running the query:
SELECT region, az, vpc, instance_id, CREATE_TIME_SERIES(time, measure_value::double) AS cpu_utilization
FROM metrics
WHERE measure_name = 'cpu_utilization'
GROUP BY region, az, vpc, instance_id
returns all series with cpu_utilization as the measure. In this case, there are two series:
| region | az | vpc | instance_id | cpu_utilization |
|---|---|---|---|---|
| us-east-1 | us-east-1d | vpc-1a2b3c4d | i-1234567890abcdef0 | [{time: 2019-12-04 19:00:00.000000000, measure_value::double: 35.0}, {time: 2019-12-04 19:00:01.000000000, measure_value::double: 38.2}, {time: 2019-12-04 19:00:02.000000000, measure_value::double: 45.3}] |
| us-east-1 | us-east-1d | vpc-1a2b3c4d | i-1234567890abcdef1 | [{time: 2019-12-04 19:00:00.000000000, measure_value::double: 54.1}, {time: 2019-12-04 19:00:01.000000000, measure_value::double: 42.5}, {time: 2019-12-04 19:00:02.000000000, measure_value::double: 33.7}] |
UNNEST
UNNEST is a table function that lets you transform the timeseries data type into the flat model. The syntax is as follows:
UNNEST(timeseries)
UNNEST transforms a timeseries into two columns, namely time and value. You can also use aliases with UNNEST, as shown below:
UNNEST(timeseries) AS <alias_name> (time_alias, value_alias)
where <alias_name> is the alias for the flat table, time_alias is the alias for the time column, and value_alias is the alias for the value column.
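Before the fuller example below, here is a minimal sketch that flattens the cpu_utilization series built earlier back into rows. The CTE name cpu_series_by_instance and the column aliases sample_time and cpu are illustrative, not part of the schema.
-- Flatten each per-instance series back into one row per (time, value) pair,
-- aliasing the flattened table as t with columns sample_time and cpu.
WITH cpu_series_by_instance AS (
    SELECT instance_id,
        CREATE_TIME_SERIES(time, measure_value::double) AS cpu_series
    FROM metrics
    WHERE measure_name = 'cpu_utilization'
    GROUP BY instance_id
)
SELECT instance_id, t.sample_time, t.cpu
FROM cpu_series_by_instance
CROSS JOIN UNNEST(cpu_series) AS t (sample_time, cpu)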
For example, suppose some of the EC2 instances in your fleet are configured to publish metrics at a 5-second interval while others publish at a 15-second interval, and you want the average metrics across all instances at a 10-second granularity for the past 6 hours. To get this data, use CREATE_TIME_SERIES to transform the metrics into the time series model. Then use INTERPOLATE_LINEAR to fill in the missing values at the 10-second granularity. Next, use UNNEST to transform the data back into the flat model, and use AVG to compute the average metrics across all instances.
WITH interpolated_timeseries AS (
    SELECT region, az, vpc, instance_id,
        INTERPOLATE_LINEAR(
            CREATE_TIME_SERIES(time, measure_value::double),
            SEQUENCE(ago(6h), now(), 10s)) AS interpolated_cpu_utilization
    FROM timestreamdb.metrics
    WHERE measure_name = 'cpu_utilization'
        AND time >= ago(6h)
    GROUP BY region, az, vpc, instance_id
)
SELECT region, az, vpc, instance_id, avg(t.cpu_util)
FROM interpolated_timeseries
CROSS JOIN UNNEST(interpolated_cpu_utilization) AS t (time, cpu_util)
GROUP BY region, az, vpc, instance_id
The query above demonstrates using UNNEST with aliases. Below is the same query without using aliases for UNNEST:
WITH interpolated_timeseries AS (
    SELECT region, az, vpc, instance_id,
        INTERPOLATE_LINEAR(
            CREATE_TIME_SERIES(time, measure_value::double),
            SEQUENCE(ago(6h), now(), 10s)) AS interpolated_cpu_utilization
    FROM timestreamdb.metrics
    WHERE measure_name = 'cpu_utilization'
        AND time >= ago(6h)
    GROUP BY region, az, vpc, instance_id
)
SELECT region, az, vpc, instance_id, avg(value)
FROM interpolated_timeseries
CROSS JOIN UNNEST(interpolated_cpu_utilization)
GROUP BY region, az, vpc, instance_id