
Anomaly detection with Amazon OpenSearch Ingestion

You can use Amazon OpenSearch Ingestion to train models and generate anomalies in near real time on time-series aggregated events. You can generate anomalies either on events generated within the pipeline itself, or on events coming directly into the pipeline, such as OpenTelemetry metrics.

You can feed these tumbling-window aggregated time-series events to the anomaly detector processor, which trains models, generates anomalies, and assigns them a grade score. You can then write the anomalies to a separate index to create document monitors and trigger fast alerting.
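As a rough sketch of what lands in that separate index, the anomaly detector processor's random_cut_forest mode attaches deviation and grade values to the events it flags. The exact field set below is an assumption for illustration:

{
  "bytes": 19858503,
  "deviation_from_expected": [12428433.0],
  "grade": 1.0
}

A grade near 1.0 indicates a strong anomaly; a document monitor on this index can alert whenever such documents appear.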

In addition to these examples, you can also use the log-to-metric anomaly pipeline and trace-to-metric anomaly pipeline blueprints. For more information about blueprints, see Using blueprints to create a pipeline.

Metrics from logs

The following pipeline receives logs over an HTTP source (for example, from FluentBit), extracts important values from the logs by matching the value in the log key against the grok common Apache log pattern, and then forwards the grokked logs both to the log-to-metrics-pipeline sub-pipeline and to an OpenSearch index named logs.
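As a point of reference, a Fluent Bit output section targeting this pipeline's HTTP source might look roughly like the following sketch. The endpoint host is a placeholder, and the aws_* signing options are assumptions to adapt to your Region and setup:

[OUTPUT]
    Name          http
    Match         *
    Host          <your-pipeline-endpoint>.us-east-1.osis.amazonaws.com
    Port          443
    URI           /apache-log-pipeline-with-metrics/logs
    Format        json
    tls           On
    # SigV4 signing for OpenSearch Ingestion; Region and service values are assumptions
    aws_auth      On
    aws_region    us-east-1
    aws_service   osis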

The log-to-metrics-pipeline sub-pipeline receives the grokked logs from the apache-log-pipeline-with-metrics sub-pipeline, aggregates them, and derives histogram metrics based on the values in the clientip and request keys. It then sends the histogram metrics both to an OpenSearch index named histogram_metrics and to the log-to-metrics-anomaly-detector sub-pipeline.
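For intuition, an aggregated histogram event emitted for one 30-second window might look something like the following. The shape is an illustrative assumption (field names can vary by aggregate processor version); the point is that each event carries aggregated statistics for the bytes values of one clientip/request group:

{
  "clientip": "192.0.2.10",
  "request": "/index.html",
  "min": 4325.0,
  "max": 24000531.0,
  "sum": 501776890.0,
  "count": 247,
  "startTime": "2023-05-01T18:00:00Z"
}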

The log-to-metrics-anomaly-detector-pipeline sub-pipeline receives the aggregated histogram metrics from the log-to-metrics-pipeline sub-pipeline and sends them to the anomaly detector processor, which detects anomalies using the Random Cut Forest algorithm. If it detects anomalies, it sends them to an OpenSearch index named log-metric-anomalies.

version: "2"
apache-log-pipeline-with-metrics:
  source:
    http:
      # Provide the path for ingestion. ${pipelineName} will be replaced with pipeline name configured for this pipeline.
      # In this case it would be "/apache-log-pipeline-with-metrics/logs". This will be the FluentBit output URI value.
      path: "/${pipelineName}/logs"
  processor:
    - grok:
        match:
          log: [ "%{COMMONAPACHELOG_DATATYPED}" ]
  sink:
    - opensearch:
        ...
        index: "logs"
    - pipeline:
        name: "log-to-metrics-pipeline"

log-to-metrics-pipeline:
  source:
    pipeline:
      name: "apache-log-pipeline-with-metrics"
  processor:
    - aggregate:
        # Specify the required identification keys
        identification_keys: ["clientip", "request"]
        action:
          histogram:
            # Specify the appropriate values for each of the following fields
            key: "bytes"
            record_minmax: true
            units: "bytes"
            buckets: [0, 25000000, 50000000, 75000000, 100000000]
        # Pick the required aggregation period
        group_duration: "30s"
  sink:
    - opensearch:
        ...
        index: "histogram_metrics"
    - pipeline:
        name: "log-to-metrics-anomaly-detector-pipeline"

log-to-metrics-anomaly-detector-pipeline:
  source:
    pipeline:
      name: "log-to-metrics-pipeline"
  processor:
    - anomaly_detector:
        # Specify the key on which to run anomaly detection
        keys: [ "bytes" ]
        mode:
          random_cut_forest:
  sink:
    - opensearch:
        ...
        index: "log-metric-anomalies"
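To smoke-test the HTTP source, you can send a sample Apache log line to the ingestion path with a SigV4-capable client such as awscurl. The endpoint host below is a placeholder for your pipeline's ingestion URL:

awscurl --service osis --region us-east-1 \
    -X POST \
    -H "Content-Type: application/json" \
    -d '[{"log": "127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] \"GET /apache_pb.gif HTTP/1.0\" 200 2326"}]' \
    https://<your-pipeline-endpoint>.us-east-1.osis.amazonaws.com/apache-log-pipeline-with-metrics/logs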

Metrics from traces

You can derive metrics from traces and find anomalies in those generated metrics. In this example, the entry-pipeline sub-pipeline receives trace data from the OpenTelemetry Collector (a collector-side exporter sketch follows the pipeline definition below) and forwards it to the following sub-pipelines:

  • span-pipeline – Extracts the raw spans from the trace data. It sends the raw spans to any OpenSearch index prefixed with otel-v1-apm-span.

  • service-map-pipeline – Performs aggregation and analysis to create documents that represent connections between services. It sends these documents to an OpenSearch index named otel-v1-apm-service-map. You can then view a visualization of the service map through the Trace Analytics plugin for OpenSearch Dashboards.

  • trace-to-metrics-pipeline – Aggregates the trace data by serviceName value and derives histogram metrics from it. It then sends the derived metrics both to an OpenSearch index named metrics_for_traces and to the trace-to-metrics-anomaly-detector-pipeline sub-pipeline.

The trace-to-metrics-anomaly-detector-pipeline sub-pipeline receives the aggregated histogram metrics from trace-to-metrics-pipeline and sends them to the anomaly detector processor, which detects anomalies using the Random Cut Forest algorithm. If it detects any anomalies, it sends them to an OpenSearch index named trace-metric-anomalies.

version: "2"
entry-pipeline:
  source:
    otel_trace_source:
      # Provide the path for ingestion. ${pipelineName} will be replaced with pipeline name configured for this pipeline.
      # In this case it would be "/entry-pipeline/v1/traces". This will be endpoint URI path in OpenTelemetry Exporter
      # configuration.
      # path: "/${pipelineName}/v1/traces"
  processor:
    - trace_peer_forwarder:
  sink:
    - pipeline:
        name: "span-pipeline"
    - pipeline:
        name: "service-map-pipeline"
    - pipeline:
        name: "trace-to-metrics-pipeline"

span-pipeline:
  source:
    pipeline:
      name: "entry-pipeline"
  processor:
    - otel_trace_raw:
  sink:
    - opensearch:
        ...
        index_type: "trace-analytics-raw"

service-map-pipeline:
  source:
    pipeline:
      name: "entry-pipeline"
  processor:
    - service_map:
  sink:
    - opensearch:
        ...
        index_type: "trace-analytics-service-map"

trace-to-metrics-pipeline:
  source:
    pipeline:
      name: "entry-pipeline"
  processor:
    - aggregate:
        # Pick the required identification keys
        identification_keys: ["serviceName"]
        action:
          histogram:
            # Pick the appropriate values for each of the following fields
            key: "durationInNanos"
            record_minmax: true
            units: "seconds"
            buckets: [0, 10000000, 50000000, 100000000]
        # Pick the required aggregation period
        group_duration: "30s"
  sink:
    - opensearch:
        ...
        index: "metrics_for_traces"
    - pipeline:
        name: "trace-to-metrics-anomaly-detector-pipeline"

trace-to-metrics-anomaly-detector-pipeline:
  source:
    pipeline:
      name: "trace-to-metrics-pipeline"
  processor:
    - anomaly_detector:
        # Below Key will find anomalies in the max value of histogram generated for durationInNanos.
        keys: [ "max" ]
        mode:
          random_cut_forest:
  sink:
    - opensearch:
        ...
        index: "trace-metric-anomalies"
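On the sending side, an OpenTelemetry Collector can be pointed at the entry pipeline's trace path using the otlphttp exporter together with the sigv4auth extension. The following is a minimal sketch, assuming an OTLP receiver and a placeholder endpoint host; adjust the Region and endpoint for your pipeline:

receivers:
  otlp:
    protocols:
      grpc:

extensions:
  sigv4auth:
    region: "us-east-1"
    service: "osis"

exporters:
  otlphttp:
    # Matches the (commented-out) path in otel_trace_source above
    traces_endpoint: "https://<your-pipeline-endpoint>.us-east-1.osis.amazonaws.com/entry-pipeline/v1/traces"
    auth:
      authenticator: sigv4auth

service:
  extensions: [sigv4auth]
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlphttp]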

OpenTelemetry metrics

You can create a pipeline that receives OpenTelemetry metrics and detects anomalies in those metrics. In this example, entry-pipeline receives metrics data from the OpenTelemetry Collector. If a metric's kind is GAUGE and its name is totalApiBytesSent, the pipeline routes it to the ad-pipeline sub-pipeline.

The ad-pipeline sub-pipeline receives the metrics data from the entry pipeline and performs anomaly detection on the metric's value using the anomaly detector processor.
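Judging from the route condition and the keys setting in the pipeline below, the metric events are expected to carry kind, name, and value fields. An illustrative event (the exact shape is an assumption) might look like:

{
  "kind": "GAUGE",
  "name": "totalApiBytesSent",
  "value": 1048576.0,
  "time": "2023-05-01T18:00:00Z"
}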

entry-pipeline:
  source:
    otel_metrics_source:
  processor:
    - otel_metrics:
  route:
    - gauge_route: '/kind = "GAUGE" and /name = "totalApiBytesSent"'
  sink:
    - pipeline:
        name: "ad-pipeline"
      routes:
        - gauge_route
    - opensearch:
        ...
        index: "otel-metrics"

ad-pipeline:
  source:
    pipeline:
      name: "entry-pipeline"
  processor:
    - anomaly_detector:
        # Use "value" as the key on which anomaly detector needs to be run
        keys: [ "value" ]
        mode:
          random_cut_forest:
  sink:
    - opensearch:
        ...
        index: otel-metrics-anomalies
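The collector-side setup mirrors the trace example earlier: swap the otlphttp exporter's traces_endpoint for a metrics_endpoint and attach it to a metrics pipeline. The path below is an assumption that depends on how otel_metrics_source is configured; confirm it against your pipeline's ingestion URL:

exporters:
  otlphttp:
    # Hypothetical ingestion path; verify against your otel_metrics_source configuration
    metrics_endpoint: "https://<your-pipeline-endpoint>.us-east-1.osis.amazonaws.com/entry-pipeline/v1/metrics"
    auth:
      authenticator: sigv4auth

service:
  extensions: [sigv4auth]
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [otlphttp]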

In addition to this example, you can also use the trace-to-metric anomaly pipeline blueprint. For more information about blueprints, see Using blueprints to create a pipeline.