
Anomaly detection with Amazon OpenSearch Ingestion

You can use Amazon OpenSearch Ingestion to train models and generate anomalies in near real-time on time series aggregated events. You can generate anomalies either on events generated within the pipeline, or on events coming directly into the pipeline, like OpenTelemetry metrics.

You can provide these tumbling window aggregated time series events to the anomaly detector processor, which trains a model and generates anomalies with a grade score. You can then write the anomalies to a separate index to create document monitors and trigger fast alerting.
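
For orientation, an event emitted by the anomaly detector processor carries the original keys plus the anomaly scores. The following is a minimal sketch, assuming the output fields documented for the open-source Data Prepper anomaly_detector processor (deviation_from_expected and grade; exact names can vary by version) and hypothetical values:

{
  "bytes": 26512915,
  "deviation_from_expected": [21004485.0],
  "grade": 1.0
}

A nonzero grade indicates an anomaly, which is what a document monitor on the anomalies index would typically alert on.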

In addition to the following examples, you can also use the Log to metric anomaly pipeline and Trace to metric anomaly pipeline blueprints. For more information about blueprints, see Using blueprints to create a pipeline.

Metrics from logs

The following pipeline receives logs via an HTTP source like FluentBit, extracts important values from the logs by matching the value in the log key against the grok common Apache log pattern, and then forwards the grokked logs to both the log-to-metrics-pipeline sub-pipeline and an OpenSearch index named logs.
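
To illustrate, here is a hypothetical request body as FluentBit might POST it to the /apache-log-pipeline-with-metrics/logs path, followed by the event after grok matching. The extracted field names follow the standard COMMONAPACHELOG pattern (the _DATATYPED variant additionally casts numeric fields such as response and bytes to numbers); the sample values are made up:

# Hypothetical HTTP source payload
[{ "log": "127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] \"GET /apache_pb.gif HTTP/1.0\" 200 2326" }]

# Event after the grok processor (illustrative)
{
  "log": "127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] \"GET /apache_pb.gif HTTP/1.0\" 200 2326",
  "clientip": "127.0.0.1",
  "ident": "-",
  "auth": "frank",
  "timestamp": "10/Oct/2000:13:55:36 -0700",
  "verb": "GET",
  "request": "/apache_pb.gif",
  "httpversion": "1.0",
  "response": 200,
  "bytes": 2326
}

The clientip and request fields extracted here are what the next sub-pipeline uses as its aggregation keys.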

The log-to-metrics-pipeline sub-pipeline receives the grokked logs from the apache-log-pipeline-with-metrics sub-pipeline, aggregates them, and derives histogram metrics based on the values in the clientip and request keys. It then sends the histogram metrics to both an OpenSearch index named histogram_metrics and the log-to-metrics-anomaly-detector-pipeline sub-pipeline.

The log-to-metrics-anomaly-detector-pipeline sub-pipeline receives the aggregated histogram metrics from the log-to-metrics-pipeline sub-pipeline and sends them to the anomaly detector processor to detect anomalies using the Random Cut Forest algorithm. If it detects anomalies, it sends them to an OpenSearch index named log-metric-anomalies.

version: "2" apache-log-pipeline-with-metrics: source: http: # Provide the path for ingestion. ${pipelineName} will be replaced with pipeline name configured for this pipeline. # In this case it would be "/apache-log-pipeline-with-metrics/logs". This will be the FluentBit output URI value. path: "/${pipelineName}/logs" processor: - grok: match: log: [ "%{COMMONAPACHELOG_DATATYPED}" ] sink: - opensearch: ... index: "logs" - pipeline: name: "log-to-metrics-pipeline" log-to-metrics-pipeline: source: pipeline: name: "apache-log-pipeline-with-metrics" processor: - aggregate: # Specify the required identification keys identification_keys: ["clientip", "request"] action: histogram: # Specify the appropriate values for each the following fields key: "bytes" record_minmax: true units: "bytes" buckets: [0, 25000000, 50000000, 75000000, 100000000] # Pick the required aggregation period group_duration: "30s" sink: - opensearch: ... index: "histogram_metrics" - pipeline: name: "log-to-metrics-anomaly-detector-pipeline" log-to-metrics-anomaly-detector-pipeline: source: pipeline: name: "log-to-metrics-pipeline" processor: - anomaly_detector: # Specify the key on which to run anomaly detection keys: [ "bytes" ] mode: random_cut_forest: sink: - opensearch: ... index: "log-metric-anomalies"

Metrics from traces

You can derive metrics from traces and find anomalies in those generated metrics. In this example, the entry-pipeline sub-pipeline receives trace data from the OpenTelemetry Collector and forwards it to the following sub-pipelines:

  • span-pipeline – Extracts the raw spans from the traces. It sends the raw spans to any OpenSearch index prefixed with otel-v1-apm-span.

  • service-map-pipeline – Performs aggregation and analysis to create documents that represent connections between services. It sends these documents to an OpenSearch index named otel-v1-apm-service-map. You can then view a visualization of the service map through the Trace Analytics plugin for OpenSearch Dashboards.

  • trace-to-metrics-pipeline – Aggregates and derives histogram metrics from the traces based on the value of serviceName. It then sends the derived metrics to both an OpenSearch index named metrics_for_traces and the trace-to-metrics-anomaly-detector-pipeline sub-pipeline.

The trace-to-metrics-anomaly-detector-pipeline sub-pipeline receives the aggregated histogram metrics from trace-to-metrics-pipeline and sends them to the anomaly detector processor to detect anomalies using the Random Cut Forest algorithm. If it detects any anomalies, it sends them to an OpenSearch index named trace-metric-anomalies.

version: "2" entry-pipeline: source: otel_trace_source: # Provide the path for ingestion. ${pipelineName} will be replaced with pipeline name configured for this pipeline. # In this case it would be "/entry-pipeline/v1/traces". This will be endpoint URI path in OpenTelemetry Exporter # configuration. # path: "/${pipelineName}/v1/traces" processor: - trace_peer_forwarder: sink: - pipeline: name: "span-pipeline" - pipeline: name: "service-map-pipeline" - pipeline: name: "trace-to-metrics-pipeline" span-pipeline: source: pipeline: name: "entry-pipeline" processor: - otel_trace_raw: sink: - opensearch: ... index_type: "trace-analytics-raw" service-map-pipeline: source: pipeline: name: "entry-pipeline" processor: - service_map: sink: - opensearch: ... index_type: "trace-analytics-service-map" trace-to-metrics-pipeline: source: pipeline: name: "entry-pipeline" processor: - aggregate: # Pick the required identification keys identification_keys: ["serviceName"] action: histogram: # Pick the appropriate values for each the following fields key: "durationInNanos" record_minmax: true units: "seconds" buckets: [0, 10000000, 50000000, 100000000] # Pick the required aggregation period group_duration: "30s" sink: - opensearch: ... index: "metrics_for_traces" - pipeline: name: "trace-to-metrics-anomaly-detector-pipeline" trace-to-metrics-anomaly-detector-pipeline: source: pipeline: name: "trace-to-metrics-pipeline" processor: - anomaly_detector: # Below Key will find anomalies in the max value of histogram generated for durationInNanos. keys: [ "max" ] mode: random_cut_forest: sink: - opensearch: ... index: "trace-metric-anomalies"

OpenTelemetry metrics

You can create a pipeline that receives OpenTelemetry metrics and detects anomalies in those metrics. In this example, entry-pipeline receives metrics data from the OpenTelemetry Collector. If a metric has a kind of GAUGE and a name of totalApiBytesSent, the gauge_route route sends it to the ad-pipeline sub-pipeline.

The ad-pipeline sub-pipeline receives the metrics data from the entry pipeline and uses the anomaly detector processor to perform anomaly detection on the metric value.
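
For reference, the route expression and the detector key both read fields from the flattened metric events that the otel_metrics processor produces. A hypothetical gauge event, showing only the fields this example relies on:

# Hypothetical flattened metric event; kind and name drive gauge_route,
# and value is the key the anomaly detector runs on.
{ "kind": "GAUGE", "name": "totalApiBytesSent", "value": 1048576.0 }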

entry-pipeline:
  source:
    otel_metrics_source:
  processor:
    - otel_metrics:
  route:
    - gauge_route: '/kind = "GAUGE" and /name = "totalApiBytesSent"'
  sink:
    - pipeline:
        name: "ad-pipeline"
        routes:
          - gauge_route
    - opensearch:
        ...
        index: "otel-metrics"

ad-pipeline:
  source:
    pipeline:
      name: "entry-pipeline"
  processor:
    - anomaly_detector:
        # Use "value" as the key on which anomaly detector needs to be run
        keys: [ "value" ]
        mode:
          random_cut_forest:
  sink:
    - opensearch:
        ...
        index: otel-metrics-anomalies

In addition to this example, you can also use the Trace to metric anomaly pipeline blueprint. For more information about blueprints, see Using blueprints to create a pipeline.