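Supported PPL commands

The following tables show which PPL commands OpenSearch Dashboards supports for querying CloudWatch Logs, Amazon S3, or Security Lake, and which commands CloudWatch Logs Insights supports. CloudWatch Logs Insights uses the same PPL syntax as OpenSearch Dashboards when querying CloudWatch Logs, and the tables refer to both as CloudWatch Logs.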

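Note

When you analyze data outside of OpenSearch Service, commands might execute differently than they do on OpenSearch indexes.

Topics

Commands
Functions
Additional information for CloudWatch Logs Insights users using OpenSearch PPL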

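Commands

PPL command	Description	CloudWatch Logs	Amazon S3	Security Lake	Example command
fields command	Displays a set of fields that needs projection.	Supported	Supported	Supported	`fields field1, field2`
where command	Filters the data based on the conditions that you specify.	Supported	Supported	Supported	`where field1="success" \| where field2 != "i-023fe0a90929d8822" \| fields field3, col4, col5, col6 \| head 1000`
stats command	Performs aggregations and calculations.	Supported	Supported	Supported	stats count(), count(`field1`), min(`field1`), max(`field1`), avg(`field1`) by field2 \| head 1000
parse command	Extracts a regular expression (regex) pattern from a string and displays the extracted pattern. The extracted pattern can be further used to create new fields or filter data.	Supported	Supported	Supported	parse `field1` ".*/(?<field2>[^/]+$)" \| where field2 = "requestId" \| fields field2, `field2` \| head 1000
patterns command	Extracts log patterns from a text field and appends the results to the search result. Grouping logs by their patterns makes it easier to aggregate statistics from large volumes of log data for analysis and troubleshooting.	Not supported	Supported	Supported	`patterns new_field='no_numbers' pattern='[0-9]' message \| fields message, no_numbers`
sort command	Sorts the displayed results by a field name. Use `sort -FieldName` to sort in descending order.	Supported	Supported	Supported	stats count(), count(`field1`), min(`field1`) as field1Alias, max(`field1`), avg(`field1`) by field2 \| sort -field1Alias \| head 1000
eval command	Modifies or processes the value of a field and stores it in a different field. This is useful to mathematically modify a column, apply string functions to a column, or apply date functions to a column.	Supported	Supported	Supported	eval field2 = `field1` * 2 \| fields field1, field2 \| head 20
rename command	Renames one or more fields in the search result.	Supported	Supported	Supported	`rename field2 as field1 \| fields field1`
head command	Limits the displayed query results to the first N rows.	Supported	Supported	Supported	fields `@message` \| head 20
grok command	Parses a text field with a grok pattern, based on regular expressions, and appends the results to the search result.	Supported	Supported	Supported	`grok email '.+@%{HOSTNAME:host}' \| fields email`
top command	Finds the most frequent values for a field.	Supported	Supported	Supported	`top 2 Field1 by Field2`
dedup command	Removes duplicate entries based on the fields that you specify.	Supported	Supported	Supported	`dedup field1 \| fields field1, field2, field3`
join command	Joins two datasets together.	Supported	Supported	Supported	`source=customer \| join ON c_custkey = o_custkey orders \| head 10`
lookup command	Enriches your search data by adding or replacing data from a lookup index (dimension table). You can extend the fields of an index with values from a dimension table, and append or replace values when the lookup condition is matched.	Not supported	Supported	Supported	`where orderType = 'Cancelled' \| lookup account_list, mkt_id AS mkt_code replace amount, account_name as name \| stats count(mkt_code), avg(amount) by name`
subquery command	Performs complex, nested queries within your Piped Processing Language (PPL) statements.	Supported	Supported	Supported	`where id in [ subquery source=users \| where user in [ subquery source=actions \| where action="login" \| fields user ] \| fields uid ]`
rare command	Finds the least frequent values of all fields in the field list.	Supported	Supported	Supported	`rare Field1 by Field2`
trendline command	Calculates the moving averages of fields.	Supported	Supported	Supported	`trendline sma(2, field1) as field1Alias`
eventstats command	Enriches your event data with calculated summary statistics. It analyzes specified fields within your events, computes various statistical measures, and then appends these results to each original event as new fields.	Supported (except `count()`)	Supported	Supported	`eventstats sum(field1) by field2`
flatten command	Flattens a field. The field must be of this type: `struct<?,?>` or `array<struct<?,?>>`	Supported	Supported	Supported	`source=table \| flatten field1`
fieldsummary command	Calculates basic statistics for each field (count, distinct count, min, max, avg, stddev, and mean).	Supported (one field per query)	Supported	Supported	`where field1 != 200 \| fieldsummary includefields=field1 nulls=true`
fillnull command	Fills null fields with the value that you provide. It can be used in one or more fields.	Supported	Supported	Supported	`fields field1 \| eval field2=field1 \| fillnull value=0 field1`
expand command	Breaks down a field containing multiple values into separate rows, creating a new row for each value in the specified field.	Supported	Supported	Supported	`expand employee \| stats max(salary) as max by state, company`
describe command	Gets detailed information about the structure and metadata of tables, schemas, and catalogs.	Not supported	Supported	Supported	`describe schema.table`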

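Functions

PPL function	Description	CloudWatch Logs	Amazon S3	Security Lake	Example command
PPL string functions (`CONCAT`, `CONCAT_WS`, `LENGTH`, `LOWER`, `LTRIM`, `POSITION`, `REVERSE`, `RIGHT`, `RTRIM`, `SUBSTRING`, `TRIM`, `UPPER`)	Built-in functions in PPL that can manipulate and transform string and text data within PPL queries. For example, converting case, combining strings, extracting parts, and cleaning text.	Supported	Supported	Supported	`eval col1Len = LENGTH(col1) \| fields col1Len`
PPL date and time functions (`DAY`, `DAYOFMONTH`, `DAY_OF_MONTH`, `DAYOFWEEK`, `DAY_OF_WEEK`, `DAYOFYEAR`, `DAY_OF_YEAR`, `DAYNAME`, `FROM_UNIXTIME`, `HOUR`, `HOUR_OF_DAY`, `LAST_DAY`, `LOCALTIMESTAMP`, `LOCALTIME`, `MAKE_DATE`, `MINUTE`, `MINUTE_OF_HOUR`, `MONTH`, `MONTHNAME`, `MONTH_OF_YEAR`, `NOW`, `QUARTER`, `SECOND`, `SECOND_OF_MINUTE`, `SUBDATE`, `SYSDATE`, `TIMESTAMP`, `UNIX_TIMESTAMP`, `WEEK`, `WEEKDAY`, `WEEK_OF_YEAR`, `DATE_ADD`, `DATE_SUB`, `TIMESTAMPADD`, `TIMESTAMPDIFF`, `UTC_TIMESTAMP`, `CURRENT_TIMEZONE`)	Built-in functions for handling and transforming date and timestamp data in PPL queries. For example, date_add, date_format, datediff, and current_date.	Supported	Supported	Supported	`eval newDate = ADDDATE(DATE('2020-08-26'), 1) \| fields newDate`
PPL condition functions (`EXISTS`, `IF`, `IFNULL`, `ISNOTNULL`, `ISNULL`, `NULLIF`)	Built-in functions that test a condition and return a value based on the result. For example, checking whether a field is null or choosing between two values.	Supported	Supported	Supported	`eval field2 = isnull(col1) \| fields field2, col1, field3`
PPL mathematical functions (`ABS`, `ACOS`, `ASIN`, `ATAN`, `ATAN2`, `CEIL`, `CEILING`, `CONV`, `COS`, `COT`, `CRC32`, `DEGREES`, `E`, `EXP`, `FLOOR`, `LN`, `LOG`, `LOG2`, `LOG10`, `MOD`, `PI`, `POW`, `POWER`, `RADIANS`, `RAND`, `ROUND`, `SIGN`, `SIN`, `SQRT`, `CBRT`)	Built-in functions for performing mathematical calculations and transformations in PPL queries. For example: abs (absolute value), round (rounds numbers), sqrt (square root), pow (power calculation), and ceil (rounds up to the nearest integer).	Supported	Supported	Supported	`eval field2 = ACOS(col1) \| fields col1`
PPL expressions (arithmetic operators (`+`, `-`, `*`), predicate operators (`>`, `<`, `IN`))	Built-in functions for expressions, particularly value expressions, that return a scalar value. Expressions have different types and forms.	Supported	Supported	Supported	`where age > (25 + 5) \| fields age`
PPL IP address functions (`CIDRMATCH`)	Built-in functions for handling IP addresses, such as CIDR matching.	Supported	Supported	Supported	`where cidrmatch(ip, '***********/24') \| fields ip`
PPL JSON functions (`ARRAY_LENGTH`, `JSON`, `JSON_ARRAY`, `JSON_EXTRACT`, `JSON_KEYS`, `JSON_OBJECT`, `JSON_VALID`, `TO_JSON_STRING`)	Built-in functions for handling JSON, including arrays, extraction, and validation.	Supported	Supported	Supported	eval `json_extract('{"a":"b"}', '$.a')` = json_extract('{"a":"b"}', '$.a')
PPL Lambda functions (`EXISTS`, `FILTER`, `REDUCE`, `TRANSFORM`)	Built-in functions that apply a lambda expression to the elements of an array, for example to filter or transform them.	Not supported	Supported	Supported	`eval array = json_array(1, -1, 2), result = filter(array, x -> x > 0) \| fields result`
PPL cryptographic hash functions (`MD5`, `SHA1`, `SHA2`)	Built-in functions that generate unique fingerprints of data, which can be used for verification, comparison, or as part of more complex security protocols.	Supported	Supported	Supported	eval `MD5('hello')` = MD5('hello') \| fields `MD5('hello')`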

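Additional information for CloudWatch Logs Insights users using OpenSearch PPL

Although CloudWatch Logs Insights supports most OpenSearch PPL commands and functions, some commands and functions aren't currently supported. For example, it doesn't currently support the lookup command in PPL. As of June 2, 2025, CloudWatch Logs Insights supports JOIN, subqueries, Flatten, Fillnull, Expand, Cidrmatch, and JSON functions in PPL. For a complete list of supported query commands and functions, see the CloudWatch Logs columns in the preceding tables.

Sample queries and quotas

The following applies to both CloudWatch Logs Insights users and OpenSearch users querying CloudWatch data.

For information about the limits that apply when querying CloudWatch Logs from OpenSearch Service, see CloudWatch Logs quotas in the Amazon CloudWatch Logs User Guide. Limits involve the number of CloudWatch log groups that you can query, the maximum number of concurrent queries that you can run, the maximum query execution time, and the maximum number of rows returned in results. The limits are the same regardless of which language you use to query CloudWatch Logs (namely OpenSearch PPL, SQL, and Logs Insights QL).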

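DDL commands

comment

Note

To see which AWS data source integrations support this PPL command, see Commands.

PPL supports both line comments and block comments. The system doesn't evaluate comment text.

Line comments

Line comments begin with two slashes // and end with a new line.

Example: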


os> source=accounts | top gender // finds most common gender of all the accounts
fetched rows / total rows = 2/2
+----------+
| gender   |
|----------|
| M        |
| F        |
+----------+

Block comments

Block comments begin with a slash followed by an asterisk /* and end with an asterisk followed by a slash */.

Example:


os> source=accounts | dedup 2 gender /* dedup the document with gender field keep 2 duplication */ | fields account_number, gender
fetched rows / total rows = 3/3
+------------------+----------+
| account_number   | gender   |
|------------------+----------|
| 1                | M        |
| 6                | M        |
| 13               | F        |
+------------------+----------+

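correlation command

Note

To see which AWS data source integrations support this PPL command, see Commands.

You can correlate different data sources according to common dimensions and time frames.

This correlation is crucial when you're dealing with large amounts of data from various verticals that share the same time periods but aren't formally synchronized.

By correlating these different data sources based on time frames and similar dimensions, you can enrich your data and uncover valuable insights.

Example

The observability domain has three distinct data sources:

Logs
Metrics
Traces

These data sources might share common dimensions. To transition from one data source to another, you need to correlate them correctly. Using semantic naming conventions, you can identify shared elements across logs, traces, and metrics.

Example: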


{
  "@timestamp": "2018-07-02T22:23:00.186Z",
  "aws": {
    "elb": {
      "backend": {
        "http": {
          "response": {
            "status_code": 500
          }
        },
        "ip": "********",
        "port": "80"
      },
      ...
     "target_port": [
        "10.0.0.1:80"
      ],
      "target_status_code": [
        "500"
      ],
      "traceId": "Root=1-58337262-36d228ad5d99923122bbe354",
      "type": "http"
    }
  },
  "cloud": {
    "provider": "aws"
  },
  "http": {
    "request": {
    ...
  },
  "communication": {
    "source": {
      "address": "**************",
      "ip": "**************",
      "port": 2817
    }
  },
  "traceId": "Root=1-58337262-36d228ad5d99923122bbe354"
}

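This example shows an AWS ELB log arriving from a service residing on AWS. It shows a backend HTTP response with a status code of 500, indicating an error. This could trigger an alert or be part of your regular monitoring process. Your next step is to gather relevant data around this event for a thorough investigation.

While you might be tempted to query all data related to the time frame, this approach can be overwhelming. You could end up with too much information and spend more time filtering out irrelevant data than identifying the root cause.

Instead, you can take a more targeted approach by correlating data from different sources. You can use these dimensions for correlation:

IP: "ip": "10.0.0.1" | "ip": "**************"
Port: "port": 2817 | "target_port": "10.0.0.1:80"

Assuming you have access to additional traces and metrics indices, and you're familiar with the schema structure, you can create a more precise correlation query.

Here's an example of a trace index document containing HTTP information that you might want to correlate: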


{
  "traceId": "c1d985bd02e1dbb85b444011f19a1ecc",
  "spanId": "55a698828fe06a42",
  "traceState": [],
  "parentSpanId": "",
  "name": "mysql",
  "kind": "CLIENT",
  "@timestamp": "2021-11-13T20:20:39+00:00",
  "events": [
    {
      "@timestamp": "2021-03-25T17:21:03+00:00",
       ...
    }
  ],
  "links": [
    {
      "traceId": "c1d985bd02e1dbb85b444011f19a1ecc",
      "spanId": "55a698828fe06a42w2",
      },
      "droppedAttributesCount": 0
    }
  ],
  "resource": {
    "service@name": "database",
    "telemetry@sdk@name": "opentelemetry",
    "host@hostname": "ip-172-31-10-8.us-west-2.compute.internal"
  },
  "status": {
    ...
  },
  "attributes": {
    "http": {
      "user_agent": {
        "original": "Mozilla/5.0"
      },
      "network": {
         ...
        }
      },
      "request": {
         ...
        }
      },
      "response": {
        "status_code": "200",
        "body": {
          "size": 500
        }
      },
      "client": {
        "server": {
          "socket": {
            "address": "***********",
            "domain": "example.com",
            "port": 80
          },
          "address": "***********",
          "port": 80
        },
        "resend_count": 0,
        "url": {
          "full": "http://example.com"
        }
      },
      "server": {
        "route": "/index",
        "address": "***********",
        "port": 8080,
        "socket": {
         ...
        },
        "client": {
         ...
         }
        },
        "url": {
         ...
        }
      }
    }
  }
}

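In this approach, you can see the traceId and the HTTP client/server ip that can be correlated with the ELB logs to better understand the system's behavior and condition.

New correlation query command

Here is the new command that would allow this type of investigation: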


source alb_logs, traces | where alb_logs.ip="10.0.0.1" AND alb_logs.cloud.provider="aws"| 
correlate exact fields(traceId, ip) scope(@timestamp, 1D) mapping(alb_logs.ip = traces.attributes.http.server.address, alb_logs.traceId = traces.traceId )

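Here's what each part of the command does:

source alb_logs, traces: This selects the data sources that you want to correlate.
where ip="10.0.0.1" AND cloud.provider="aws": This narrows down the scope of your search.
correlate exact fields(traceId, ip): This tells the system to correlate data based on exact matches of the following fields:
- The ip field has an explicit filter condition, so it's used in the correlation for all data sources.
- The traceId field has no explicit filter, so it's used to match the same traceId values across all data sources.

The field names indicate the logical meaning of the function within the correlation command. The actual join condition relies on the mapping statement that you provide.

The term exact means that the correlation statements require all fields to match in order to fulfill the query statement.

The term approximate attempts to match on a best-case basis and doesn't reject rows with partial matches.

Handling different field mappings

In cases where the same logical field (such as ip) has different names across your data sources, you need to provide an explicit mapping of the path fields. To address this, you can extend your correlation conditions to match different field names with similar logical meanings. Here's how you might do this: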


alb_logs.ip = traces.attributes.http.server.address, alb_logs.traceId = traces.traceId

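For each field participating in the correlation join, you should provide a relevant mapping statement that includes all tables to be joined by this correlation command.

Example

In this example, there are 2 sources: alb_logs, traces

There are 2 fields: traceId, ip

There are 2 mapping statements: alb_logs.ip = traces.attributes.http.server.address, alb_logs.traceId = traces.traceId

Scoping the correlation time frames

To simplify the work done by the execution engine (driver), you can add a scope statement. This explicitly directs the join query to the time range that it should search.

scope(@timestamp, 1D)

In this example, the search scope focuses on a daily basis, so correlations that appear on the same day are grouped together. This scoping mechanism simplifies the work and gives you better control over results, enabling incremental search resolution based on your needs.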
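The exact-match and scoping rules described above can be sketched in Python. This is only an illustration under stated assumptions, not how any PPL driver implements correlation: `correlate_exact`, `get_path`, and `same_day` are hypothetical helpers, the sample documents are made up, and the scope is assumed to mean "both timestamps fall on the same calendar day".

```python
from datetime import datetime
from typing import Any, Dict, List, Tuple

Doc = Dict[str, Any]


def get_path(doc: Doc, path: str) -> Any:
    """Read a dotted path such as 'attributes.http.server.address' from nested dicts."""
    current: Any = doc
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def same_day(left_ts: str, right_ts: str) -> bool:
    """Assumed reading of scope(@timestamp, 1D): both events share a calendar day."""
    return datetime.fromisoformat(left_ts).date() == datetime.fromisoformat(right_ts).date()


def correlate_exact(left: List[Doc], right: List[Doc],
                    mapping: List[Tuple[str, str]],
                    time_field: str = "@timestamp") -> List[Tuple[Doc, Doc]]:
    """Pair documents only when every mapped field matches and both are in scope."""
    pairs = []
    for l_doc in left:
        for r_doc in right:
            all_match = all(
                get_path(l_doc, l_path) is not None
                and get_path(l_doc, l_path) == get_path(r_doc, r_path)
                for l_path, r_path in mapping
            )
            if all_match and same_day(l_doc[time_field], r_doc[time_field]):
                pairs.append((l_doc, r_doc))
    return pairs


# Hypothetical sample data shaped like the alb_logs and traces documents.
ALB_LOGS = [{"@timestamp": "2018-07-02T22:23:00", "ip": "10.0.0.1", "traceId": "t-1"}]
TRACES = [
    {"@timestamp": "2018-07-02T09:00:00", "traceId": "t-1", "spanId": "s-1",
     "attributes": {"http": {"server": {"address": "10.0.0.1"}}}},
    {"@timestamp": "2018-07-03T09:00:00", "traceId": "t-1", "spanId": "s-2",
     "attributes": {"http": {"server": {"address": "10.0.0.1"}}}},  # out of scope
    {"@timestamp": "2018-07-02T10:00:00", "traceId": "t-9", "spanId": "s-3",
     "attributes": {"http": {"server": {"address": "10.0.0.1"}}}},  # traceId differs
]
MAPPING = [("ip", "attributes.http.server.address"), ("traceId", "traceId")]

RESULT = correlate_exact(ALB_LOGS, TRACES, MAPPING)
```

Here only the first trace pairs with the log entry: the second falls outside the one-day scope and the third fails the traceId match, which is what "exact" requires.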

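Supporting drivers

The new correlation command is actually a 'hidden' join command. Therefore, only the following PPL drivers support this command. In these drivers, the correlation command will be directly translated into the appropriate Catalyst Join logical plan.

Example

source alb_logs, traces, metrics | where ip="10.0.0.1" AND cloud.provider="aws"| correlate exact on (ip, port) scope(@timestamp, 2018-07-02T22:23:00, 1 D)

Logical plan: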


'Project [*]
+- 'Join Inner, ('ip && 'port)
   :- 'Filter (('ip === "10.0.0.1" & 'cloud.provider === "aws") & inTimeScope('@timestamp, "2018-07-02T22:23:00", "1 D"))
      +- 'UnresolvedRelation [alb_logs]
   +- 'Join Inner, ('ip & 'port)
      :- 'Filter (('ip === "10.0.0.1" & 'cloud.provider === "aws") & inTimeScope('@timestamp, "2018-07-02T22:23:00", "1 D"))
         +- 'UnresolvedRelation [traces]
      +- 'Filter (('ip === "10.0.0.1" & 'cloud.provider === "aws") & inTimeScope('@timestamp, "2018-07-02T22:23:00", "1 D"))
         +- 'UnresolvedRelation [metrics]

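The Catalyst engine optimizes this query according to the most efficient join ordering.

dedup command

Note

To see which AWS data source integrations support this PPL command, see Commands.

Use the dedup command to remove duplicate documents from your search results based on specified fields.

Syntax

Use the following syntax: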


dedup [int] <field-list> [keepempty=<bool>] [consecutive=<bool>]

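`int`

Optional.
When you specify <int>, the dedup command retains that many events for each combination. The number for <int> must be greater than 0. If you don't specify a number, only the first occurring event is kept. All other duplicates are removed from the results.
Default: 1

`keepempty`

Optional.
If true, keeps documents where any field in the field list has a NULL value or is MISSING.
Default: false

`consecutive`

Optional.
If true, removes only events with consecutive duplicate combinations of values.
Default: false

`field-list`

Required.
A comma-delimited list of fields. At least one field is required.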
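The parameter semantics above can be modeled with a short Python sketch. It is an illustration only, not the engine's implementation: the `dedup` function and the trimmed `ACCOUNTS` rows are hypothetical, and the handling of `consecutive` together with a count greater than 1, or with empty values, is an assumption.

```python
from collections import defaultdict
from typing import Any, Dict, List, Optional, Tuple

Row = Dict[str, Any]

# Hypothetical rows, reduced to the fields used for deduplication.
ACCOUNTS: List[Row] = [
    {"account_number": 1, "gender": "M", "email": "john_doe@example.com"},
    {"account_number": 6, "gender": "M", "email": "jane_doe@example.com"},
    {"account_number": 13, "gender": "F", "email": None},
    {"account_number": 18, "gender": "M", "email": "juan_li@example.com"},
]


def dedup(rows: List[Row], fields: List[str], keep: int = 1,
          keepempty: bool = False, consecutive: bool = False) -> List[Row]:
    """Keep at most `keep` rows per combination of `fields`, in input order."""
    if keep < 1:
        raise ValueError("the number of rows to keep must be greater than 0")
    result: List[Row] = []
    seen: Dict[Tuple[Any, ...], int] = defaultdict(int)
    prev_key: Optional[Tuple[Any, ...]] = None
    run_length = 0
    for row in rows:
        key = tuple(row.get(name) for name in fields)
        if any(value is None for value in key):
            # Rows with a NULL or missing field are never deduplicated:
            # they are kept as-is with keepempty=true and dropped otherwise.
            if keepempty:
                result.append(row)
            prev_key = None  # assumption: an empty row breaks a consecutive run
            continue
        if consecutive:
            # Only rows that directly repeat the previous combination count.
            run_length = run_length + 1 if key == prev_key else 1
            prev_key = key
            if run_length <= keep:
                result.append(row)
        else:
            seen[key] += 1
            if seen[key] <= keep:
                result.append(row)
    return result


def numbers(rows: List[Row]) -> List[int]:
    """Return just the account numbers, for compact comparison."""
    return [row["account_number"] for row in rows]
```

With these rows, `numbers(dedup(ACCOUNTS, ["gender"]))` gives `[1, 13]` and `numbers(dedup(ACCOUNTS, ["gender"], consecutive=True))` gives `[1, 13, 18]`, because account 18 does not directly follow another M row.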

Example 1: Dedup by one field

This example shows how to dedup documents using the gender field.

PPL query:


os> source=accounts | dedup gender | fields account_number, gender;
fetched rows / total rows = 2/2
+------------------+----------+
| account_number   | gender   |
|------------------+----------|
| 1                | M        |
| 13               | F        |
+------------------+----------+

Example 2: Keep 2 duplicate documents

This example shows how to dedup documents with the gender field, keeping two duplicates.

PPL query:


os> source=accounts | dedup 2 gender | fields account_number, gender;
fetched rows / total rows = 3/3
+------------------+----------+
| account_number   | gender   |
|------------------+----------|
| 1                | M        |
| 6                | M        |
| 13               | F        |
+------------------+----------+

Example 3: Keep or ignore the empty field by default

This example shows how to dedup the documents while keeping the null value field.

PPL query:


os> source=accounts | dedup email keepempty=true | fields account_number, email;
fetched rows / total rows = 4/4
+------------------+-----------------------+
| account_number   | email                 |
+------------------+-----------------------+
| 1                | john_doe@example.com  |
| 6                | jane_doe@example.com  |
| 13               | null                  |
| 18               | juan_li@example.com   |
+------------------+-----------------------+

This example shows how to dedup the documents while ignoring the empty value field.

PPL query:


os> source=accounts | dedup email | fields account_number, email;
fetched rows / total rows = 3/3
+------------------+-----------------------+
| account_number   | email                 |
+------------------+-----------------------+
| 1                | john_doe@example.com  |
| 6                | jane_doe@example.com  |
| 18               | juan_li@example.com   |
+------------------+-----------------------+

Example 4: Dedup in consecutive documents

This example shows how to dedup in consecutive documents.

PPL query:


os> source=accounts | dedup gender consecutive=true | fields account_number, gender;
fetched rows / total rows = 3/3
+------------------+----------+
| account_number   | gender   |
+------------------+----------+
| 1                | M        |
| 13               | F        |
| 18               | M        |
+------------------+----------+

Additional examples

source = table | dedup a | fields a,b,c
source = table | dedup a,b | fields a,b,c
source = table | dedup a keepempty=true | fields a,b,c
source = table | dedup a,b keepempty=true | fields a,b,c
source = table | dedup 1 a | fields a,b,c
source = table | dedup 1 a,b | fields a,b,c
source = table | dedup 1 a keepempty=true | fields a,b,c
source = table | dedup 1 a,b keepempty=true | fields a,b,c
source = table | dedup 2 a | fields a,b,c
source = table | dedup 2 a,b | fields a,b,c
source = table | dedup 2 a keepempty=true | fields a,b,c
source = table | dedup 2 a,b keepempty=true | fields a,b,c
source = table | dedup 1 a consecutive=true | fields a,b,c (consecutive deduplication is unsupported)

Limitation

For | dedup 2 a, b keepempty=false


DataFrameDropColumns('_row_number_)
+- Filter ('_row_number_ <= 2) // allowed duplication = 2
   +- Window [row_number() windowspecdefinition('a, 'b, 'a ASC NULLS FIRST, 'b ASC NULLS FIRST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS _row_number_], ['a, 'b], ['a ASC NULLS FIRST, 'b ASC NULLS FIRST]
       +- Filter (isnotnull('a) AND isnotnull('b)) // keepempty=false
          +- Project
             +- UnresolvedRelation

For | dedup 2 a, b keepempty=true


Union
:- DataFrameDropColumns('_row_number_)
:  +- Filter ('_row_number_ <= 2)
:     +- Window [row_number() windowspecdefinition('a, 'b, 'a ASC NULLS FIRST, 'b ASC NULLS FIRST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS _row_number_], ['a, 'b], ['a ASC NULLS FIRST, 'b ASC NULLS FIRST]
:        +- Filter (isnotnull('a) AND isnotnull('b))
:           +- Project
:              +- UnresolvedRelation
+- Filter (isnull('a) OR isnull('b))
   +- Project
      +- UnresolvedRelation

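describe command

Note

To see which AWS data source integrations support this PPL command, see Commands.

Use the describe command to get detailed information about the structure and metadata of tables, schemas, and catalogs. Here are various examples and use cases of the describe command.

Describe

describe table (equivalent to the DESCRIBE EXTENDED table SQL command)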
describe schema.table
describe schema.`table`
describe catalog.schema.table
describe catalog.schema.`table`
describe `catalog`.`schema`.`table`

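eval command

Note

To see which AWS data source integrations support this PPL command, see Commands.

The eval command evaluates the expression and appends the result to the search result.

Syntax

Use the following syntax: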


eval <field>=<expression> ["," <field>=<expression> ]...

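field: Required. If the field name doesn't exist, a new field is added. If the field name already exists, it is overridden.
expression: Required. Any expression supported by the system.

Example 1: Create the new field

This example shows how to create a new doubleAge field for each document. The new doubleAge is the result of age multiplied by 2.

PPL query: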


os> source=accounts | eval doubleAge = age * 2 | fields age, doubleAge ;
fetched rows / total rows = 4/4
+-------+-------------+
| age   | doubleAge   |
|-------+-------------|
| 32    | 64          |
| 36    | 72          |
| 28    | 56          |
| 33    | 66          |
+-------+-------------+

Example 2: Override the existing field

This example shows how to override the existing age field with age plus 1.

PPL query:


os> source=accounts | eval age = age + 1 | fields age ;
fetched rows / total rows = 4/4
+-------+
| age   |
|-------|
| 33    |
| 37    |
| 29    |
| 34    |
+-------+

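Example 3: Create the new field with a field defined in eval

This example shows how to create a new ddAge field from a field defined in the same eval command. The new ddAge field is the result of doubleAge multiplied by 2, where doubleAge is defined earlier in the eval command.

PPL query: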


os> source=accounts | eval doubleAge = age * 2, ddAge = doubleAge * 2 | fields age, doubleAge, ddAge ;
fetched rows / total rows = 4/4
+-------+-------------+---------+
| age   | doubleAge   | ddAge   |
|-------+-------------+---------|
| 32    | 64          | 128     |
| 36    | 72          | 144     |
| 28    | 56          | 112     |
| 33    | 66          | 132     |
+-------+-------------+---------+
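Example 3 works because assignments in a single eval are applied left to right, so a later expression can read a field that an earlier one created. A minimal Python sketch of that ordering follows; `eval_fields` is a hypothetical helper, and modeling each expression as a callable on the row is an assumption made for illustration.

```python
from typing import Any, Callable, Dict, List, Tuple

Row = Dict[str, Any]
Assignment = Tuple[str, Callable[[Row], Any]]


def eval_fields(row: Row, assignments: List[Assignment]) -> Row:
    """Apply assignments in order; each one sees the fields added before it."""
    out = dict(row)  # leave the input row untouched
    for name, expression in assignments:
        out[name] = expression(out)
    return out


# Mirrors: eval doubleAge = age * 2, ddAge = doubleAge * 2
ASSIGNMENTS: List[Assignment] = [
    ("doubleAge", lambda r: r["age"] * 2),
    ("ddAge", lambda r: r["doubleAge"] * 2),
]

EVALUATED = [eval_fields({"age": age}, ASSIGNMENTS) for age in (32, 36, 28, 33)]
```

Reversing the two assignments would fail in this sketch with a KeyError, because doubleAge would not exist yet when ddAge is computed.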

Assumptions: a, b, c are existing fields in table

Additional examples

source = table | eval f = 1 | fields a,b,c,f
source = table | eval f = 1 (output a,b,c,f fields)
source = table | eval n = now() | eval t = unix_timestamp(a) | fields n,t
source = table | eval f = a | where f > 1 | sort f | fields a,b,c | head 5
source = table | eval f = a * 2 | eval h = f * 2 | fields a,f,h
source = table | eval f = a * 2, h = f * 2 | fields a,f,h
source = table | eval f = a * 2, h = b | stats avg(f) by h
source = table | eval f = ispresent(a)
source = table | eval r = coalesce(a, b, c) | fields r
source = table | eval e = isempty(a) | fields e
source = table | eval e = isblank(a) | fields e
source = table | eval f = case(a = 0, 'zero', a = 1, 'one', a = 2, 'two', a = 3, 'three', a = 4, 'four', a = 5, 'five', a = 6, 'six', a = 7, 'se7en', a = 8, 'eight', a = 9, 'nine')
source = table | eval f = case(a = 0, 'zero', a = 1, 'one' else 'unknown')
source = table | eval f = case(a = 0, 'zero', a = 1, 'one' else concat(a, ' is an incorrect binary digit'))
source = table | eval f = a in ('foo', 'bar') | fields f
source = table | eval f = a not in ('foo', 'bar') | fields f

Eval with case example:


source = table | eval status_category =
case(a >= 200 AND a < 300, 'Success',
a >= 300 AND a < 400, 'Redirection',
a >= 400 AND a < 500, 'Client Error',
a >= 500, 'Server Error'
else 'Unknown')

Eval with another case example:


source = table |  where ispresent(a) |
eval status_category =
 case(a >= 200 AND a < 300, 'Success',
  a >= 300 AND a < 400, 'Redirection',
  a >= 400 AND a < 500, 'Client Error',
  a >= 500, 'Server Error'
  else 'Incorrect HTTP status code'
 )
 | stats count() by status_category
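The case expression in the query above evaluates its conditions in order and falls through to the else branch when none match. A rough Python equivalent is sketched below, assuming `where ispresent(a)` simply drops rows whose value is missing; `status_category` and `count_by_category` are hypothetical names.

```python
from collections import Counter
from typing import Iterable, Optional


def status_category(a: int) -> str:
    """Return the label of the first matching branch, like case(...) else ..."""
    if 200 <= a < 300:
        return "Success"
    if 300 <= a < 400:
        return "Redirection"
    if 400 <= a < 500:
        return "Client Error"
    if a >= 500:
        return "Server Error"
    return "Incorrect HTTP status code"


def count_by_category(codes: Iterable[Optional[int]]) -> Counter:
    """Skip missing values, then count rows per category (stats count() by ...)."""
    return Counter(status_category(a) for a in codes if a is not None)


COUNTS = count_by_category([200, 204, 404, None, 99, 503])
```

Values below 200, such as 99 here, reach the else branch because no earlier condition covers them.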

Limitations

Overriding existing fields is unsupported. Queries that attempt to do so throw an exception with the message "Reference 'a' is ambiguous".


- `source = table | eval a = 10 | fields a,b,c`
- `source = table | eval a = a * 2 | stats avg(a)`
- `source = table | eval a = abs(a) | where a > 0`
- `source = table | eval a = signum(a) | where a < 0`

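eventstats command

Note

To see which AWS data source integrations support this PPL command, see Commands.

Use the eventstats command to enrich your event data with calculated summary statistics. It analyzes specified fields within your events, computes various statistical measures, and then appends these results as new fields to each original event.

Key aspects of eventstats

It performs calculations across the entire result set or within defined groups.
The original events remain intact, with new fields added to contain the statistical results.
The command is particularly useful for comparative analysis, identifying outliers, or providing additional context to individual events.

Difference between stats and eventstats

The stats and eventstats commands are both used for calculating statistics, but they have some key differences in how they operate and what they produce.

Output format

stats: Produces a summary table with only the calculated statistics.
eventstats: Adds the calculated statistics as new fields to the existing events, preserving the original data.

Event retention

stats: Reduces the result set to only the statistical summary, discarding individual events.
eventstats: Retains all original events and adds new fields with the calculated statistics.

Use cases

stats: Best for creating summary reports or dashboards. Often used as a final command to summarize results.
eventstats: Useful when you need to enrich events with statistical context for further analysis or filtering. It can be used mid-search to add statistics that subsequent commands can use.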
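The difference in output shape can be illustrated with a small Python sketch. It is not how either command is implemented: `stats_sum` and `eventstats_sum` are hypothetical helpers, and `ACCOUNTS` is a hypothetical sample reduced to three fields.

```python
from collections import defaultdict
from typing import Any, Dict, List

Row = Dict[str, Any]

ACCOUNTS: List[Row] = [
    {"account_number": 1, "age": 32, "gender": "M"},
    {"account_number": 6, "age": 36, "gender": "M"},
    {"account_number": 13, "age": 28, "gender": "F"},
    {"account_number": 18, "age": 33, "gender": "M"},
]


def group_totals(rows: List[Row], field: str, by: str) -> Dict[Any, int]:
    """Sum `field` for each distinct value of `by`."""
    totals: Dict[Any, int] = defaultdict(int)
    for row in rows:
        totals[row[by]] += row[field]
    return totals


def stats_sum(rows: List[Row], field: str, by: str) -> List[Row]:
    """Like stats: one summary row per group; the original events are gone."""
    totals = group_totals(rows, field, by)
    return [{by: key, f"sum({field})": total} for key, total in totals.items()]


def eventstats_sum(rows: List[Row], field: str, by: str) -> List[Row]:
    """Like eventstats: every original event is kept, with its group total added."""
    totals = group_totals(rows, field, by)
    return [{**row, f"sum({field}) by {by}": totals[row[by]]} for row in rows]


SUMMARY = stats_sum(ACCOUNTS, "age", "gender")
ENRICHED = eventstats_sum(ACCOUNTS, "age", "gender")
```

`SUMMARY` holds two rows (one per gender), while `ENRICHED` still holds all four accounts, each carrying the total for its own group.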

Syntax

Use the following syntax:


eventstats <aggregation>... [by-clause]

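aggregation

Required.
An aggregation function.
The argument of the aggregation must be a field.

by-clause

Optional.
Syntax: by [span-expression,] [field,]...
The by clause can include fields and expressions such as scalar functions and aggregation functions. You can also use the span clause to split a specific field into buckets of equal intervals. The eventstats command then performs aggregation based on these span buckets.
Default: If you don't specify a by clause, the eventstats command aggregates over the entire result set.

span-expression

Optional, at most one.
Syntax: span(field_expr, interval_expr)
The unit of the interval expression is the natural unit by default. However, for date and time type fields, you need to specify the unit in the interval expression when using date/time units.

For example, to split the age field into buckets by 10 years, use span(age, 10). For time-based fields, you can split a timestamp field into hourly intervals using span(timestamp, 1h).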
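To make the bucketing concrete, here is a minimal Python sketch of equal-width spans. It assumes each bucket is identified by its lower bound and that time buckets align to the Unix epoch in UTC; the source text does not state either detail, and `span_bucket` and `span_time_bucket` are hypothetical names.

```python
from datetime import datetime, timezone


def span_bucket(value: int, interval: int) -> int:
    """Map a number to the lower bound of its bucket, as in span(age, 10)."""
    return (value // interval) * interval


def span_time_bucket(ts: datetime, interval_seconds: int) -> datetime:
    """Map a timestamp to the start of its bucket, as in span(timestamp, 1h)."""
    epoch = int(ts.timestamp())
    start = epoch - (epoch % interval_seconds)
    return datetime.fromtimestamp(start, tz=timezone.utc)


AGE_BUCKETS = [span_bucket(age, 10) for age in (32, 36, 28, 33)]
HOUR_BUCKET = span_time_bucket(
    datetime(2018, 7, 2, 22, 23, 0, tzinfo=timezone.utc), 3600
)
```

Under these assumptions the ages 32, 36, and 33 share the bucket starting at 30, age 28 falls in the bucket starting at 20, and 22:23:00 belongs to the bucket starting at 22:00:00.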

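Available time units

Span interval units
millisecond (ms)
second (s)
minute (m, case sensitive)
hour (h)
day (d)
week (w)
month (M, case sensitive)
quarter (q)
year (y)

Aggregation functions

`COUNT`

COUNT returns a count of the number of expr in the rows retrieved by a SELECT statement.

For queries that use CloudWatch Logs, COUNT is not supported.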

Example:


os> source=accounts | eventstats count();
fetched rows / total rows = 4/4
+----------------+----------+-----------+----------+-----+--------+--------------------+------------+--------------------------+--------+-------+---------+
| account_number | balance  | firstname | lastname | age | gender | address            | employer   | email                    | city   | state | count() |
+----------------+----------+-----------+----------+-----+--------+--------------------+------------+--------------------------+--------+-------+---------+
| 1              | 39225    | Jane      | Doe      | 32  | M      | *** Any Lane       | AnyCorp    | janedoe@anycorp.com      | Brogan | IL    | 4       |
| 6              | 5686     | Mary      | Major    | 36  | M      | 671 Example Street | AnyCompany | marymajor@anycompany.com | Dante  | TN    | 4       |
| 13             | 32838    | Nikki     | Wolf     | 28  | F      | 789 Any Street     | AnyOrg     |                          | Nogal  | VA    | 4       |
| 18             | 4180     | Juan      | Li       | 33  | M      | *** Example Court  |            | juanli@exampleorg.com    | Orick  | MD    | 4       |
+----------------+----------+-----------+----------+-----+--------+--------------------+------------+--------------------------+--------+-------+---------+

`SUM`

SUM(expr) returns the sum of expr.

Example:


os> source=accounts | eventstats sum(age) by gender;
fetched rows / total rows = 4/4
+----------------+----------+-----------+----------+-----+--------+-----------------------+------------+--------------------------+--------+-------+--------------------+
| account_number | balance  | firstname | lastname | age | gender | address               | employer   | email                    | city   | state | sum(age) by gender |
+----------------+----------+-----------+----------+-----+--------+-----------------------+------------+--------------------------+--------+-------+--------------------+
| 1              | 39225    | Jane      | Doe      | 32  | M      | 880 Any Lane          | AnyCorp    | janedoe@anycorp.com      | Brogan | IL    | 101                |
| 6              | 5686     | Mary      | Major    | 36  | M      | 671 Example Street    | AnyCompany | marymajor@anycompany.com | Dante  | TN    | 101                |
| 13             | 32838    | Nikki     | Wolf     | 28  | F      | 789 Any Street        | AnyOrg     |                          | Nogal  | VA    | 28                 |
| 18             | 4180     | Juan      | Li       | 33  | M      | 467 Example Court     |            | juanli@exampleorg.com    | Orick  | MD    | 101                |
+----------------+----------+-----------+----------+-----+--------+-----------------------+------------+--------------------------+--------+-------+--------------------+

`AVG`

AVG(expr) returns the average value of expr.

Example:


os> source=accounts | eventstats avg(age) by gender;
fetched rows / total rows = 4/4
+----------------+----------+-----------+----------+-----+--------+-----------------------+-------------+--------------------------+--------+-------+--------------------+
| account_number | balance  | firstname | lastname | age | gender | address               | employer    | email                    | city   | state | avg(age) by gender |
+----------------+----------+-----------+----------+-----+--------+-----------------------+-------------+--------------------------+--------+-------+--------------------+
| 1              | 39225    | Jane      | Doe      | 32  | M      | 880 Any Lane          | AnyCorp     | janedoe@anycorp.com      | Brogan | IL    | 33.67              |
| 6              | 5686     | Mary      | Major    | 36  | M      | 671 Example Street    | Any Company | marymajor@anycompany.com | Dante  | TN    | 33.67              |
| 13             | 32838    | Nikki     | Wolf     | 28  | F      | 789 Any Street        | AnyOrg      |                          | Nogal  | VA    | 28.00              |
| 18             | 4180     | Juan      | Li       | 33  | M      | 467 Example Court     |             | juanli@exampleorg.com    | Orick  | MD    | 33.67              |
+----------------+----------+-----------+----------+-----+--------+-----------------------+-------------+--------------------------+--------+-------+--------------------+

`MAX`

MAX(expr) returns the maximum value of expr.

Example:


os> source=accounts | eventstats max(age);
fetched rows / total rows = 4/4
+----------------+----------+-----------+----------+-----+--------+-----------------------+-------------+--------------------------+--------+-------+-----------+
| account_number | balance  | firstname | lastname | age | gender | address               | employer    | email                    | city   | state | max(age)  |
+----------------+----------+-----------+----------+-----+--------+-----------------------+-------------+--------------------------+--------+-------+-----------+
| 1              | 39225    | Jane      | Doe      | 32  | M      | 880 Any Lane          | AnyCorp     | janedoe@anycorp.com      | Brogan | IL    | 36        |
| 6              | 5686     | Mary      | Major    | 36  | M      | 671 Example Street    | Any Company | marymajor@anycompany.com | Dante  | TN    | 36        |
| 13             | 32838    | Nikki     | Wolf     | 28  | F      | 789 Any Street        | AnyOrg      |                          | Nogal  | VA    | 36        |
| 18             | 4180     | Juan      | Li       | 33  | M      | *** Example Court     |             | juanli@exampleorg.com    | Orick  | MD    | 36        |
+----------------+----------+-----------+----------+-----+--------+-----------------------+-------------+--------------------------+--------+-------+-----------+

`MIN`

MIN(expr) returns the minimum value of expr.

Example:


os> source=accounts | eventstats min(age);
fetched rows / total rows = 4/4
+----------------+----------+-----------+----------+-----+--------+-----------------------+-------------+--------------------------+--------+-------+-----------+
| account_number | balance  | firstname | lastname | age | gender | address               | employer    | email                    | city   | state | min(age)  |
+----------------+----------+-----------+----------+-----+--------+-----------------------+-------------+--------------------------+--------+-------+-----------+
| 1              | 39225    | Jane      | Doe      | 32  | M      | 880 Any Lane          | AnyCorp     | janedoe@anycorp.com      | Brogan | IL    | 28        |
| 6              | 5686     | Mary      | Major    | 36  | M      | 671 Example Street    | Any Company | marymajor@anycompany.com | Dante  | TN    | 28        |
| 13             | 32838    | Nikki     | Wolf     | 28  | F      | *** Any Street        | AnyOrg      |                          | Nogal  | VA    | 28        |
| 18             | 4180     | Juan      | Li       | 33  | M      | *** Example Court     |             | juanli@exampleorg.com    | Orick  | MD    | 28        |
+----------------+----------+-----------+----------+-----+--------+-----------------------+-------------+--------------------------+--------+-------+-----------+

`STDDEV_SAMP`

STDDEV_SAMP(expr) returns the sample standard deviation of expr.

Example:


os> source=accounts | eventstats stddev_samp(age);
fetched rows / total rows = 4/4
+----------------+----------+-----------+----------+-----+--------+-----------------------+-------------+--------------------------+--------+-------+------------------------+
| account_number | balance  | firstname | lastname | age | gender | address               | employer    | email                    | city   | state | stddev_samp(age)       |
+----------------+----------+-----------+----------+-----+--------+-----------------------+-------------+--------------------------+--------+-------+------------------------+
| 1              | 39225    | Jane      | Doe      | 32  | M      | *** Any Lane          | AnyCorp     | janedoe@anycorp.com      | Brogan | IL    | 3.304037933599835      |
| 6              | 5686     | Mary      | Major    | 36  | M      | 671 Example Street    | Any Company | marymajor@anycompany.com | Dante  | TN    | 3.304037933599835      |
| 13             | 32838    | Nikki     | Wolf     | 28  | F      | 789 Any Street        | AnyOrg      |                          | Nogal  | VA    | 3.304037933599835      |
| 18             | 4180     | Juan      | Li       | 33  | M      | 467 Example Court     |             | juanli@exampleorg.com    | Orick  | MD    | 3.304037933599835      |
+----------------+----------+-----------+----------+-----+--------+-----------------------+-------------+--------------------------+--------+-------+------------------------+

`STDDEV_POP`

STDDEV_POP(expr) returns the population standard deviation of expr.

Example:


os> source=accounts | eventstats stddev_pop(age);
fetched rows / total rows = 4/4
+----------------+----------+-----------+----------+-----+--------+-----------------------+-------------+--------------------------+--------+-------+------------------------+
| account_number | balance  | firstname | lastname | age | gender | address               | employer    | email                    | city   | state | stddev_pop(age)        |
+----------------+----------+-----------+----------+-----+--------+-----------------------+-------------+--------------------------+--------+-------+------------------------+
| 1              | 39225    | Jane      | Doe      | 32  | M      | 880 Any Lane          | AnyCorp     | janedoe@anycorp.com      | Brogan | IL    | 2.****************     |
| 6              | 5686     | Mary      | Major    | 36  | M      | 671 Example Street    | Any Company | marymajor@anycompany.com | Dante  | TN    | 2.****************     |
| 13             | 32838    | Nikki     | Wolf     | 28  | F      | 789 Any Street        | AnyOrg      |                          | Nogal  | VA    | 2.****************     |
| 18             | 4180     | Juan      | Li       | 33  | M      | 467 Example Court     |             | juanli@exampleorg.com    | Orick  | MD    | 2.****************     |
+----------------+----------+-----------+----------+-----+--------+-----------------------+-------------+--------------------------+--------+-------+------------------------+
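STDDEV_SAMP and STDDEV_POP differ only in the divisor (n - 1 versus n). The following Python sketch is not PPL. It only uses the standard `statistics` module to recompute both values for the four ages in the sample accounts data, and the variable names are illustrative.

```python
import statistics

# Ages of the four sample accounts used in the eventstats examples.
ages = [32, 36, 28, 33]

# Sample standard deviation: squared deviations are divided by n - 1.
sample_sd = statistics.stdev(ages)

# Population standard deviation: squared deviations are divided by n.
population_sd = statistics.pstdev(ages)

print(f"stddev_samp(age) = {sample_sd}")
print(f"stddev_pop(age)  = {population_sd}")
```

Because the population form divides by a larger number, its result is always slightly smaller than the sample form for the same data.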

PERCENTILE or PERCENTILE_APPROX

PERCENTILE(expr, percent) or PERCENTILE_APPROX(expr, percent) returns the approximate percentile value of expr at the specified percentage.

percent

The value must be a constant between 0 and 100.

Example


os> source=accounts | eventstats percentile(age, 90) by gender;
fetched rows / total rows = 4/4
+----------------+----------+-----------+----------+-----+--------+-----------------------+-------------+--------------------------+--------+-------+--------------------------------+
| account_number | balance  | firstname | lastname | age | gender | address               | employer    | email                    | city   | state | percentile(age, 90) by gender  |
+----------------+----------+-----------+----------+-----+--------+-----------------------+-------------+--------------------------+--------+-------+--------------------------------+
| 1              | 39225    | Jane      | Doe      | 32  | M      | 880 Any Lane          | AnyCorp     | janedoe@anycorp.com      | Brogan | IL    | 36                             |
| 6              | 5686     | Mary      | Major    | 36  | M      | 671 Example Street    | Any Company | marymajor@anycompany.com | Dante  | TN    | 36                             |
| 13             | 32838    | Nikki     | Wolf     | 28  | F      | 789 Any Street        | AnyOrg      |                          | Nogal  | VA    | 28                             |
| 18             | 4180     | Juan      | Li       | 33  | M      | 467 Example Court     |             | juanli@exampleorg.com    | Orick  | MD    | 36                             |
+----------------+----------+-----------+----------+-----+--------+-----------------------+-------------+--------------------------+--------+-------+--------------------------------+

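Example 1: Calculate the average, sum, and count of a field by group

This example calculates the average age, the sum of ages, and the event count for all accounts, grouped by gender.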


os> source=accounts | eventstats avg(age) as avg_age, sum(age) as sum_age, count() as count by gender;
fetched rows / total rows = 4/4
+----------------+----------+-----------+----------+-----+--------+-----------------------+-------------+--------------------------+--------+-------+-----------+-----------+-------+
| account_number | balance  | firstname | lastname | age | gender | address               | employer    | email                    | city   | state | avg_age   | sum_age   | count |
+----------------+----------+-----------+----------+-----+--------+-----------------------+-------------+--------------------------+--------+-------+-----------+-----------+-------+
| 1              | 39225    | Jane      | Doe      | 32  | M      | 880 Any Lane          | AnyCorp     | janedoe@anycorp.com      | Brogan | IL    | 33.666667 | 101       | 3     |
| 6              | 5686     | Mary      | Major    | 36  | M      | 671 Example Street    | Any Company | marymajor@anycompany.com | Dante  | TN    | 33.666667 | 101       | 3     |
| 13             | 32838    | Nikki     | Wolf     | 28  | F      | 789 Any Street        | AnyOrg      |                          | Nogal  | VA    | 28.000000 | 28        | 1     |
| 18             | 4180     | Juan      | Li       | 33  | M      | 467 Example Court     |             | juanli@exampleorg.com    | Orick  | MD    | 33.666667 | 101       | 3     |
+----------------+----------+-----------+----------+-----+--------+-----------------------+-------------+--------------------------+--------+-------+-----------+-----------+-------+
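Unlike an aggregation that collapses each group into a single row, eventstats keeps every input row and attaches the group's aggregate values to it. The following Python sketch emulates that behavior for the query above. The function name `eventstats_by` and the reduced row layout are illustrative assumptions, not part of PPL.

```python
from collections import defaultdict


def eventstats_by(rows, field, by):
    """Return every input row, enriched with avg/sum/count of `field` per `by` group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[by]].append(row[field])

    enriched = []
    for row in rows:
        values = groups[row[by]]
        enriched.append({
            **row,                               # original columns are preserved
            "avg_age": sum(values) / len(values),
            "sum_age": sum(values),
            "count": len(values),
        })
    return enriched


# Only the columns needed for the aggregation are reproduced here.
accounts = [
    {"account_number": 1, "age": 32, "gender": "M"},
    {"account_number": 6, "age": 36, "gender": "M"},
    {"account_number": 13, "age": 28, "gender": "F"},
    {"account_number": 18, "age": 33, "gender": "M"},
]

result = eventstats_by(accounts, "age", "gender")
for row in result:
    print(row)
```

The output has the same number of rows as the input, which is the main difference from grouping with stats.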

Example 2: Calculate the count by a span

This example counts ages in 10-year intervals.


os> source=accounts | eventstats count(age) by span(age, 10) as age_span
fetched rows / total rows = 4/4
+----------------+----------+-----------+----------+-----+--------+-----------------------+-------------+--------------------------+--------+-------+----------+
| account_number | balance  | firstname | lastname | age | gender | address               | employer    | email                    | city   | state | age_span |
+----------------+----------+-----------+----------+-----+--------+-----------------------+-------------+--------------------------+--------+-------+----------+
| 1              | 39225    | Jane      | Doe      | 32  | M      | 880 Any Lane          | AnyCorp     | janedoe@anycorp.com      | Brogan | IL    | 3        |
| 6              | 5686     | Mary      | Major    | 36  | M      | 671 Example Street    | Any Company | marymajor@anycompany.com | Dante  | TN    | 3        |
| 13             | 32838    | Nikki     | Wolf     | 28  | F      | 789 Any Street        | AnyOrg      |                          | Nogal  | VA    | 1        |
| 18             | 4180     | Juan      | Li       | 33  | M      | 467 Example Court     |             | juanli@exampleorg.com    | Orick  | MD    | 3        |
+----------------+----------+-----------+----------+-----+--------+-----------------------+-------------+--------------------------+--------+-------+----------+

Example 3: Calculate the count by gender and span

This example counts ages in 5-year intervals, grouped by gender.


os> source=accounts | eventstats count() as cnt by span(age, 5) as age_span, gender
fetched rows / total rows = 4/4
+----------------+----------+-----------+----------+-----+--------+-----------------------+-------------+---------------------------+--------+-------+-----+
| account_number | balance  | firstname | lastname | age | gender | address               | employer    | email                     | city   | state | cnt |
+----------------+----------+-----------+----------+-----+--------+-----------------------+-------------+---------------------------+--------+-------+-----+
| 1              | 39225    | Jane      | Doe      | 32  | M      | 880 Any Lane          | AnyCorp     | janedoe@anycorp.com       | Brogan | IL    | 2   |
| 6              | 5686     | Mary      | Major    | 36  | M      | 671 Example Street    | Any Company | marymajor@anycompany.com  | Dante  | TN    | 1   |
| 13             | 32838    | Nikki     | Wolf     | 28  | F      | 789 Any Street        | AnyOrg      |                           | Nogal  | VA    | 1   |
| 18             | 4180     | Juan      | Li       | 33  | M      | 467 Example Court     |             | juanli@exampleorg.com     | Orick  | MD    | 2   |
+----------------+----------+-----------+----------+-----+--------+-----------------------+-------------+---------------------------+--------+-------+-----+
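The counts in the two span examples can be reproduced by hand. The Python sketch below assumes that span(field, width) assigns each value to a bucket whose lower bound is the largest multiple of width not exceeding the value. That assumption matches the tables above but is not stated explicitly in this section, and the helper name `span` is only borrowed for readability.

```python
from collections import Counter


def span(value, width):
    """Lower bound of the fixed-width bucket containing `value` (assumed semantics)."""
    return (value // width) * width


# (age, gender) pairs of the four sample accounts, in table order.
accounts = [(32, "M"), (36, "M"), (28, "F"), (33, "M")]

# Example 2: eventstats count(age) by span(age, 10)
per_decade = Counter(span(age, 10) for age, _ in accounts)
counts_10 = [per_decade[span(age, 10)] for age, _ in accounts]

# Example 3: eventstats count() as cnt by span(age, 5), gender
per_bucket_gender = Counter((span(age, 5), gender) for age, gender in accounts)
counts_5 = [per_bucket_gender[(span(age, 5), gender)] for age, gender in accounts]

print(counts_10)
print(counts_5)
```

With a width of 10, ages 32, 36, and 33 share the bucket starting at 30. With a width of 5, age 36 falls into its own bucket starting at 35, which is why its count drops to 1.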

Usage

source = table | eventstats avg(a)
source = table | where a < 50 | eventstats avg(c)
source = table | eventstats max(c) by b
source = table | eventstats count(c) by b | head 5
source = table | eventstats distinct_count(c)
source = table | eventstats stddev_samp(c)
source = table | eventstats stddev_pop(c)
source = table | eventstats percentile(c, 90)
source = table | eventstats percentile_approx(c, 99)

Aggregations with span

source = table | eventstats count(a) by span(a, 10) as a_span
source = table | eventstats sum(age) by span(age, 5) as age_span | head 2
source = table | eventstats avg(age) by span(age, 20) as age_span, country | sort - age_span | head 2

Aggregations with a time window span (tumbling window function)

source = table | eventstats sum(productsAmount) by span(transactionDate, 1d) as age_date | sort age_date
source = table | eventstats sum(productsAmount) by span(transactionDate, 1w) as age_date, productId

Aggregations grouped by multiple levels

source = table | eventstats avg(age) as avg_state_age by country, state | eventstats avg(avg_state_age) as avg_country_age by country
source = table | eventstats avg(age) as avg_city_age by country, state, city | eval new_avg_city_age = avg_city_age - 1 | eventstats avg(new_avg_city_age) as avg_state_age by country, state | where avg_state_age > 18 | eventstats avg(avg_state_age) as avg_adult_country_age by country

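expand command

Note

To see which Amazon data source integrations support this PPL command, see Commands.

Use the expand command to flatten a field of one of the following types:

Array<Any>
Map<Any>

Syntax

Use the following syntax:
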

expand <field> [As alias]

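field

The field to expand (explode). It must be of a supported type.

alias

Optional. The name to use instead of the original field name.

Usage

The expand command produces one row for each element in the specified array or map field, where:

Array elements become individual rows.
Map key-value pairs are split into separate rows, with each key-value pair represented as one row.

When an alias is provided, the expanded values appear under the alias instead of the original field name.
This can be combined with other commands, such as stats, eval, and parse, to manipulate or extract data after expansion.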
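The row-per-element behavior can be pictured with a small Python emulation. This is a sketch only: the function name `expand`, the dictionary-based rows, and the choice to represent a map entry as a (key, value) tuple are assumptions made for illustration, not the engine's actual output format.

```python
def expand(rows, field, alias=None):
    """Emit one output row per element of the array or map stored in `field`."""
    out_name = alias or field          # the alias, when given, replaces the field name
    expanded = []
    for row in rows:
        value = row[field]
        rest = {k: v for k, v in row.items() if k != field}
        if isinstance(value, dict):
            elements = list(value.items())   # one element per key-value pair
        elif isinstance(value, list):
            elements = value                 # one element per array entry
        else:
            raise TypeError(f"{field} must be an array or a map")
        for element in elements:
            expanded.append({**rest, out_name: element})
    return expanded


array_rows = [{"company": "AnyCorp", "employee": ["jane", "juan"]}]
map_rows = [{"id": 1, "labels": {"env": "prod", "tier": "web"}}]

by_worker = expand(array_rows, "employee", alias="worker")
by_label = expand(map_rows, "labels")
print(by_worker)
print(by_label)
```

In the first call, the alias `worker` names the expanded column. In the second, each map entry produces its own row.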

Examples

source = table | expand employee | stats max(salary) as max by state, company
source = table | expand employee as worker | stats max(salary) as max by state, company
source = table | expand employee as worker | eval bonus = salary * 3 | fields worker, bonus
source = table | expand employee | parse description '(?<email>.+@.+)' | fields employee, email
source = table | eval array=json_array(1, 2, 3) | expand array as uid | fields name, occupation, uid
source = table | expand multi_valueA as multiA | expand multi_valueB as multiB

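explain command

Note

To see which Amazon data source integrations support this PPL command, see Commands.

The explain command helps you understand query execution plans so that you can analyze and optimize queries for better performance. This section gives a brief overview of what the explain command is for and why it matters for query optimization.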

Comment

source=accounts | top gender // finds most common gender of all the accounts (line comment)
source=accounts | dedup 2 gender /* dedup the document with gender field keep 2 duplication */ | fields account_number, gender (block comment)

Describe

describe table (this command is equivalent to the DESCRIBE EXTENDED table SQL command)
describe schema.table
describe schema.`table`
describe catalog.schema.table
describe catalog.schema.`table`
describe `catalog`.`schema`.`table`

Explain

explain simple | source = table | where a = 1 | fields a,b,c
explain extended | source = table
explain codegen | source = table | dedup a | fields a,b,c
explain cost | source = table | sort a | fields a,b,c
explain formatted | source = table | fields - a
explain simple | describe table

Fields

source = table
source = table | fields a,b,c
source = table | fields + a,b,c
source = table | fields - b,c
source = table | eval b1 = b | fields - b1,c

Field summary

source = t | fieldsummary includefields=status_code nulls=false
source = t | fieldsummary includefields= id, status_code, request_path nulls=true
source = t | where status_code != 200 | fieldsummary includefields= status_code nulls=true

Nested field

source = catalog.schema.table1, catalog.schema.table2 | fields A.nested1, B.nested1
source = catalog.table | where struct_col2.field1.subfield > 'valueA' | sort int_col | fields int_col, struct_col.field1.subfield, struct_col2.field1.subfield
source = catalog.schema.table | where struct_col2.field1.subfield > 'valueA' | sort int_col | fields int_col, struct_col.field1.subfield, struct_col2.field1.subfield

Filters

source = table | where a = 1 | fields a,b,c
source = table | where a >= 1 | fields a,b,c
source = table | where a < 1 | fields a,b,c
source = table | where b != 'test' | fields a,b,c
source = table | where c = 'test' | fields a,b,c | head 3
source = table | where ispresent(b)
source = table | where isnull(coalesce(a, b)) | fields a,b,c | head 3
source = table | where isempty(a)
source = table | where isblank(a)
source = table | where case(length(a) > 6, 'True' else 'False') = 'True'
source = table | where a not in (1, 2, 3) | fields a,b,c
source = table | where a between 1 and 4 (Note: this returns a >= 1 and a <= 4, that is, [1, 4])
source = table | where b not between '2024-09-10' and '2025-09-10' (Note: this returns b < '2024-09-10' or b > '2025-09-10')
source = table | where cidrmatch(ip, '***********/24')
source = table | where cidrmatch(ipv6, '2003:db8::/32')
source = table | trendline sma(2, temperature) as temp_trend

IP-related filters

source = table | where cidrmatch(ip, '**************')
source = table | where isV6 = false and isValid = true and cidrmatch(ipAddress, '**************')
source = table | where isV6 = true | eval inRange = case(cidrmatch(ipAddress, '2003:***::/32'), 'in' else 'out') | fields ip, inRange

Complex filters


source = table | eval status_category =
case(a >= 200 AND a < 300, 'Success',
    a >= 300 AND a < 400, 'Redirection',
    a >= 400 AND a < 500, 'Client Error',
    a >= 500, 'Server Error'
else 'Incorrect HTTP status code')
| where case(a >= 200 AND a < 300, 'Success',
    a >= 300 AND a < 400, 'Redirection',
    a >= 400 AND a < 500, 'Client Error',
    a >= 500, 'Server Error'
else 'Incorrect HTTP status code'
) = 'Incorrect HTTP status code'


source = table
| eval factor = case(a > 15, a - 14, isnull(b), a - 7, a < 3, a + 1 else 1)
| where case(factor = 2, 'even', factor = 4, 'even', factor = 6, 'even', factor = 8, 'even' else 'odd') = 'even'
| stats count() by factor

Filters with logical conditions

source = table | where c = 'test' AND a = 1 | fields a,b,c
source = table | where c != 'test' OR a > 1 | fields a,b,c | head 1
source = table | where c = 'test' NOT a > 1 | fields a,b,c

Eval

Assumption: a, b, and c are existing fields in table.

source = table | eval f = 1 | fields a,b,c,f
source = table | eval f = 1 (outputs the a, b, c, and f fields)
source = table | eval n = now() | eval t = unix_timestamp(a) | fields n,t
source = table | eval f = a | where f > 1 | sort f | fields a,b,c | head 5
source = table | eval f = a * 2 | eval h = f * 2 | fields a,f,h
source = table | eval f = a * 2, h = f * 2 | fields a,f,h
source = table | eval f = a * 2, h = b | stats avg(f) by h
source = table | eval f = ispresent(a)
source = table | eval r = coalesce(a, b, c) | fields r
source = table | eval e = isempty(a) | fields e
source = table | eval e = isblank(a) | fields e
source = table | eval f = case(a = 0, 'zero', a = 1, 'one', a = 2, 'two', a = 3, 'three', a = 4, 'four', a = 5, 'five', a = 6, 'six', a = 7, 'se7en', a = 8, 'eight', a = 9, 'nine')
source = table | eval f = case(a = 0, 'zero', a = 1, 'one' else 'unknown')
source = table | eval f = case(a = 0, 'zero', a = 1, 'one' else concat(a, ' is an incorrect binary digit'))
source = table | eval digest = md5(fieldName) | fields digest
source = table | eval digest = sha1(fieldName) | fields digest
source = table | eval digest = sha2(fieldName,256) | fields digest
source = table | eval digest = sha2(fieldName,512) | fields digest

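fillnull command

Note

To see which Amazon data source integrations support this PPL command, see Commands.

Description

Use the fillnull command to replace null values with a specified value in one or more fields of your search results.

Syntax

Use the following syntax: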


fillnull [with <null-replacement> in <nullable-field>["," <nullable-field>]] | [using <source-field> = <null-replacement> [","<source-field> = <null-replacement>]]

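null-replacement: Required. The value used to replace null values.
nullable-field: Required. A field reference. Null values in this field are replaced with the value specified in null-replacement.

Example 1: Fillnull one field

This example shows how to use fillnull on a single field: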


os> source=logs | fields status_code | eval input=status_code | fillnull with 0 in status_code;
| input | status_code |
|-------|-------------|
| 403   | 403         |
| 403   | 403         |
| NULL  | 0           |
| NULL  | 0           |
| 200   | 200         |
| 404   | 404         |
| 500   | 500         |
| NULL  | 0           |
| 500   | 500         |
| 404   | 404         |
| 200   | 200         |
| 500   | 500         |
| NULL  | 0           |
| NULL  | 0           |
| 404   | 404         |

Example 2: Fillnull applied to multiple fields

This example shows fillnull applied to multiple fields.


os> source=logs | fields request_path, timestamp | eval input_request_path=request_path, input_timestamp = timestamp | fillnull with '???' in request_path, timestamp;
| input_request_path | input_timestamp       | request_path | timestamp              |
|------------------------------------------------------------------------------------|
| /contact           | NULL                  | /contact     | ???                    |
| /home              | NULL                  | /home        | ???                    |
| /about             | 2023-10-01 10:30:00   | /about       | 2023-10-01 10:30:00    |
| /home              | 2023-10-01 10:15:00   | /home        | 2023-10-01 10:15:00    |
| NULL               | 2023-10-01 10:20:00   | ???          | 2023-10-01 10:20:00    |
| NULL               | 2023-10-01 11:05:00   | ???          | 2023-10-01 11:05:00    |
| /about             | NULL                  | /about       | ???                    |
| /home              | 2023-10-01 10:00:00   | /home        | 2023-10-01 10:00:00    |
| /contact           | NULL                  | /contact     | ???                    |
| NULL               | 2023-10-01 10:05:00   | ???          | 2023-10-01 10:05:00    |
| NULL               | 2023-10-01 10:50:00   | ???          | 2023-10-01 10:50:00    |
| /services          | NULL                  | /services    | ???                    |
| /home              | 2023-10-01 10:45:00   | /home        | 2023-10-01 10:45:00    |
| /services          | 2023-10-01 11:00:00   | /services    | 2023-10-01 11:00:00    |
| NULL               | 2023-10-01 10:35:00   | ???          | 2023-10-01 10:35:00    |

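Example 3: Fillnull applied to multiple fields with different null replacement values

This example shows fillnull with a different replacement value for each field:

/error in the request_path field
1970-01-01 00:00:00 in the timestamp field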


os> source=logs | fields request_path, timestamp | eval input_request_path=request_path, input_timestamp = timestamp | fillnull using request_path = '/error', timestamp='1970-01-01 00:00:00';

| input_request_path | input_timestamp       | request_path | timestamp              |
|------------------------------------------------------------------------------------|
| /contact           | NULL                  | /contact     | 1970-01-01 00:00:00    |
| /home              | NULL                  | /home        | 1970-01-01 00:00:00    |
| /about             | 2023-10-01 10:30:00   | /about       | 2023-10-01 10:30:00    |
| /home              | 2023-10-01 10:15:00   | /home        | 2023-10-01 10:15:00    |
| NULL               | 2023-10-01 10:20:00   | /error       | 2023-10-01 10:20:00    |
| NULL               | 2023-10-01 11:05:00   | /error       | 2023-10-01 11:05:00    |
| /about             | NULL                  | /about       | 1970-01-01 00:00:00    |
| /home              | 2023-10-01 10:00:00   | /home        | 2023-10-01 10:00:00    |
| /contact           | NULL                  | /contact     | 1970-01-01 00:00:00    |
| NULL               | 2023-10-01 10:05:00   | /error       | 2023-10-01 10:05:00    |
| NULL               | 2023-10-01 10:50:00   | /error       | 2023-10-01 10:50:00    |
| /services          | NULL                  | /services    | 1970-01-01 00:00:00    |
| /home              | 2023-10-01 10:45:00   | /home        | 2023-10-01 10:45:00    |
| /services          | 2023-10-01 11:00:00   | /services    | 2023-10-01 11:00:00    |
| NULL               | 2023-10-01 10:35:00   | /error       | 2023-10-01 10:35:00    |
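The two forms of fillnull differ only in whether one replacement value is shared by all listed fields (with ... in) or each field gets its own (using ...). The following Python sketch emulates both on a few rows modeled on the examples above. The function names and the use of `None` for NULL are illustrative assumptions.

```python
def fillnull_with(rows, replacement, fields):
    """Emulates: fillnull with <replacement> in <field>, <field>, ..."""
    return [
        {k: (replacement if k in fields and v is None else v) for k, v in row.items()}
        for row in rows
    ]


def fillnull_using(rows, replacements):
    """Emulates: fillnull using <field> = <replacement>, <field> = <replacement>, ..."""
    return [
        {k: (replacements[k] if k in replacements and v is None else v)
         for k, v in row.items()}
        for row in rows
    ]


logs = [
    {"request_path": "/contact", "timestamp": None},
    {"request_path": None, "timestamp": "2023-10-01 10:20:00"},
    {"request_path": "/home", "timestamp": "2023-10-01 10:15:00"},
]

same_value = fillnull_with(logs, "???", ["request_path", "timestamp"])
per_field = fillnull_using(
    logs, {"request_path": "/error", "timestamp": "1970-01-01 00:00:00"}
)
print(same_value)
print(per_field)
```

Non-null values pass through unchanged in both forms, as the third row shows.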

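fields command

Note

To see which Amazon data source integrations support this PPL command, see Commands.

Use the fields command to keep or remove fields from the search result.

Syntax

Use the following syntax: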


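fields [+|-] <field-list>

index: Optional.

If the plus sign (+) is used, only the fields specified in the field list are kept.

If the minus sign (-) is used, all the fields specified in the field list are removed.

Default: +
field list: Required. A comma-delimited list of fields to keep or remove.

Example 1: Select specified fields from the result

This example shows how to retrieve the account_number, firstname, and lastname fields from search results.

PPL query: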


os> source=accounts | fields account_number, firstname, lastname;
fetched rows / total rows = 4/4
+------------------+-------------+------------+
| account_number   | firstname   | lastname   |
|------------------+-------------+------------|
| 1                | Jane        | Doe        |
| 6                | John        | Doe        |
| 13               | Jorge       | Souza      |
| 18               | Juan        | Li         |
+------------------+-------------+------------+

Example 2: Remove specified fields from the result

This example shows how to remove the account_number field from search results.

PPL query:


os> source=accounts | fields account_number, firstname, lastname | fields - account_number ;
fetched rows / total rows = 4/4
+-------------+------------+
| firstname   | lastname   |
|-------------+------------|
| Jane        | Doe        |
| John        | Doe        |
| Jorge       | Souza      |
| Juan        | Li         |
+-------------+------------+

Additional examples

source = table
source = table | fields a,b,c
source = table | fields + a,b,c
source = table | fields - b,c
source = table | eval b1 = b | fields - b1,c

Nested field examples:


`source = catalog.schema.table1, catalog.schema.table2 | fields A.nested1, B.nested1`
`source = catalog.table | where struct_col2.field1.subfield > 'valueA' | sort int_col | fields  int_col, struct_col.field1.subfield, struct_col2.field1.subfield`
`source = catalog.schema.table | where struct_col2.field1.subfield > 'valueA' | sort int_col | fields  int_col, struct_col.field1.subfield, struct_col2.field1.subfield`

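flatten command

Note

To see which Amazon data source integrations support this PPL command, see Commands.

Use the flatten command to flatten fields of the following types:

struct<?,?>
array<struct<?,?>>

Syntax

Use the following syntax: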


flatten <field>

field: The field to be flattened. The field must be of a supported type.

Schema

col_name	data_type
_time	string
bridges	array<struct<length:bigint,name:string>>
city	string
coor	struct<alt:bigint,lat:double,long:double>
country	string

Data

_time	bridges	city	coor	country
2024-09-13T12:00:00	[{801, Tower Bridge}, {928, London Bridge}]	London	{35, 51.5074, -0.1278}	England
2024-09-13T12:00:00	[{232, Pont Neuf}, {160, Pont Alexandre III}]	Paris	{35, 48.8566, 2.3522}	France
2024-09-13T12:00:00	[{48, Rialto Bridge}, {11, Bridge of Sighs}]	Venice	{2, 45.4408, 12.3155}	Italy
2024-09-13T12:00:00	[{516, Charles Bridge}, {343, Legion Bridge}]	Prague	{200, 50.0755, 14.4378}	Czech Republic
2024-09-13T12:00:00	[{375, Chain Bridge}, {333, Liberty Bridge}]	Budapest	{96, 47.4979, 19.0402}	Hungary
1990-09-13T12:00:00	NULL	Warsaw	NULL	Poland

Example 1: Flatten a struct

This example shows how to flatten a struct field.

PPL query:


source=table | flatten coor

_time	bridges	city	country	alt	lat	long
2024-09-13T12:00:00	[{801, Tower Bridge}, {928, London Bridge}]	London	England	35	51.5074	-0.1278
2024-09-13T12:00:00	[{232, Pont Neuf}, {160, Pont Alexandre III}]	Paris	France	35	48.8566	2.3522
2024-09-13T12:00:00	[{48, Rialto Bridge}, {11, Bridge of Sighs}]	Venice	Italy	2	45.4408	12.3155
2024-09-13T12:00:00	[{516, Charles Bridge}, {343, Legion Bridge}]	Prague	Czech Republic	200	50.0755	14.4378
2024-09-13T12:00:00	[{375, Chain Bridge}, {333, Liberty Bridge}]	Budapest	Hungary	96	47.4979	19.0402
1990-09-13T12:00:00	NULL	Warsaw	Poland	NULL	NULL	NULL

Example 2: Flatten an array

This example shows how to flatten an array of struct fields.

PPL query:


source=table | flatten bridges

_time	city	coor	country	length	name
2024-09-13T12:00:00	London	{35, 51.5074, -0.1278}	England	801	Tower Bridge
2024-09-13T12:00:00	London	{35, 51.5074, -0.1278}	England	928	London Bridge
2024-09-13T12:00:00	Paris	{35, 48.8566, 2.3522}	France	232	Pont Neuf
2024-09-13T12:00:00	Paris	{35, 48.8566, 2.3522}	France	160	Pont Alexandre III
2024-09-13T12:00:00	Venice	{2, 45.4408, 12.3155}	Italy	48	Rialto Bridge
2024-09-13T12:00:00	Venice	{2, 45.4408, 12.3155}	Italy	11	Bridge of Sighs
2024-09-13T12:00:00	Prague	{200, 50.0755, 14.4378}	Czech Republic	516	Charles Bridge
2024-09-13T12:00:00	Prague	{200, 50.0755, 14.4378}	Czech Republic	343	Legion Bridge
2024-09-13T12:00:00	Budapest	{96, 47.4979, 19.0402}	Hungary	375	Chain Bridge
2024-09-13T12:00:00	Budapest	{96, 47.4979, 19.0402}	Hungary	333	Liberty Bridge
1990-09-13T12:00:00	Warsaw	NULL	Poland	NULL	NULL

Example 3: Flatten an array and a struct

This example shows how to flatten multiple fields.

PPL query:


source=table | flatten bridges | flatten coor

_time	city	country	length	name	alt	lat	long
2024-09-13T12:00:00	London	England	801	Tower Bridge	35	51.5074	-0.1278
2024-09-13T12:00:00	London	England	928	London Bridge	35	51.5074	-0.1278
2024-09-13T12:00:00	Paris	France	232	Pont Neuf	35	48.8566	2.3522
2024-09-13T12:00:00	Paris	France	160	Pont Alexandre III	35	48.8566	2.3522
2024-09-13T12:00:00	Venice	Italy	48	Rialto Bridge	2	45.4408	12.3155
2024-09-13T12:00:00	Venice	Italy	11	Bridge of Sighs	2	45.4408	12.3155
2024-09-13T12:00:00	Prague	Czech Republic	516	Charles Bridge	200	50.0755	14.4378
2024-09-13T12:00:00	Prague	Czech Republic	343	Legion Bridge	200	50.0755	14.4378
2024-09-13T12:00:00	Budapest	Hungary	375	Chain Bridge	96	47.4979	19.0402
2024-09-13T12:00:00	Budapest	Hungary	333	Liberty Bridge	96	47.4979	19.0402
1990-09-13T12:00:00	Warsaw	Poland	NULL	NULL	NULL	NULL	NULL
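The three flatten examples follow the same rule: a struct becomes one column per subfield, an array of structs additionally produces one row per element, and a NULL field still yields a single row with NULL subfields. The Python sketch below emulates this. The function name `flatten`, the explicit `subfields` argument (standing in for the table schema), and the use of `None` for NULL are assumptions made for illustration.

```python
def flatten(rows, field, subfields):
    """Replace `field` with one column per subfield; arrays add one row per element."""
    flattened = []
    for row in rows:
        rest = {k: v for k, v in row.items() if k != field}
        value = row[field]
        if value is None:
            structs = [dict.fromkeys(subfields)]   # NULL field: one row of NULL subfields
        elif isinstance(value, list):
            structs = value                        # array<struct>: one row per element
        else:
            structs = [value]                      # struct: a single row
        for struct in structs:
            flattened.append({**rest, **{name: struct.get(name) for name in subfields}})
    return flattened


# Two rows modeled on the sample data (London and Warsaw).
table = [
    {
        "city": "London",
        "bridges": [
            {"length": 801, "name": "Tower Bridge"},
            {"length": 928, "name": "London Bridge"},
        ],
        "coor": {"alt": 35, "lat": 51.5074, "long": -0.1278},
    },
    {"city": "Warsaw", "bridges": None, "coor": None},
]

# Emulates: source=table | flatten bridges | flatten coor
step1 = flatten(table, "bridges", ("length", "name"))
flat = flatten(step1, "coor", ("alt", "lat", "long"))
for row in flat:
    print(row)
```

Flattened columns are appended after the remaining columns, which mirrors the column order in the result tables above.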

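grok command

Note

To see which Amazon data source integrations support this PPL command, see Commands.

The grok command parses a text field with a grok pattern and appends the results to the search result.

Syntax

Use the following syntax: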


grok <field> <pattern>

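field

Required.
The field must be a text field.

pattern

Required.
The grok pattern used to extract new fields from the given text field.
If a new field name already exists, it replaces the original field.

Grok pattern

The grok pattern is matched against the text field of each document to extract new fields.

Example 1: Create a new field

This example shows how to create a new field, host, for each document. host is the hostname that follows @ in the email field. Parsing a null field returns an empty string.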


os> source=accounts | grok email '.+@%{HOSTNAME:host}' | fields email, host ;
fetched rows / total rows = 4/4
+-------------------------+-------------+
| email                   | host        |
|-------------------------+-------------|
| jane_doe@example.com    | example.com |
| arnav_desai@example.net | example.net |
| null                    |             |
| juan_li@example.org     | example.org |
+-------------------------+-------------+

Example 2: Override an existing field

This example shows how to override the existing address field with the street number removed.


os> source=accounts | grok address '%{NUMBER} %{GREEDYDATA:address}' | fields address ;
fetched rows / total rows = 4/4
+------------------+
| address          |
|------------------|
| Example Lane     |
| Any Street       |
| Main Street      |
| Example Court    |
+------------------+

Example 3: Use grok to parse logs

This example shows how to use grok to parse raw logs.


os> source=apache | grok message '%{COMMONAPACHELOG}' | fields COMMONAPACHELOG, timestamp, response, bytes ;
fetched rows / total rows = 4/4
+-----------------------------------------------------------------------------------------------------------------------------+----------------------------+------------+---------+
| COMMONAPACHELOG                                                                                                             | timestamp                  | response   | bytes   |
|-----------------------------------------------------------------------------------------------------------------------------+----------------------------+------------+---------|
| 177.95.8.74 - upton5450 [28/Sep/2022:10:15:57 -0700] "HEAD /e-business/mindshare HTTP/1.0" 404 19927                        | 28/Sep/2022:10:15:57 -0700 | 404        | 19927   |
| 127.45.152.6 - pouros8756 [28/Sep/2022:10:15:57 -0700] "GET /architectures/convergence/niches/mindshare HTTP/1.0" 100 28722 | 28/Sep/2022:10:15:57 -0700 | 100        | 28722   |
| *************** - - [28/Sep/2022:10:15:57 -0700] "PATCH /strategize/out-of-the-box HTTP/1.0" 401 27439                      | 28/Sep/2022:10:15:57 -0700 | 401        | 27439   |
| ************** - - [28/Sep/2022:10:15:57 -0700] "POST /users HTTP/1.1" 301 9481                                             | 28/Sep/2022:10:15:57 -0700 | 301        | 9481    |
+-----------------------------------------------------------------------------------------------------------------------------+----------------------------+------------+---------+

Limitations

The grok command has the same limitations as the parse command.

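head command

Note

To see which Amazon data source integrations support this PPL command, see Commands.

Use the head command to return the first N results, after an optional offset, in search order.

Syntax

Use the following syntax: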


head [<size>] [from <offset>]

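<size>

An optional integer.
The number of results to return.
Default: 10

<offset>

An optional integer after from.
The number of results to skip.
Default: 0

Example 1: Get the first 10 results

This example shows how to retrieve a maximum of 10 results from the accounts index.

PPL query: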


os> source=accounts | fields firstname, age | head;
fetched rows / total rows = 4/4
+-------------+-------+
| firstname   | age   |
|-------------+-------|
| Jane        | 32    |
| John        | 36    |
| Jorge       | 28    |
| Juan        | 33    |
+-------------+-------+

Example 2: Get the first N results

This example shows the first N results from the accounts index.

PPL query:


os> source=accounts | fields firstname, age | head 3;
fetched rows / total rows = 3/3
+-------------+-------+
| firstname   | age   |
|-------------+-------|
| Jane        | 32    |
| John        | 36    |
| Jorge       | 28    |
+-------------+-------+

Example 3: Get the first N results after offset M

This example shows how to retrieve the first N results after skipping M results from the accounts index.

PPL query:


os> source=accounts | fields firstname, age | head 3 from 1;
fetched rows / total rows = 3/3
+-------------+-------+
| firstname   | age   |
|-------------+-------|
| John        | 36    |
| Jorge       | 28    |
| Juan        | 33    |
+-------------+-------+
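The size and offset parameters behave like a slice over the ordered result set. As a rough illustration (the function name `head` and the tuple rows are assumptions for this sketch, not an engine API), the three examples above correspond to the following Python:

```python
def head(rows, size=10, offset=0):
    """Skip `offset` rows, then return at most `size` rows."""
    return rows[offset:offset + size]


# (firstname, age) pairs in the order shown in Example 1.
accounts = [("Jane", 32), ("John", 36), ("Jorge", 28), ("Juan", 33)]

first_ten = head(accounts)                  # head
first_three = head(accounts, 3)             # head 3
three_after_one = head(accounts, 3, 1)      # head 3 from 1

print(first_ten)
print(first_three)
print(three_after_one)
```

Because only four rows exist, the default size of 10 returns all of them, which matches the 4/4 row count in Example 1.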

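join command

Note

To see which Amazon data source integrations support this PPL command, see Commands.

The join command lets you combine data from multiple sources based on common fields, so that you can run complex analyses and gain deeper insight from distributed datasets.

Schema

There are at least two indexes: otel-v1-apm-span-* (large) and otel-v1-apm-service-map (small).

Relevant fields from the indexes: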

`otel-v1-apm-span-*`

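traceId: A unique identifier for a trace. All spans from the same trace share the same traceId.
spanId: A unique identifier for a span within a trace, assigned when the span is created.
parentSpanId: The spanId of this span's parent span. If this is a root span, this field must be empty.
durationInNanos: The difference between the start time and the end time, in nanoseconds. (This is latency in the UI.)
serviceName: The resource from which the span originates.
traceGroup: The name of the trace's root span.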

`otel-v1-apm-service-map`

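serviceName: The name of the service that emitted the span.
destination.domain: The serviceName of the service being called by this client.
destination.resource: The span name (API, operation, and so on) being called by this client.
target.domain: The serviceName of the service being called by a client.
target.resource: The span name (API, operation, and so on) being called by a client.
traceGroupName: The top-level span name that started the request chain.

Requirement

Support join to calculate the following:

For each service, join the span index with the service map index to calculate metrics under different types of filters.

This sample query calculates latency for the order service when filtered by the trace group client_cancel_order.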


SELECT avg(durationInNanos)
FROM `otel-v1-apm-span-000001` t1
WHERE t1.serviceName = 'order'
  AND ((t1.name in
          (SELECT target.resource
           FROM `otel-v1-apm-service-map`
           WHERE serviceName = 'order'
             AND traceGroupName = 'client_cancel_order')
        AND t1.parentSpanId IS NOT NULL)
       OR (t1.parentSpanId IS NULL
           AND t1.name = 'client_cancel_order'))
  AND t1.traceId in
    (SELECT traceId
     FROM `otel-v1-apm-span-000001`
     WHERE serviceName = 'order')

Migrate to PPL

Syntax of the join command


SEARCH source=<left-table>
| <other piped command>
| [joinType] JOIN
    [leftAlias]
    ON joinCriteria
    <right-table>
| <other piped command>

Rewriting


SEARCH source=otel-v1-apm-span-000001
| WHERE serviceName = 'order'
| JOIN left=t1 right=t2
    ON t1.traceId = t2.traceId AND t2.serviceName = 'order'
    otel-v1-apm-span-000001 -- self inner join
| EVAL s_name = t1.name -- rename to avoid ambiguity
| EVAL s_parentSpanId = t1.parentSpanId -- RENAME command would be better when it is supported
| EVAL s_durationInNanos = t1.durationInNanos
| FIELDS s_name, s_parentSpanId, s_durationInNanos -- reduce columns in join
| LEFT JOIN left=s1 right=t3
    ON s_name = t3.target.resource AND t3.serviceName = 'order' AND t3.traceGroupName = 'client_cancel_order'
    otel-v1-apm-service-map
| WHERE (s_parentSpanId IS NOT NULL OR (s_parentSpanId IS NULL AND s_name = 'client_cancel_order'))
| STATS avg(s_durationInNanos) -- no need to add an alias if there is no ambiguity

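joinType

Syntax: INNER | LEFT OUTER | CROSS
Optional
The type of join to perform. If not specified, the default is INNER.

leftAlias

Syntax: left = <leftAlias>
Optional
The subquery alias to use for the left side of the join, to avoid ambiguous naming.

joinCriteria

Syntax: <expression>
Required
The syntax starts with ON. It can be any comparison expression. Generally, the join criteria look like <leftAlias>.<leftField>=<rightAlias>.<rightField>.

For example: l.id = r.id. If the join criteria contain multiple conditions, you can specify AND and OR operators between the comparison expressions. For example, l.id = r.id AND l.email = r.email AND (r.age > 65 OR r.age < 18).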
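To make the interaction between joinType and joinCriteria concrete, the following Python sketch evaluates the multi-condition criteria from the paragraph above with a naive nested loop. It is an illustration only: the function name `join`, the sample rows, and the nested-loop strategy are assumptions, and say nothing about how the query engine actually executes a join.

```python
def join(left_rows, right_rows, criteria, join_type="INNER"):
    """Pair left and right rows; `criteria(l, r)` plays the role of the ON expression."""
    if join_type not in ("INNER", "LEFT OUTER", "CROSS"):
        raise ValueError(f"unsupported join type: {join_type}")
    pairs = []
    for l in left_rows:
        matched = False
        for r in right_rows:
            if join_type == "CROSS" or criteria(l, r):
                matched = True
                pairs.append((l, r))
        if join_type == "LEFT OUTER" and not matched:
            pairs.append((l, None))    # keep the unmatched left row
    return pairs


def criteria(l, r):
    # l.id = r.id AND l.email = r.email AND (r.age > 65 OR r.age < 18)
    return (l["id"] == r["id"] and l["email"] == r["email"]
            and (r["age"] > 65 or r["age"] < 18))


left = [
    {"id": 1, "email": "jane@example.com"},
    {"id": 2, "email": "juan@example.com"},
]
right = [
    {"id": 1, "email": "jane@example.com", "age": 70},
    {"id": 2, "email": "juan@example.com", "age": 33},
]

inner = join(left, right, criteria)                 # default join type
left_outer = join(left, right, criteria, "LEFT OUTER")
cross = join(left, right, criteria, "CROSS")
print(len(inner), len(left_outer), len(cross))
```

Only the first pair satisfies all three conditions, so the inner join returns one pair, while the left outer join also keeps the second left row with no partner.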

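lookup command

Note

To see which Amazon data source integrations support this PPL command, see Commands.

Use the lookup command to enrich your search data by adding or replacing data from a lookup index (dimension table). This command lets you extend the fields of an index with values from a dimension table. You can also use it to append or replace values when the lookup conditions are met. The lookup command is more suitable than the join command for enriching source data with a static dataset.

Syntax

Use the following syntax: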


SEARCH source=<sourceIndex>
| <other piped command>
| LOOKUP <lookupIndex> (<lookupMappingField> [AS <sourceMappingField>])...
    [(REPLACE | APPEND) (<inputField> [AS <outputField>])...]
| <other piped command>

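lookupIndex

Required.
The name of the lookup index (dimension table).

lookupMappingField

Required.
A mapping key in the lookup index, analogous to a join key from the right table. You can specify multiple fields, separated by commas.

sourceMappingField

Optional.
Default: <lookupMappingField>.
A mapping key from the source query, analogous to a join key from the left side.

inputField

Optional.
Default: All fields of the lookup index in which matched values are found.
A field in the lookup index whose matched values are applied to the result output. You can specify multiple fields, separated by commas.

outputField

Optional.
Default: <inputField>.
A field in the output. You can specify multiple output fields. If you specify an existing field name from the source query, its values are replaced or appended by the matched values from inputField. If you specify a new field name, it is added to the results.

REPLACE | APPEND

Optional.
Default: REPLACE
Specifies how to handle matched values. If you specify REPLACE, matched values in the <lookupIndex> field overwrite the values in the result. If you specify APPEND, matched values in the <lookupIndex> field are only appended to the missing values in the result.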
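The difference between REPLACE and APPEND is easiest to see on a row that already has a value in the output field. The Python sketch below emulates `LOOKUP <lookupIndex> id AS cid REPLACE mail AS email` and its APPEND counterpart. The function name `lookup`, the dictionary arguments, and two behaviors (the first matching lookup row wins, and unmatched source rows pass through unchanged) are assumptions of this sketch that this section does not specify.

```python
def lookup(source_rows, lookup_rows, mapping, fields, strategy="REPLACE"):
    """
    mapping: {lookupMappingField: sourceMappingField}
    fields:  {inputField: outputField}
    """
    result = []
    for row in source_rows:
        # Assumption: the first lookup row whose mapping keys all match is used.
        match = next(
            (lk for lk in lookup_rows
             if all(lk.get(lf) == row.get(sf) for lf, sf in mapping.items())),
            None,
        )
        new_row = dict(row)  # assumption: unmatched rows are returned unchanged
        if match is not None:
            for input_field, output_field in fields.items():
                # REPLACE always overwrites; APPEND only fills missing values.
                if strategy == "REPLACE" or new_row.get(output_field) is None:
                    new_row[output_field] = match.get(input_field)
        result.append(new_row)
    return result


source = [
    {"cid": 1, "email": "old@example.com"},
    {"cid": 2, "email": None},
]
dimension = [
    {"id": 1, "mail": "new@example.com"},
    {"id": 2, "mail": "two@example.com"},
]

replaced = lookup(source, dimension, {"id": "cid"}, {"mail": "email"}, "REPLACE")
appended = lookup(source, dimension, {"id": "cid"}, {"mail": "email"}, "APPEND")
print(replaced)
print(appended)
```

With REPLACE, the existing address of the first row is overwritten. With APPEND, it is kept and only the missing value in the second row is filled in.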

Usage

LOOKUP <lookupIndex> id AS cid REPLACE mail AS email
LOOKUP <lookupIndex> name REPLACE mail AS email
LOOKUP <lookupIndex> id AS cid, name APPEND address, mail AS email
LOOKUP <lookupIndex> id

Examples

See the following examples.


SEARCH source=<sourceIndex>
| WHERE orderType = 'Cancelled'
| LOOKUP account_list, mkt_id AS mkt_code REPLACE amount, account_name AS name
| STATS count(mkt_code), avg(amount) BY name


SEARCH source=<sourceIndex>
| DEDUP market_id
| EVAL category=replace(category, "-", ".")
| EVAL category=ltrim(category, "dvp.")
| LOOKUP bounce_category category AS category APPEND classification


SEARCH source=<sourceIndex>
| LOOKUP bounce_category category

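parse command

The parse command parses a text field with a regular expression and appends the result to the search result.

Note

To see which Amazon data source integrations support this PPL command, see Commands.

Syntax

Use the following syntax: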


parse <field> <pattern>

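`field`

Required.
The field must be a text field.

`pattern`

Required string.
The regular expression pattern used to extract new fields from the given text field.
If a new field name already exists, it replaces the original field.

Regular expression

The regular expression pattern is matched against the whole text field of each document using the Java regex engine. Each named capture group in the expression becomes a new STRING field.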
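The whole-field matching and the one-field-per-named-group behavior can be approximated in Python. Note that this is only an analogy: PPL uses the Java regex engine, where a named group is written `(?<host>...)`, whereas Python's `re` module requires `(?P<host>...)`. The function name `parse` and the choice to return an empty string for text that does not match are assumptions of this sketch; the section only states the empty-string result for null fields.

```python
import re


def parse(rows, field, pattern):
    """Add one string column per named capture group of `pattern` to every row."""
    regex = re.compile(pattern)
    group_names = list(regex.groupindex)
    parsed = []
    for row in rows:
        text = row.get(field)
        # fullmatch mirrors "matched against the whole text field".
        match = regex.fullmatch(text) if text is not None else None
        extracted = {
            name: (match.group(name) or "") if match else ""
            for name in group_names
        }
        parsed.append({**row, **extracted})  # an existing column of the same name is replaced
    return parsed


accounts = [
    {"email": "jane_doe@example.com", "address": "880 Example Lane"},
    {"email": None, "address": "671 Example Street"},
]

with_host = parse(accounts, "email", r".+@(?P<host>.+)")
overridden = parse(accounts, "address", r"\d+ (?P<address>.+)")
print(with_host)
print(overridden)
```

The second call names its capture group after the source field, so the extracted street name replaces the original address value.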

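Example 1: Create a new field

This example shows how to create a new field, host, for each document. host is the hostname that follows @ in the email field. Parsing a null field returns an empty string.

PPL query: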


os> source=accounts | parse email '.+@(?<host>.+)' | fields email, host ;
fetched rows / total rows = 4/4
+-----------------------+-------------+
| email                 | host        |
|-----------------------+-------------|
| jane_doe@example.com  | example.com |
| john_doe@example.net  | example.net |
| null                  |             |
| juan_li@example.org   | example.org |
+-----------------------+-------------+

Example 2: Override an existing field

This example shows how to override the existing address field with the street number removed.

PPL query:


os> source=accounts | parse address '\d+ (?<address>.+)' | fields address ;
fetched rows / total rows = 4/4
+------------------+
| address          |
|------------------|
| Example Lane     |
| Example Street   |
| Example Avenue   |
| Example Court    |
+------------------+

Example 3: Filter and sort by a cast parsed field

This example shows how to sort street numbers in the address field that are greater than 500.

PPL query:


os> source=accounts | parse address '(?<streetNumber>\d+) (?<street>.+)' | where cast(streetNumber as int) > 500 | sort num(streetNumber) | fields streetNumber, street ;
fetched rows / total rows = 3/3
+----------------+----------------+
| streetNumber   | street         |
|----------------+----------------|
| ***            | Example Street |
| ***            | Example Avenue |
| 880            | Example Lane   |
+----------------+----------------+

Limitations

The parse command has the following limitations:

Fields defined by parse cannot be parsed again.

The following command will not work:


source=accounts | parse address '\d+ (?<street>.+)' | parse street '\w+ (?<road>\w+)'

Fields defined by parse cannot be overridden by other commands.

where does not match any documents because street cannot be overridden:
```
source=accounts | parse address '\d+ (?<street>.+)' | eval street='1' | where street='1' ;        
```
The text field used by parse cannot be overridden.

street cannot be parsed successfully because address has been overridden:
```
source=accounts | parse address '\d+ (?<street>.+)' | eval address='1' ;        
```
Fields defined by parse cannot be filtered or sorted after they are used in the stats command.

The where clause in the following command will not work:
```
source=accounts | parse email '.+@(?<host>.+)' | stats avg(age) by host | where host=pyrami.com ;        
```

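patterns command

Note

To see which Amazon data source integrations support this PPL command, see Commands.

Use the patterns command to extract log patterns from a text field and append the results to the search result. Grouping logs by their patterns makes it easier to aggregate statistics from large volumes of log data for analysis and troubleshooting.

Syntax

Use the following syntax: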


patterns [new_field=<new-field-name>] [pattern=<pattern>] <field>

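new-field-name

Optional string.
The name of the new field for extracted patterns.
The default is patterns_field.
If the name already exists, it replaces the original field.

pattern

Optional string.
The regex pattern of characters to filter out of the text field.
If absent, the default pattern is alphanumeric characters ([a-zA-Z\d]).

field

Required.
The field must be a text field.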
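
The filtering described for the pattern parameter can be sketched in Python as a plain regex substitution. This is an illustration under assumptions, not the service's implementation: the `extract_pattern` helper and the sample strings are hypothetical, and Python's `re` module stands in for the actual regex engine.

```python
import re

DEFAULT_PATTERN = r"[a-zA-Z\d]"  # default taken from the parameter description


def extract_pattern(text, pattern=DEFAULT_PATTERN):
    """Remove every character matching pattern; a null field yields an empty string."""
    if text is None:
        return ""
    return re.sub(pattern, "", text)


punct = extract_pattern("user@example.com")
no_numbers = extract_pattern("upton5450 HTTP/1.0", pattern=r"[0-9]")
```

With the default pattern only punctuation survives. With the custom pattern `[0-9]` only the digits are removed.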

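Example 1: Create a new field

This example shows how to extract the punctuation from the email field for each document. Parsing a null field returns an empty string.

PPL query: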


os> source=accounts | patterns email | fields email, patterns_field ;
fetched rows / total rows = 4/4
+-----------------------+------------------+
| email                 | patterns_field   |
|-----------------------+------------------|
| jane_doe@example.com  | @.               |
| john_doe@example.net  | @.               |
| null                  |                  |
| juan_li@example.org   | @.               |
+-----------------------+------------------+

Example 2: Extract log patterns

This example shows how to extract punctuation from a raw log field using the default pattern.

PPL query:


os> source=apache | patterns message | fields message, patterns_field ;
fetched rows / total rows = 4/4
+-----------------------------------------------------------------------------------------------------------------------------+---------------------------------+
| message                                                                                                                     | patterns_field                  |
|-----------------------------------------------------------------------------------------------------------------------------+---------------------------------|
| 177.95.8.74 - upton5450 [28/Sep/2022:10:15:57 -0700] "HEAD /e-business/mindshare HTTP/1.0" 404 19927                        | ... -  [//::: -] " /-/ /."      |
| ************ - pouros8756 [28/Sep/2022:10:15:57 -0700] "GET /architectures/convergence/niches/mindshare HTTP/1.0" 100 28722 | ... -  [//::: -] " //// /."     |
| *************** - - [28/Sep/2022:10:15:57 -0700] "PATCH /strategize/out-of-the-box HTTP/1.0" 401 27439                      | ... - - [//::: -] " //--- /."   |
| ************** - - [28/Sep/2022:10:15:57 -0700] "POST /users HTTP/1.1" 301 9481                                             | ... - - [//::: -] " / /."       |
+-----------------------------------------------------------------------------------------------------------------------------+---------------------------------+

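Example 3: Extract log patterns with a custom regex pattern

This example shows how to extract punctuation from a raw log field using a user-defined pattern.

PPL query: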


os> source=apache | patterns new_field='no_numbers' pattern='[0-9]' message | fields message, no_numbers ;
fetched rows / total rows = 4/4
+-----------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| message                                                                                                                     | no_numbers                                                                           |
|-----------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------|
| 177.95.8.74 - upton5450 [28/Sep/2022:10:15:57 -0700] "HEAD /e-business/mindshare HTTP/1.0" 404 19927                        | ... - upton [/Sep/::: -] "HEAD /e-business/mindshare HTTP/."                         |
| 127.45.152.6 - pouros8756 [28/Sep/2022:10:15:57 -0700] "GET /architectures/convergence/niches/mindshare HTTP/1.0" 100 28722 | ... - pouros [/Sep/::: -] "GET /architectures/convergence/niches/mindshare HTTP/."   |
| *************** - - [28/Sep/2022:10:15:57 -0700] "PATCH /strategize/out-of-the-box HTTP/1.0" 401 27439                      | ... - - [/Sep/::: -] "PATCH /strategize/out-of-the-box HTTP/."                       |
| ************** - - [28/Sep/2022:10:15:57 -0700] "POST /users HTTP/1.1" 301 9481                                             | ... - - [/Sep/::: -] "POST /users HTTP/."                                            |
+-----------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+

Limitations

The patterns command has the same limitations as the parse command.

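rare command

Note

To see which Amazon data source integrations support this PPL command, see Commands.

Use the rare command to find the least common values of all fields in the field list.

Note

A maximum of 10 results is returned for each distinct tuple of values of the group-by fields.

Syntax

Use the following syntax:


rare [N] <field-list> [by-clause]
rare_approx [N] <field-list> [by-clause]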

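field-list

Required.
A comma-delimited list of field names.

by-clause

Optional.
One or more fields to group the results by.

N

The number of results to return.
Default: 10

rare_approx

The approximate count of the rare (n) fields, computed using cardinality estimated by the HyperLogLog++ algorithm.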

Example 1: Find the least common values in a field

This example finds the least common values of gender across all accounts.

PPL query:


os> source=accounts | rare gender;
os> source=accounts | rare_approx 10 gender;
os> source=accounts | rare_approx gender;
fetched rows / total rows = 2/2
+----------+
| gender   |
|----------|
| F        |
| M        |
+----------+

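Example 2: Find the least common values grouped by gender

This example finds the least common values of age across all accounts, grouped by gender.

PPL query: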


os> source=accounts | rare 5 age by gender;
os> source=accounts | rare_approx 5 age by gender;
fetched rows / total rows = 4/4
+----------+-------+
| gender   | age   |
|----------+-------|
| F        | 28    |
| M        | 32    |
| M        | 33    |
| M        | 36    |
+----------+-------+

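rename command

Use the rename command to change the names of one or more fields in the search result.

Note

To see which Amazon data source integrations support this PPL command, see Commands.

Syntax

Use the following syntax: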


rename <source-field> AS <target-field>["," <source-field> AS <target-field>]...

source-field

Required.
The name of the field you want to rename.

target-field

Required.
The name you want to rename the field to.

Example 1: Rename one field

This example shows how to rename a single field.

PPL query:


os> source=accounts | rename account_number as an | fields an;
fetched rows / total rows = 4/4
+------+
| an   |
|------|
| 1    |
| 6    |
| 13   |
| 18   |
+------+

Example 2: Rename multiple fields

This example shows how to rename multiple fields.

PPL query:


os> source=accounts | rename account_number as an, employer as emp | fields an, emp;
fetched rows / total rows = 4/4
+------+---------+
| an   | emp     |
|------+---------|
| 1    | Pyrami  |
| 6    | Netagy  |
| 13   | Quility |
| 18   | null    |
+------+---------+

Limitations

Overriding an existing field is not supported:


source=accounts | grok address '%{NUMBER} %{GREEDYDATA:address}' | fields address

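search command

Note

To see which Amazon data source integrations support this PPL command, see Commands.

Use the search command to retrieve documents from an index. You can use the search command only as the first command in a PPL query.

Syntax

Use the following syntax: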


search source=[<remote-cluster>:]<index> [boolean-expression]

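search

Optional.
The search keyword, which can be omitted.

index

Required.
The search command must specify which index to query from.
For cross-cluster search, the index name can be prefixed with <cluster name>:.

bool-expression

Optional.
Any expression that evaluates to a Boolean value.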

Example 1: Fetch all the data

This example shows how to fetch all documents from the accounts index.

PPL query:


os> source=accounts;
+------------------+-------------+----------------------+-----------+----------+--------+----------------+---------+-------+-----------------------+------------+
| account_number   | firstname   | address              | balance   | gender   | city   | employer       | state   | age   | email                 | lastname   |
|------------------+-------------+----------------------+-----------+----------+--------+----------------+---------+-------+-----------------------+------------|
| 1                | Jorge       | *** Any Lane         | 39225     | M        | Brogan | ExampleCorp    | IL      | 32    | jane_doe@example.com  | Souza      |
| 6                | John        | *** Example Street   | 5686      | M        | Dante  | AnyCorp        | TN      | 36    | john_doe@example.com  | Doe        |
| 13               | Jane        | *** Any Street       | *****     | F        | Nogal  | ExampleCompany | VA      | 28    | null                  | Doe        |
| 18               | Juan        | *** Example Court    | 4180      | M        | Orick  | null           | MD      | 33    | juan_li@example.org   | Li         |
+------------------+-------------+----------------------+-----------+----------+--------+----------------+---------+-------+-----------------------+------------+

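Example 2: Fetch data with a condition

This example shows how to fetch the documents from the accounts index that match a condition.

PPL query: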


os> SEARCH source=accounts account_number=1 or gender="F";
+------------------+-------------+--------------------+-----------+----------+--------+----------------+---------+-------+-------------------------+------------+
| account_number   | firstname   | address            | balance   | gender   | city   | employer       | state   | age   | email                   | lastname   |
|------------------+-------------+--------------------+-----------+----------+--------+----------------+---------+-------+-------------------------+------------|
| 1                | Jorge       | *** Any Lane       | *****     | M        | Brogan | ExampleCorp    | IL      | 32    | jorge_souza@example.com | Souza      |
| 13               | Jane        | *** Any Street     | *****     | F        | Nogal  | ExampleCompany | VA      | 28    | null                    | Doe        |
+------------------+-------------+--------------------+-----------+----------+--------+----------------+---------+-------+-------------------------+------------+

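sort command

Use the sort command to sort the search result by the specified fields.

Note

To see which Amazon data source integrations support this PPL command, see Commands.

Syntax

Use the following syntax: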


sort <[+|-] sort-field>...

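[+|-]

Optional.
The plus [+] stands for ascending order, with NULL/MISSING values first.
The minus [-] stands for descending order, with NULL/MISSING values last.
Default: Ascending order, with NULL/MISSING values first.

sort-field

Required.
The field used for sorting.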
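
The placement of NULL/MISSING values described above can be illustrated with a short Python sketch. The `ppl_sort` helper is hypothetical and treats `None` as a NULL/MISSING value. It only models the ordering rule stated here, not the command itself.

```python
def ppl_sort(values, descending=False):
    """Order values so that None sorts first when ascending and last when descending."""
    present = sorted((v for v in values if v is not None), reverse=descending)
    nulls = [v for v in values if v is None]
    return present + nulls if descending else nulls + present


ascending = ppl_sort([36, None, 28])
descending = ppl_sort([36, None, 28], descending=True)
```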

Example 1: Sort by one field

This example shows how to sort the documents by the age field in ascending order.

PPL query:


os> source=accounts | sort age | fields account_number, age;
fetched rows / total rows = 4/4
+------------------+-------+
| account_number   | age   |
|------------------+-------|
| 13               | 28    |
| 1                | 32    |
| 18               | 33    |
| 6                | 36    |
+------------------+-------+

Example 2: Sort by one field and return all results

This example shows how to sort the documents by the age field in ascending order.

PPL query:


os> source=accounts | sort age | fields account_number, age;
fetched rows / total rows = 4/4
+------------------+-------+
| account_number   | age   |
|------------------+-------|
| 13               | 28    |
| 1                | 32    |
| 18               | 33    |
| 6                | 36    |
+------------------+-------+

Example 3: Sort by one field in descending order

This example shows how to sort the documents by the age field in descending order.

PPL query:


os> source=accounts | sort - age | fields account_number, age;
fetched rows / total rows = 4/4
+------------------+-------+
| account_number   | age   |
|------------------+-------|
| 6                | 36    |
| 18               | 33    |
| 1                | 32    |
| 13               | 28    |
+------------------+-------+

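Example 4: Sort by multiple fields

This example shows how to sort the documents by the gender field in ascending order and the age field in descending order.

PPL query: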


os> source=accounts | sort + gender, - age | fields account_number, gender, age;
fetched rows / total rows = 4/4
+------------------+----------+-------+
| account_number   | gender   | age   |
|------------------+----------+-------|
| 13               | F        | 28    |
| 6                | M        | 36    |
| 18               | M        | 33    |
| 1                | M        | 32    |
+------------------+----------+-------+

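Example 5: Sort by a field that contains null values

This example shows how to sort the employer field with the default option (ascending order, null values first). The result shows that the null value is in the first row.

PPL query: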


os> source=accounts | sort employer | fields employer;
fetched rows / total rows = 4/4
+------------+
| employer   |
|------------|
| null       |
| AnyCompany |
| AnyCorp    |
| AnyOrgty   |
+------------+

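stats command

Use the stats command to calculate aggregations from the search result.

Note

To see which Amazon data source integrations support this PPL command, see Commands.

NULL/MISSING values handling

Function	NULL	MISSING
COUNT	Not counted	Not counted
SUM	Ignored	Ignored
AVG	Ignored	Ignored
MAX	Ignored	Ignored
MIN	Ignored	Ignored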
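
The effect of these rules can be sketched in Python. The sample values are hypothetical and `None` stands in for a NULL or MISSING value, so this only illustrates the table above rather than the command's implementation.

```python
ages = [32, None, 28, 36]  # hypothetical field values; None models NULL/MISSING

present = [a for a in ages if a is not None]

count_age = len(present)       # COUNT(age): NULL/MISSING values are not counted
sum_age = sum(present)         # SUM(age): NULL/MISSING values are ignored
avg_age = sum_age / count_age  # AVG(age): averaged over the present values only
max_age = max(present)         # MAX(age)
min_age = min(present)         # MIN(age)
```

Note that the average is taken over three values, not four, because the null value contributes to neither the sum nor the count.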

Syntax

Use the following syntax:


stats <aggregation>... [by-clause]

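aggregation

Required.
An aggregation function applied to a field.

by-clause

Optional.
Syntax: by [span-expression,] [field,]...
Specifies the fields and expressions used to group the aggregation results. You can use scalar functions, aggregation functions, and even span expressions to split specific fields into buckets of equal intervals.
Default: If no <by-clause> is specified, the stats command returns a single row that represents the aggregation over the entire result set.

span-expression

Optional, at most one.
Syntax: span(field_expr, interval_expr)
By default, the interval expression uses natural units. If the field is a date and time type and the interval is in date/time units, specify the unit in the interval expression.
For example, to split the age field into buckets of 10 years, use span(age, 10). To split a timestamp field into hourly intervals, use span(timestamp, 1h).

Available time units
Span interval units
millisecond (ms)
second (s)
minute (m, case sensitive)
hour (h)
day (d)
week (w)
month (M, case sensitive)
quarter (q)
year (y)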
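
For numeric fields, the bucketing performed by a span expression can be sketched in Python. The sketch assumes that buckets are aligned to multiples of the interval and labeled by their lower bound, which is consistent with the sample outputs on this page but is not confirmed as the implementation. The `span` helper is hypothetical, and the ages are the values of the accounts sample data.

```python
from collections import Counter


def span(value, interval):
    """Lower bound of the equal-width bucket that contains value."""
    return (value // interval) * interval


ages = [32, 36, 28, 33]

by_10 = Counter(span(a, 10) for a in ages)  # like: stats count(age) by span(age, 10)
by_5 = Counter(span(a, 5) for a in ages)    # like: stats count() by span(age, 5)
```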

Aggregation functions

`COUNT`

Returns the number of occurrences of expr in the rows retrieved by the query.

Example:


os> source=accounts | stats count();
fetched rows / total rows = 1/1
+-----------+
| count()   |
|-----------|
| 4         |
+-----------+

`SUM`

Use SUM(expr) to return the sum of expr.

Example


os> source=accounts | stats sum(age) by gender;
fetched rows / total rows = 2/2
+------------+----------+
| sum(age)   | gender   |
|------------+----------|
| 28         | F        |
| 101        | M        |
+------------+----------+

`AVG`

Use AVG(expr) to return the average value of expr.

Example


os> source=accounts | stats avg(age) by gender;
fetched rows / total rows = 2/2
+--------------------+----------+
| avg(age)           | gender   |
|--------------------+----------|
| 28.0               | F        |
| 33.666666666666664 | M        |
+--------------------+----------+

`MAX`

Use MAX(expr) to return the maximum value of expr.

Example


os> source=accounts | stats max(age);
fetched rows / total rows = 1/1
+------------+
| max(age)   |
|------------|
| 36         |
+------------+

`MIN`

Use MIN(expr) to return the minimum value of expr.

Example


os> source=accounts | stats min(age);
fetched rows / total rows = 1/1
+------------+
| min(age)   |
|------------|
| 28         |
+------------+

`STDDEV_SAMP`

Use STDDEV_SAMP(expr) to return the sample standard deviation of expr.

Example:


os> source=accounts | stats stddev_samp(age);
fetched rows / total rows = 1/1
+--------------------+
| stddev_samp(age)   |
|--------------------|
| 3.304037933599835  |
+--------------------+

`STDDEV_POP`

Use STDDEV_POP(expr) to return the population standard deviation of expr.

Example:


os> source=accounts | stats stddev_pop(age);
fetched rows / total rows = 1/1
+--------------------+
| stddev_pop(age)    |
|--------------------|
| 2.**************** |
+--------------------+
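
The difference between the two standard deviation functions can be checked with Python's standard `statistics` module. This is only an illustration of the two formulas, using the ages from the accounts sample data. It does not call PPL.

```python
import statistics

ages = [32, 36, 28, 33]

sample_sd = statistics.stdev(ages)       # sample standard deviation, divides by n - 1
population_sd = statistics.pstdev(ages)  # population standard deviation, divides by n
```

The sample standard deviation is always at least as large as the population standard deviation for the same data, because it divides the sum of squared deviations by n - 1 instead of n.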

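`TAKE`

Use TAKE(field [, size]) to return the original values of a field. The order of the values is not guaranteed.

field

Required.
The field must be a text field.

size

Optional integer.
The number of values to return.
The default is 10.

Example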


os> source=accounts | stats take(firstname);
fetched rows / total rows = 1/1
+-----------------------------+
| take(firstname)             |
|-----------------------------|
| [Jane, Mary, Nikki, Juan]   |
+-----------------------------+

`PERCENTILE` or `PERCENTILE_APPROX`

Use PERCENTILE(expr, percent) or PERCENTILE_APPROX(expr, percent) to return the approximate percentile value of expr at the specified percentage.

percent

The number must be a constant between 0 and 100.

Example


os> source=accounts | stats percentile(age, 90) by gender;
fetched rows / total rows = 2/2
+-----------------------+----------+
| percentile(age, 90)   | gender   |
|-----------------------+----------|
| 28                    | F        |
| 36                    | M        |
+-----------------------+----------+

Example 1: Calculate the count of events

This example shows how to calculate the count of events in the accounts.


os> source=accounts | stats count();
fetched rows / total rows = 1/1
+-----------+
| count()   |
|-----------|
| 4         |
+-----------+

Example 2: Calculate the average of a field

This example shows how to calculate the average age of all accounts.


os> source=accounts | stats avg(age);
fetched rows / total rows = 1/1
+------------+
| avg(age)   |
|------------|
| 32.25      |
+------------+

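Example 3: Calculate the average of a field by group

This example shows how to calculate the average age of all accounts, grouped by gender.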


os> source=accounts | stats avg(age) by gender;
fetched rows / total rows = 2/2
+--------------------+----------+
| avg(age)           | gender   |
|--------------------+----------|
| 28.0               | F        |
| 33.666666666666664 | M        |
+--------------------+----------+

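Example 4: Calculate the average, sum, and count of a field by group

This example shows how to calculate the average age, the sum of ages, and the count of events for all accounts, grouped by gender.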


os> source=accounts | stats avg(age), sum(age), count() by gender;
fetched rows / total rows = 2/2
+--------------------+------------+-----------+----------+
| avg(age)           | sum(age)   | count()   | gender   |
|--------------------+------------+-----------+----------|
| 28.0               | 28         | 1         | F        |
| 33.666666666666664 | 101        | 3         | M        |
+--------------------+------------+-----------+----------+

Example 5: Calculate the maximum of a field

This example calculates the maximum age of all accounts.


os> source=accounts | stats max(age);
fetched rows / total rows = 1/1
+------------+
| max(age)   |
|------------|
| 36         |
+------------+

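Example 6: Calculate the maximum and minimum of a field by group

This example calculates the maximum and minimum age of all accounts, grouped by gender.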


os> source=accounts | stats max(age), min(age) by gender;
fetched rows / total rows = 2/2
+------------+------------+----------+
| max(age)   | min(age)   | gender   |
|------------+------------+----------|
| 28         | 28         | F        |
| 36         | 32         | M        |
+------------+------------+----------+

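Example 7: Calculate the distinct count of a field

To get the count of distinct values of a field, use the DISTINCT_COUNT (or DC) function instead of COUNT. This example calculates both the count and the distinct count of the gender field across all accounts.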


os> source=accounts | stats count(gender), distinct_count(gender);
fetched rows / total rows = 1/1
+-----------------+--------------------------+
| count(gender)   | distinct_count(gender)   |
|-----------------+--------------------------|
| 4               | 2                        |
+-----------------+--------------------------+

Example 8: Calculate the count by a span

This example gets the count of age in intervals of 10 years.


os> source=accounts | stats count(age) by span(age, 10) as age_span
fetched rows / total rows = 2/2
+--------------+------------+
| count(age)   | age_span   |
|--------------+------------|
| 1            | 20         |
| 3            | 30         |
+--------------+------------+

Example 9: Calculate the count by gender and span

This example counts records grouped by gender and by age spans of 5 years.


os> source=accounts | stats count() as cnt by span(age, 5) as age_span, gender
fetched rows / total rows = 3/3
+-------+------------+----------+
| cnt   | age_span   | gender   |
|-------+------------+----------|
| 1     | 25         | F        |
| 2     | 30         | M        |
| 1     | 35         | M        |
+-------+------------+----------+

The span expression always appears as the first grouping key, regardless of the order specified in the command.


os> source=accounts | stats count() as cnt by gender, span(age, 5) as age_span
fetched rows / total rows = 3/3
+-------+------------+----------+
| cnt   | age_span   | gender   |
|-------+------------+----------|
| 1     | 25         | F        |
| 2     | 30         | M        |
| 1     | 35         | M        |
+-------+------------+----------+

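Example 10: Calculate the count and get an email list by gender and span

This example gets the count of age in intervals of 5 years, grouped by gender, and for each row gets a list of at most 5 emails.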


os> source=accounts | stats count() as cnt, take(email, 5) by span(age, 5) as age_span, gender
fetched rows / total rows = 3/3
+-------+----------------------------------------------------+------------+----------+
| cnt   | take(email, 5)                                     | age_span   | gender   |
|-------+----------------------------------------------------+------------+----------|
| 1     | []                                                 | 25         | F        |
| 2     | [janedoe@anycompany.com,juanli@examplecompany.org] | 30         | M        |
| 1     | [marymajor@examplecorp.com]                        | 35         | M        |
+-------+----------------------------------------------------+------------+----------+

Example 11: Calculate the percentile of a field

This example shows how to calculate the 90th percentile age of all accounts.


os> source=accounts | stats percentile(age, 90);
fetched rows / total rows = 1/1
+-----------------------+
| percentile(age, 90)   |
|-----------------------|
| 36                    |
+-----------------------+

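Example 12: Calculate the percentile of a field by group

This example shows how to calculate the 90th percentile age of all accounts, grouped by gender.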


os> source=accounts | stats percentile(age, 90) by gender;
fetched rows / total rows = 2/2
+-----------------------+----------+
| percentile(age, 90)   | gender   |
|-----------------------+----------|
| 28                    | F        |
| 36                    | M        |
+-----------------------+----------+

Example 13: Calculate the percentile by gender and span

This example gets the 90th percentile age in intervals of 10 years, grouped by gender.


os> source=accounts | stats percentile(age, 90) as p90 by span(age, 10) as age_span, gender
fetched rows / total rows = 2/2
+-------+------------+----------+
| p90   | age_span   | gender   |
|-------+------------+----------|
| 28    | 20         | F        |
| 36    | 30         | M        |
+-------+------------+----------+


- `source = table | stats avg(a) `
- `source = table | where a < 50 | stats avg(c) `
- `source = table | stats max(c) by b`
- `source = table | stats count(c) by b | head 5`
- `source = table | stats distinct_count(c)`
- `source = table | stats stddev_samp(c)`
- `source = table | stats stddev_pop(c)`
- `source = table | stats percentile(c, 90)`
- `source = table | stats percentile_approx(c, 99)`

Aggregations with span


- `source = table  | stats count(a) by span(a, 10) as a_span`
- `source = table  | stats sum(age) by span(age, 5) as age_span | head 2`
- `source = table  | stats avg(age) by span(age, 20) as age_span, country  | sort - age_span |  head 2`

Aggregations with time window span (tumbling window function)


- `source = table | stats sum(productsAmount) by span(transactionDate, 1d) as age_date | sort age_date`
- `source = table | stats sum(productsAmount) by span(transactionDate, 1w) as age_date, productId`

Aggregations grouped by multiple levels


- `source = table | stats avg(age) as avg_state_age by country, state | stats avg(avg_state_age) as avg_country_age by country`
- `source = table | stats avg(age) as avg_city_age by country, state, city | eval new_avg_city_age = avg_city_age - 1 | stats avg(new_avg_city_age) as avg_state_age by country, state | where avg_state_age > 18 | stats avg(avg_state_age) as avg_adult_country_age by country`

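subquery command

Note

To see which Amazon data source integrations support this PPL command, see Commands.

Use the subquery command to run complex, nested queries within your Piped Processing Language (PPL) statements.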


source=logs | where field in [ subquery source=events | where condition | fields field ]

In this example, the primary search (source=logs) is filtered by the results of the subquery (source=events).
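
Conceptually, an IN subquery behaves like a membership test against the set of values that the inner query returns. The following Python sketch illustrates that idea only. The sample rows and the `ok` flag, which stands in for the inner `where condition`, are hypothetical.

```python
logs = [{"id": 1, "field": "a"}, {"id": 2, "field": "b"}, {"id": 3, "field": "c"}]
events = [
    {"field": "a", "ok": True},
    {"field": "c", "ok": True},
    {"field": "b", "ok": False},
]

# inner query: source=events | where condition | fields field
allowed = {e["field"] for e in events if e["ok"]}

# outer query: source=logs | where field in [ ... ]
result = [row for row in logs if row["field"] in allowed]
```

Only the log rows whose `field` value appears in the inner result are kept.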

The subquery command supports multiple levels of nesting for complex data analysis.

Nested subquery example


source=logs | where id in [ subquery source=users | where user in [ subquery source=actions | where action="login" | fields user] | fields uid ]

InSubquery usage

source = outer | where a in [ source = inner | fields b ]
source = outer | where (a) in [ source = inner | fields b ]
source = outer | where (a,b,c) in [ source = inner | fields d,e,f ]
source = outer | where a not in [ source = inner | fields b ]
source = outer | where (a) not in [ source = inner | fields b ]
source = outer | where (a,b,c) not in [ source = inner | fields d,e,f ]
source = outer a in [ source = inner | fields b ] (search filtering with subquery)
source = outer a not in [ source = inner | fields b ] (search filtering with subquery)
source = outer | where a in [ source = inner1 | where b not in [ source = inner2 | fields c ] | fields b ] (nested)
source = table1 | inner join left = l right = r on l.a = r.a AND r.a in [ source = inner | fields d ] | fields l.a, r.a, b, c (as join filter)

SQL migration examples with IN-subquery PPL

TPC-H Q4 (in-subquery with aggregation)


select
  o_orderpriority,
  count(*) as order_count
from
  orders
where
  o_orderdate >= date '1993-07-01'
  and o_orderdate < date '1993-07-01' + interval '3' month
  and o_orderkey in (
    select
      l_orderkey
    from
      lineitem
    where l_commitdate < l_receiptdate
  )
group by
  o_orderpriority
order by
  o_orderpriority

Rewritten as a PPL InSubquery query:


source = orders
| where o_orderdate >= "1993-07-01" and o_orderdate < "1993-10-01" and o_orderkey IN
  [ source = lineitem
    | where l_commitdate < l_receiptdate
    | fields l_orderkey
  ]
| stats count(1) as order_count by o_orderpriority
| sort o_orderpriority
| fields o_orderpriority, order_count

TPC-H Q20 (nested in-subquery)


select
  s_name,
  s_address
from
  supplier,
  nation
where
  s_suppkey in (
    select
      ps_suppkey
    from
      partsupp
    where
      ps_partkey in (
        select
          p_partkey
        from
          part
        where
          p_name like 'forest%'
      )
  )
  and s_nationkey = n_nationkey
  and n_name = 'CANADA'
order by
  s_name

Rewritten as a PPL InSubquery query:


source = supplier
| where s_suppkey IN [
    source = partsupp
    | where ps_partkey IN [
        source = part
        | where like(p_name, "forest%")
        | fields p_partkey
      ]
    | fields ps_suppkey
  ]
| inner join left=l right=r on s_nationkey = n_nationkey and n_name = 'CANADA'
  nation
| sort s_name

ExistsSubquery usage

Assumptions: a, b are fields of table outer; c, d are fields of table inner; e, f are fields of table inner2.

source = outer | where exists [ source = inner | where a = c ]
source = outer | where not exists [ source = inner | where a = c ]
source = outer | where exists [ source = inner | where a = c and b = d ]
source = outer | where not exists [ source = inner | where a = c and b = d ]
source = outer exists [ source = inner | where a = c ] (search filtering with subquery)
source = outer not exists [ source = inner | where a = c ] (search filtering with subquery)
source = table as t1 exists [ source = table as t2 | where t1.a = t2.a ] (table aliases are useful in an exists subquery)
source = outer | where exists [ source = inner1 | where a = c and exists [ source = inner2 | where c = e ] ] (nested)
source = outer | where exists [ source = inner1 | where a = c | where exists [ source = inner2 | where c = e ] ] (nested)
source = outer | where exists [ source = inner | where c > 10 ] (uncorrelated exists)
source = outer | where not exists [ source = inner | where c > 10 ] (uncorrelated exists)
source = outer | where exists [ source = inner ] | eval l = "nonEmpty" | fields l (special uncorrelated exists)

ScalarSubquery usage

Assumptions: a, b are fields of table outer; c, d are fields of table inner; e, f are fields of table nested.

Uncorrelated scalar subquery

In Select:

source = outer | eval m = [ source = inner | stats max(c) ] | fields m, a
source = outer | eval m = [ source = inner | stats max(c) ] + b | fields m, a

In Where:

source = outer | where a > [ source = inner | stats min(c) ] | fields a

In search filter:

source = outer a > [ source = inner | stats min(c) ] | fields a

Nested scalar subquery

source = outer | where a = [ source = inner | stats max(c) | sort c ] OR b = [ source = inner | where c = 1 | stats min(d) | sort d ]
source = outer | where a = [ source = inner | where c = [ source = nested | stats max(e) by f | sort f ] | stats max(d) by c | sort c | head 1 ]

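(Relation) Subquery

InSubquery, ExistsSubquery, and ScalarSubquery are all subquery expressions. RelationSubquery, however, is not a subquery expression. It is a subquery plan that is commonly used in a Join or From clause.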

source = table1 | join left = l right = r [ source = table2 | where d > 10 | head 5 ] (subquery in the right join side)
source = [ source = table1 | join left = l right = r [ source = table2 | where d > 10 | head 5 ] | stats count(a) by b ] as outer | head 1

Additional context

InSubquery, ExistsSubquery, and ScalarSubquery are subquery expressions commonly used in where clauses and search filters.

Where command:


| where <boolean expression> | ...

Search filter:


search source=* <boolean expression> | ...

A subquery expression can be used in a Boolean expression:


| where orders.order_id in [ source=returns | where return_reason="damaged" | fields order_id ]

Here, orders.order_id in [ source=... ] is a <boolean expression>.

In general, this kind of subquery clause is called an InSubquery expression.

Subquery with different join types

Example using a ScalarSubquery:


source=employees
| join source=sales on employees.employee_id = sales.employee_id
| where sales.sale_amount > [ source=targets | where target_met="true" | fields target_value ]

Unlike InSubquery, ExistsSubquery, and ScalarSubquery, a RelationSubquery is not a subquery expression. Instead, it is a subquery plan.


SEARCH source=customer
| FIELDS c_custkey
| LEFT OUTER JOIN left = c, right = o ON c.c_custkey = o.o_custkey
   [
      SEARCH source=orders
      | WHERE o_comment NOT LIKE '%unusual%packages%'
      | FIELDS o_orderkey, o_custkey
   ]
| STATS ...

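top command

Note

To see which Amazon data source integrations support this PPL command, see Commands.

Use the top command to find the most common values of all fields in the field list.

Syntax

Use the following syntax:


top [N] <field-list> [by-clause]
top_approx [N] <field-list> [by-clause]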

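N

The number of results to return.
Default: 10

field-list

Required.
A comma-delimited list of field names.

by-clause

Optional.
One or more fields to group the results by.

top_approx

The approximate count of the top (n) fields, computed using cardinality estimated by the HyperLogLog++ algorithm.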

Example 1: Find the most common values in a field

This example finds the most common values of gender across all accounts.

PPL query:


os> source=accounts | top gender;
os> source=accounts | top_approx gender;
fetched rows / total rows = 2/2
+----------+
| gender   |
|----------|
| M        |
| F        |
+----------+

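Example 2: Find the most common values in a field (limited to 1)

This example finds the single most common value of gender across all accounts.

PPL query: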


os> source=accounts | top_approx 1 gender;
fetched rows / total rows = 1/1
+----------+
| gender   |
|----------|
| M        |
+----------+

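Example 3: Find the most common values grouped by gender

This example finds the most common value of age across all accounts, grouped by gender.

PPL query: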


os> source=accounts | top 1 age by gender;
os> source=accounts | top_approx 1 age by gender;
fetched rows / total rows = 2/2
+----------+-------+
| gender   | age   |
|----------+-------|
| F        | 28    |
| M        | 32    |
+----------+-------+

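trendline command

Note

To see which Amazon data source integrations support this PPL command, see Commands.

Use the trendline command to calculate moving averages of fields.

Syntax

Use the following syntax: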


TRENDLINE [sort <[+|-] sort-field>] SMA(number-of-datapoints, field) [AS alias] [SMA(number-of-datapoints, field) [AS alias]]...

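[+|-]

Optional.
The plus [+] stands for ascending order, with NULL/MISSING values first.
The minus [-] stands for descending order, with NULL/MISSING values last.
Default: Ascending order, with NULL/MISSING values first.

sort-field

Required when sorting is used.
The field used for sorting.

number-of-datapoints

Required.
The number of data points used to calculate the moving average.
Must be greater than 0.

field

Required.
The name of the field for which the moving average is calculated.

alias

Optional.
The name of the resulting column that contains the moving average.

Only the Simple Moving Average (SMA) type is supported. It is calculated as follows: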


f[i]: The value of field 'f' in the i-th data-point
n: The number of data-points in the moving window (period)
t: The current time index

SMA(t) = (1/n) * Σ(f[i]), where i = t-n+1 to t
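
The formula above can be sketched in a few lines of Python. The `sma` helper is hypothetical. It assumes that positions without a full window of n data points yield no value, which is represented here by `None` and which matches the NULL entries in the sample outputs on this page.

```python
def sma(values, n):
    """Simple moving average over a trailing window of n data points."""
    if n <= 0:
        raise ValueError("n must be greater than 0")
    out = []
    for t in range(len(values)):
        if t + 1 < n:
            out.append(None)  # not enough data points yet for a full window
        else:
            window = values[t - n + 1 : t + 1]  # i = t-n+1 .. t
            out.append(sum(window) / n)
    return out


trend = sma([12, 12, 13, 14, 15], 2)
```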

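Example 1: Calculate the simple moving average for a time series of temperatures

This example calculates the simple moving average over temperatures using two data points.

PPL query: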


os> source=t | trendline sma(2, temperature) as temp_trend;
fetched rows / total rows = 5/5
+-----------+---------+--------------------+----------+
|temperature|device-id|           timestamp|temp_trend|
+-----------+---------+--------------------+----------+
|         12|     1492|2023-04-06 17:07:...|      NULL|
|         12|     1492|2023-04-06 17:07:...|      12.0|
|         13|      256|2023-04-06 17:07:...|      12.5|
|         14|      257|2023-04-06 17:07:...|      13.5|
|         15|      258|2023-04-06 17:07:...|      14.5|
+-----------+---------+--------------------+----------+

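Example 2: Calculate simple moving averages for a time series of temperatures with sorting

This example calculates two simple moving averages over temperatures, using two and three data points, sorted in descending order by device-id.

PPL query: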


os> source=t | trendline sort - device-id sma(2, temperature) as temp_trend_2 sma(3, temperature) as temp_trend_3;
fetched rows / total rows = 5/5
+-----------+---------+--------------------+------------+------------------+
|temperature|device-id|           timestamp|temp_trend_2|      temp_trend_3|
+-----------+---------+--------------------+------------+------------------+
|         15|      258|2023-04-06 17:07:...|        NULL|              NULL|
|         14|      257|2023-04-06 17:07:...|        14.5|              NULL|
|         13|      256|2023-04-06 17:07:...|        13.5|              14.0|
|         12|     1492|2023-04-06 17:07:...|        12.5|              13.0|
|         12|     1492|2023-04-06 17:07:...|        12.0|12.333333333333334|
+-----------+---------+--------------------+------------+------------------+

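where command

Note

To see which Amazon data source integrations support this PPL command, see Commands.

The where command uses a bool-expression to filter the search result. It returns a result only when the bool-expression evaluates to true.

Syntax

Use the following syntax: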


where <boolean-expression>

bool-expression

Optional.
Any expression that evaluates to a Boolean value.

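Example 1: Filter the result set with a condition

This example shows how to fetch documents from the accounts index that meet specific conditions.

PPL query: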


os> source=accounts | where account_number=1 or gender="F" | fields account_number, gender;
fetched rows / total rows = 2/2
+------------------+----------+
| account_number   | gender   |
|------------------+----------|
| 1                | M        |
| 13               | F        |
+------------------+----------+

Additional examples

Filters with logical conditions

source = table | where c = 'test' AND a = 1 | fields a,b,c
source = table | where c != 'test' OR a > 1 | fields a,b,c | head 1
source = table | where c = 'test' NOT a > 1 | fields a,b,c
source = table | where a = 1 | fields a,b,c
source = table | where a >= 1 | fields a,b,c
source = table | where a < 1 | fields a,b,c
source = table | where b != 'test' | fields a,b,c
source = table | where c = 'test' | fields a,b,c | head 3
source = table | where ispresent(b)
source = table | where isnull(coalesce(a, b)) | fields a,b,c | head 3
source = table | where isempty(a)
source = table | where isblank(a)
source = table | where case(length(a) > 6, 'True' else 'False') = 'True'
source = table | where a between 1 and 4 - Note: This returns a >= 1 and a <= 4, i.e. [1, 4]
source = table | where b not between '2024-09-10' and '2025-09-10' - Note: This returns b < '2024-09-10' or b > '2025-09-10'
source = table | where cidrmatch(ip, '***********/24')
source = table | where cidrmatch(ipv6, '2003:db8::/32')


source = table | eval status_category =
    case(a >= 200 AND a < 300, 'Success',
    a >= 300 AND a < 400, 'Redirection',
    a >= 400 AND a < 500, 'Client Error',
    a >= 500, 'Server Error'
    else 'Incorrect HTTP status code')
    | where case(a >= 200 AND a < 300, 'Success',
    a >= 300 AND a < 400, 'Redirection',
    a >= 400 AND a < 500, 'Client Error',
    a >= 500, 'Server Error'
    else 'Incorrect HTTP status code'
    ) = 'Incorrect HTTP status code'
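To make the case-based filter easier to follow, here is a small Python sketch of the same classification logic. It is illustrative only: the function name `status_category` and the sample rows are assumptions, not part of PPL.

```python
def status_category(code: int) -> str:
    """Mirror the case(...) expression: the first matching branch wins."""
    if 200 <= code < 300:
        return "Success"
    if 300 <= code < 400:
        return "Redirection"
    if 400 <= code < 500:
        return "Client Error"
    if code >= 500:
        return "Server Error"
    return "Incorrect HTTP status code"  # the else branch


# Hypothetical sample rows with a single field named a.
rows = [{"a": 200}, {"a": 404}, {"a": 99}, {"a": 503}]

# Equivalent of: where case(...) = 'Incorrect HTTP status code'
kept = [r for r in rows if status_category(r["a"]) == "Incorrect HTTP status code"]
print(kept)
```

Only values that match none of the explicit branches (here, anything below 200) reach the else branch and survive the filter.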


source = table
    | eval factor = case(a > 15, a - 14, isnull(b), a - 7, a < 3, a + 1 else 1)
    | where case(factor = 2, 'even', factor = 4, 'even', factor = 6, 'even', factor = 8, 'even' else 'odd') = 'even'
    |  stats count() by factor

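fieldsummary command

Note

To see which Amazon data source integrations support this PPL command, see Commands.

Use the fieldsummary command to calculate basic statistics for each field (count, distinct count, min, max, avg, stddev, mean) and determine the data type of each field. This command can be used with any preceding pipe and takes it into account.

Syntax

Use the following syntax. For CloudWatch Logs use cases, only one field in a query is supported.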


... | fieldsummary <field-list> (nulls=true/false)

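includefields

List of all the columns to collect statistics for, combined into a unified result set.

nulls

Optional.
If set to true, include null values in the aggregation calculations (null is replaced with zero for numeric values).

Example 1

PPL query: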


os> source = t | where status_code != 200 | fieldsummary includefields= status_code nulls=true
+---------------+-------+----------------+-----+-----+-------+-------+-------------------+-------+--------+
| Fields        | COUNT | COUNT_DISTINCT | MIN | MAX | AVG   | MEAN  | STDDEV            | NUlls | TYPEOF |
|---------------+-------+----------------+-----+-----+-------+-------+-------------------+-------+--------|
| "status_code" | 2     | 2              | 301 | 403 | 352.0 | 352.0 | 72.12489168102785 | 0     | "int"  |
+---------------+-------+----------------+-----+-----+-------+-------+-------------------+-------+--------+

Example 2

PPL query:


os> source = t | fieldsummary includefields= id, status_code, request_path nulls=true
+----------------+-------+----------------+--------+-------+-------+-------+--------------------+-------+----------+
| Fields         | COUNT | COUNT_DISTINCT | MIN    | MAX   | AVG   | MEAN  | STDDEV             | NUlls | TYPEOF   |
|----------------+-------+----------------+--------+-------+-------+-------+--------------------+-------+----------|
| "id"           | 6     | 6              | 1      | 6     | 3.5   | 3.5   | 1.8708286933869707 | 0     | "int"    |
| "status_code"  | 4     | 3              | 200    | 403   | 184.0 | 184.0 | 161.16699413961905 | 2     | "int"    |
| "request_path" | 2     | 2              | /about | /home | 0.0   | 0.0   | 0                  | 2     | "string" |
+----------------+-------+----------------+--------+-------+-------+-------+--------------------+-------+----------+
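The summary columns can be reproduced with ordinary descriptive statistics. The sketch below uses Python's standard library and assumes the id field holds the values 1 through 6, which is consistent with the COUNT, MIN, MAX, and AVG shown for "id". It also suggests that the STDDEV figures correspond to the sample standard deviation; that is an inference from the numbers, not a documented guarantee.

```python
import statistics

# Assumed raw values for the "id" field (consistent with the summary row).
ids = [1, 2, 3, 4, 5, 6]

summary = {
    "COUNT": len(ids),
    "COUNT_DISTINCT": len(set(ids)),
    "MIN": min(ids),
    "MAX": max(ids),
    "AVG": statistics.mean(ids),
    "STDDEV": statistics.stdev(ids),  # sample standard deviation (n - 1)
}
print(summary)

# The two non-null status codes from Example 1 (MIN and MAX, COUNT = 2).
status_stddev = statistics.stdev([301, 403])
print(status_stddev)
```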

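expand command

Note

To see which Amazon data source integrations support this PPL command, see Commands.

Use the expand command to flatten a field of type Array<Any> or Map<Any>, producing an individual row for each element or key-value pair.

Syntax

Use the following syntax: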


expand <field> [As alias]

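field

The field to be expanded (exploded).
The field must be of a supported type.

alias

Optional.
The name to use instead of the original field name.

Usage guidelines

The expand command produces a row for each element in the specified array or map field, where:

Array elements become individual rows.
Map key-value pairs are broken into separate rows, with each key-value pair represented as a row.
When an alias is provided, the exploded values are represented under the alias instead of the original field name.

You can use this command in combination with other commands, such as stats, eval, and parse, to manipulate or extract data after the expansion.

Examples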

source = table | expand employee | stats max(salary) as max by state, company
source = table | expand employee as worker | stats max(salary) as max by state, company
source = table | expand employee as worker | eval bonus = salary * 3 | fields worker, bonus
source = table | expand employee | parse description '(?<email>.+@.+)' | fields employee, email
source = table | eval array=json_array(1, 2, 3) | expand array as uid | fields name, occupation, uid
source = table | expand multi_valueA as multiA | expand multi_valueB as multiB

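You can use the expand command in combination with other commands such as eval and stats. Using multiple expand commands creates a Cartesian product of all the internal elements within each composite array or map.

Effective SQL push-down query

The expand command is translated into an equivalent SQL operation using LATERAL VIEW explode, allowing arrays or maps to be exploded efficiently at the SQL query level.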


SELECT customer, exploded_productId
FROM table
LATERAL VIEW explode(productId) AS exploded_productId

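The explode command offers the following functionality:

It is a column operation that returns a new column.
It creates a new row for every element in the exploded column.
Internal nulls are ignored as part of the exploded field (no row is created or exploded for null).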
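The row-multiplying behavior described in this section can be sketched in plain Python. This is a simplified model rather than the actual engine: the `expand` helper is hypothetical, rows are modeled as dictionaries, map entries become (key, value) tuples, and null elements are skipped as this section states.

```python
from typing import Any, Dict, List, Optional

Row = Dict[str, Any]


def expand(rows: List[Row], field: str, alias: Optional[str] = None) -> List[Row]:
    """Emit one output row per element of rows[i][field]."""
    name = alias or field
    out: List[Row] = []
    for row in rows:
        value = row.get(field)
        # Map fields contribute one (key, value) pair per entry.
        elements = list(value.items()) if isinstance(value, dict) else list(value or [])
        for element in elements:
            if element is None:
                continue  # nulls inside the collection produce no row
            new_row = {k: v for k, v in row.items() if k != field}
            new_row[name] = element  # exposed under the alias when given
            out.append(new_row)
    return out


rows = [{"name": "a", "multi_valueA": [1, 2], "multi_valueB": ["x", "y", None]}]

# Equivalent of: expand multi_valueA as multiA | expand multi_valueB as multiB
step1 = expand(rows, "multi_valueA", "multiA")
step2 = expand(step1, "multi_valueB", "multiB")
pairs = {(r["multiA"], r["multiB"]) for r in step2}
print(sorted(pairs))
```

Chaining two expansions yields every combination of the inner elements, which is the Cartesian product behavior noted for multiple expand commands.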

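PPL functions

PPL condition functions

Note

To see which Amazon data source integrations support this PPL function, see Functions.

ISNULL

Description: isnull(field) returns true if the field is null.

Argument type:

Any supported data type.

Return type:

BOOLEAN

Example: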


os> source=accounts | eval result = isnull(employer) | fields result, employer, firstname
fetched rows / total rows = 4/4
+----------+-------------+-------------+
| result   | employer    | firstname   |
|----------+-------------+-------------|
| False    | AnyCompany  | Mary        |
| False    | ExampleCorp | Jane        |
| False    | ExampleOrg  | Nikki       |
| True     | null        | Juan        |
+----------+-------------+-------------+

ISNOTNULL

Description: isnotnull(field) returns true if the field is not null.

Argument type:

Any supported data type.

Return type:

BOOLEAN

Example:


os> source=accounts | where not isnotnull(employer) | fields account_number, employer
fetched rows / total rows = 1/1
+------------------+------------+
| account_number   | employer   |
|------------------+------------|
| 18               | null       |
+------------------+------------+

EXISTS

Example:


os> source=accounts | where exists(email) | fields account_number, email
fetched rows / total rows = 1/1

IFNULL

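Description: ifnull(field1, field2) returns field2 if field1 is null.

Argument type:

Any supported data type.
If the two parameters have different types, the function fails the semantic check.

Return type:

Any

Example: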


os> source=accounts | eval result = ifnull(employer, 'default') | fields result, employer, firstname
fetched rows / total rows = 4/4
+------------+------------+-------------+
| result     | employer   | firstname   |
|------------+------------+-------------|
| AnyCompany | AnyCompany | Mary        |
| ExampleCorp| ExampleCorp| Jane        |
| ExampleOrg | ExampleOrg | Nikki       |
| default    | null       | Juan        |
+------------+------------+-------------+

NULLIF

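Description: nullif(field1, field2) returns null if the two parameters are the same; otherwise it returns field1.

Argument type:

Any supported data type.
If the two parameters have different types, the function fails the semantic check.

Return type:

Any

Example: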


os> source=accounts | eval result = nullif(employer, 'AnyCompany') | fields result, employer, firstname
fetched rows / total rows = 4/4
+----------------+----------------+-------------+
| result         | employer       | firstname   |
|----------------+----------------+-------------|
| null           | AnyCompany     | Mary        |
| ExampleCorp    | ExampleCorp    | Jane        |
| ExampleOrg     | ExampleOrg     | Nikki       |
| null           | null           | Juan        |
+----------------+----------------+-------------+

IF

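Description: if(condition, expr1, expr2) returns expr1 if the condition is true; otherwise it returns expr2.

Argument type:

Any supported data type.
If the two parameters have different types, the function fails the semantic check.

Return type:

Any

Example: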


os> source=accounts | eval result = if(true, firstname, lastname) | fields result, firstname, lastname
fetched rows / total rows = 4/4
+----------+-------------+------------+
| result   | firstname   | lastname   |
|----------+-------------+------------|
| Jane     | Jane        | Doe        |
| Mary     | Mary        | Major      |
| Pat      | Pat         | Candella   |
| Jorge    | Jorge       | Souza      |
+----------+-------------+------------+

os> source=accounts | eval result = if(false, firstname, lastname) | fields result, firstname, lastname
fetched rows / total rows = 4/4
+----------+-------------+------------+
| result   | firstname   | lastname   |
|----------+-------------+------------|
| Doe      | Jane        | Doe        |
| Major    | Mary        | Major      |
| Candella | Pat         | Candella   |
| Souza    | Jorge       | Souza      |
+----------+-------------+------------+

os> source=accounts | eval is_vip = if(age > 30 AND isnotnull(employer), true, false) | fields is_vip, firstname, lastname
fetched rows / total rows = 4/4
+----------+-------------+------------+
| is_vip   | firstname   | lastname   |
|----------+-------------+------------|
| True     | Jane        | Doe        |
| True     | Mary        | Major      |
| False    | Pat         | Candella   |
| False    | Jorge       | Souza      |
+----------+-------------+------------+
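As a quick cross-check of the null-handling functions, the following Python sketch models ISNULL, IFNULL, and NULLIF with `None` standing in for null. The helper names mirror the PPL functions, but this is only an illustration of the documented semantics, using the employer values from the example tables.

```python
from typing import Any, List, Optional


def isnull(value: Any) -> bool:
    return value is None


def ifnull(field1: Any, field2: Any) -> Any:
    # Returns field2 only when field1 is null.
    return field2 if field1 is None else field1


def nullif(field1: Any, field2: Any) -> Any:
    # Returns null when both arguments are equal, otherwise field1.
    return None if field1 == field2 else field1


employers: List[Optional[str]] = ["AnyCompany", "ExampleCorp", "ExampleOrg", None]

isnull_col = [isnull(e) for e in employers]
ifnull_col = [ifnull(e, "default") for e in employers]
nullif_col = [nullif(e, "AnyCompany") for e in employers]
print(isnull_col, ifnull_col, nullif_col)
```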

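PPL cryptographic hash functions

Note

To see which Amazon data source integrations support this PPL function, see Functions.

MD5

MD5 calculates the MD5 digest and returns the value as a 32-character hex string.

Usage: md5('hello')

Argument type:

STRING

Return type:

STRING

Example: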


os> source=people | eval `MD5('hello')` = MD5('hello') | fields `MD5('hello')`
fetched rows / total rows = 1/1
+----------------------------------+
| MD5('hello')                     |
|----------------------------------|
| <32 character hex string>        |
+----------------------------------+

SHA1

SHA1 returns the hex string result of SHA-1.

Usage: sha1('hello')

Argument type:

STRING

Return type:

STRING

Example:


os> source=people | eval `SHA1('hello')` = SHA1('hello') | fields `SHA1('hello')`
fetched rows / total rows = 1/1
+------------------------------------------+
| SHA1('hello')                            |
|------------------------------------------|
| <40-character SHA-1 hash result>         |
+------------------------------------------+

SHA2

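SHA2 returns the hex string result of the SHA-2 family of hash functions (SHA-224, SHA-256, SHA-384, and SHA-512). The numBits argument indicates the desired bit length of the result, which must be 224, 256, 384, or 512.

Usage:

sha2('hello',256)
sha2('hello',512)

Argument type:

STRING, INTEGER

Return type:

STRING

Example: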


os> source=people | eval `SHA2('hello',256)` = SHA2('hello',256) | fields `SHA2('hello',256)`
fetched rows / total rows = 1/1
+------------------------------------------------------------------+
| SHA2('hello',256)                                                |
|------------------------------------------------------------------|
| <64-character SHA-256 hash result>                               |
+------------------------------------------------------------------+

os> source=people | eval `SHA2('hello',512)` = SHA2('hello',512) | fields `SHA2('hello',512)`
fetched rows / total rows = 1/1
+------------------------------------------------------------------+
| SHA2('hello',512)                                                |
|------------------------------------------------------------------|
| <128-character SHA-512 hash result>                              |
+------------------------------------------------------------------+
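The digest lengths quoted in this section (32, 40, 64, and 128 hex characters) can be confirmed with Python's `hashlib`. This sketch assumes the input string is encoded as UTF-8 before hashing; how PPL encodes strings internally is not stated here.

```python
import hashlib

data = "hello".encode("utf-8")  # assumption: UTF-8 input encoding

digests = {
    "MD5": hashlib.md5(data).hexdigest(),
    "SHA1": hashlib.sha1(data).hexdigest(),
    "SHA2-256": hashlib.sha256(data).hexdigest(),
    "SHA2-512": hashlib.sha512(data).hexdigest(),
}

# Print each algorithm with the length of its hex digest.
for name, digest in digests.items():
    print(name, len(digest), digest)
```

Each hex character encodes 4 bits, so a numBits value of 256 gives 64 characters and 512 gives 128.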

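PPL date and time functions

Note

To see which Amazon data source integrations support this PPL function, see Functions.

`DAY`

Usage: DAY(date) extracts the day of the month for a date, in the range 1 to 31.

Argument type: STRING/DATE/TIMESTAMP

Return type: INTEGER

Synonyms: DAYOFMONTH, DAY_OF_MONTH

Example: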


os> source=people | eval `DAY(DATE('2020-08-26'))` = DAY(DATE('2020-08-26')) | fields `DAY(DATE('2020-08-26'))`
fetched rows / total rows = 1/1
+---------------------------+
| DAY(DATE('2020-08-26'))   |
|---------------------------|
| 26                        |
+---------------------------+

`DAYOFMONTH`

Usage: DAYOFMONTH(date) extracts the day of the month for a date, in the range 1 to 31.

Argument type: STRING/DATE/TIMESTAMP

Return type: INTEGER

Synonyms: DAY, DAY_OF_MONTH

Example:


os> source=people | eval `DAYOFMONTH(DATE('2020-08-26'))` = DAYOFMONTH(DATE('2020-08-26')) | fields `DAYOFMONTH(DATE('2020-08-26'))`
fetched rows / total rows = 1/1
+----------------------------------+
| DAYOFMONTH(DATE('2020-08-26'))   |
|----------------------------------|
| 26                               |
+----------------------------------+

`DAY_OF_MONTH`

Usage: DAY_OF_MONTH(DATE) extracts the day of the month for a date, in the range 1 to 31.

Argument type: STRING/DATE/TIMESTAMP

Return type: INTEGER

Synonyms: DAY, DAYOFMONTH

Example:


os> source=people | eval `DAY_OF_MONTH(DATE('2020-08-26'))` = DAY_OF_MONTH(DATE('2020-08-26')) | fields `DAY_OF_MONTH(DATE('2020-08-26'))`
fetched rows / total rows = 1/1
+------------------------------------+
| DAY_OF_MONTH(DATE('2020-08-26'))   |
|------------------------------------|
| 26                                 |
+------------------------------------+

`DAYOFWEEK`

Usage: DAYOFWEEK(DATE) returns the weekday index for a date (1 = Sunday, 2 = Monday, ..., 7 = Saturday).

Argument type: STRING/DATE/TIMESTAMP

Return type: INTEGER

Synonyms: DAY_OF_WEEK

Example:


os> source=people | eval `DAYOFWEEK(DATE('2020-08-26'))` = DAYOFWEEK(DATE('2020-08-26')) | fields `DAYOFWEEK(DATE('2020-08-26'))`
fetched rows / total rows = 1/1
+---------------------------------+
| DAYOFWEEK(DATE('2020-08-26'))   |
|---------------------------------|
| 4                               |
+---------------------------------+

`DAY_OF_WEEK`

Usage: DAY_OF_WEEK(DATE) returns the weekday index for a date (1 = Sunday, 2 = Monday, ..., 7 = Saturday).

Argument type: STRING/DATE/TIMESTAMP

Return type: INTEGER

Synonyms: DAYOFWEEK

Example:


os> source=people | eval `DAY_OF_WEEK(DATE('2020-08-26'))` = DAY_OF_WEEK(DATE('2020-08-26')) | fields `DAY_OF_WEEK(DATE('2020-08-26'))`
fetched rows / total rows = 1/1
+-----------------------------------+
| DAY_OF_WEEK(DATE('2020-08-26'))   |
|-----------------------------------|
| 4                                 |
+-----------------------------------+

`DAYOFYEAR`

Usage: DAYOFYEAR(DATE) returns the day of the year for a date, in the range 1 to 366.

Argument type: STRING/DATE/TIMESTAMP

Return type: INTEGER

Synonyms: DAY_OF_YEAR

Example:


os> source=people | eval `DAYOFYEAR(DATE('2020-08-26'))` = DAYOFYEAR(DATE('2020-08-26')) | fields `DAYOFYEAR(DATE('2020-08-26'))`
fetched rows / total rows = 1/1
+---------------------------------+
| DAYOFYEAR(DATE('2020-08-26'))   |
|---------------------------------|
| 239                             |
+---------------------------------+

`DAY_OF_YEAR`

Usage: DAY_OF_YEAR(DATE) returns the day of the year for a date, in the range 1 to 366.

Argument type: STRING/DATE/TIMESTAMP

Return type: INTEGER

Synonyms: DAYOFYEAR

Example:


os> source=people | eval `DAY_OF_YEAR(DATE('2020-08-26'))` = DAY_OF_YEAR(DATE('2020-08-26')) | fields `DAY_OF_YEAR(DATE('2020-08-26'))`
fetched rows / total rows = 1/1
+-----------------------------------+
| DAY_OF_YEAR(DATE('2020-08-26'))   |
|-----------------------------------|
| 239                               |
+-----------------------------------+

`DAYNAME`

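Usage: DAYNAME(DATE) returns the name of the weekday for a date: Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, or Sunday.

Argument type: STRING/DATE/TIMESTAMP

Return type: STRING

Example: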


os> source=people | eval `DAYNAME(DATE('2020-08-26'))` = DAYNAME(DATE('2020-08-26')) | fields `DAYNAME(DATE('2020-08-26'))`
fetched rows / total rows = 1/1
+-------------------------------+
| DAYNAME(DATE('2020-08-26'))   |
|-------------------------------|
| Wednesday                     |
+-------------------------------+

`FROM_UNIXTIME`

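Usage: FROM_UNIXTIME returns a representation of the argument given as a timestamp or character string value. This function performs the reverse conversion of the UNIX_TIMESTAMP function.

If you provide a second argument, FROM_UNIXTIME uses it to format the result similarly to the DATE_FORMAT function.

If the timestamp is outside the range 1970-01-01 00:00:00 to 3001-01-18 23:59:59.999999 (0 to 32536771199.999999 epoch time), the function returns NULL.

Argument type: DOUBLE, STRING

Return type map:

DOUBLE -> TIMESTAMP

DOUBLE, STRING -> STRING

Examples: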


os> source=people | eval `FROM_UNIXTIME(1220249547)` = FROM_UNIXTIME(1220249547) | fields `FROM_UNIXTIME(1220249547)`
fetched rows / total rows = 1/1
+-----------------------------+
| FROM_UNIXTIME(1220249547)   |
|-----------------------------|
| 2008-09-01 06:12:27         |
+-----------------------------+

os> source=people | eval `FROM_UNIXTIME(1220249547, 'HH:mm:ss')` = FROM_UNIXTIME(1220249547, 'HH:mm:ss') | fields `FROM_UNIXTIME(1220249547, 'HH:mm:ss')`
fetched rows / total rows = 1/1
+-----------------------------------------+
| FROM_UNIXTIME(1220249547, 'HH:mm:ss')   |
|-----------------------------------------|
| 06:12:27                                |
+-----------------------------------------+
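The two FROM_UNIXTIME results can be reproduced with Python's `datetime`, as sketched below. This assumes the epoch value is interpreted in UTC, which matches the example output, and uses `strftime` patterns as a stand-in for the PPL format string 'HH:mm:ss'.

```python
from datetime import datetime, timezone

epoch_seconds = 1220249547

# Interpret the epoch value in UTC (an assumption that matches the example output).
ts = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)

full = ts.strftime("%Y-%m-%d %H:%M:%S")  # one-argument form
time_only = ts.strftime("%H:%M:%S")      # two-argument form with 'HH:mm:ss'
print(full, time_only)
```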

`HOUR`

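Usage: HOUR(TIME) extracts the hour value for a time.

Unlike a standard time of day, the time value in this function can have a range larger than 23. As a result, the return value of HOUR(TIME) can be greater than 23.

Argument type: STRING/TIME/TIMESTAMP

Return type: INTEGER

Synonyms: HOUR_OF_DAY

Example: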


os> source=people | eval `HOUR(TIME('01:02:03'))` = HOUR(TIME('01:02:03')) | fields `HOUR(TIME('01:02:03'))`
fetched rows / total rows = 1/1
+--------------------------+
| HOUR(TIME('01:02:03'))   |
|--------------------------|
| 1                        |
+--------------------------+

`HOUR_OF_DAY`

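Usage: HOUR_OF_DAY(TIME) extracts the hour value from the given time.

Unlike a standard time of day, the time value in this function can have a range larger than 23. As a result, the return value of HOUR_OF_DAY(TIME) can be greater than 23.

Argument type: STRING/TIME/TIMESTAMP

Return type: INTEGER

Synonyms: HOUR

Example: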


os> source=people | eval `HOUR_OF_DAY(TIME('01:02:03'))` = HOUR_OF_DAY(TIME('01:02:03')) | fields `HOUR_OF_DAY(TIME('01:02:03'))`
fetched rows / total rows = 1/1
+---------------------------------+
| HOUR_OF_DAY(TIME('01:02:03'))   |
|---------------------------------|
| 1                               |
+---------------------------------+

`LAST_DAY`

Usage: LAST_DAY returns the last day of the month as a DATE value for the given date argument.

Argument type: DATE/STRING/TIMESTAMP/TIME

Return type: DATE

Example:


os> source=people | eval `last_day('2023-02-06')` = last_day('2023-02-06') | fields `last_day('2023-02-06')`
fetched rows / total rows = 1/1
+--------------------------+
| last_day('2023-02-06')   |
|--------------------------|
| 2023-02-28               |
+--------------------------+
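For readers who want to verify LAST_DAY results independently, the month length can be looked up with Python's `calendar` module. The `last_day` helper below is a hypothetical equivalent for DATE inputs only, not the PPL implementation.

```python
import calendar
from datetime import date


def last_day(d: date) -> date:
    """Return the last calendar day of the month containing d."""
    days_in_month = calendar.monthrange(d.year, d.month)[1]
    return d.replace(day=days_in_month)


print(last_day(date(2023, 2, 6)))  # the date from the example above
print(last_day(date(2020, 2, 6)))  # a leap-year February for comparison
```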

`LOCALTIMESTAMP`

Usage: LOCALTIMESTAMP() is a synonym for NOW().

Example:


> source=people | eval `LOCALTIMESTAMP()` = LOCALTIMESTAMP() | fields `LOCALTIMESTAMP()`
fetched rows / total rows = 1/1
+---------------------+
| LOCALTIMESTAMP()    |
|---------------------|
| 2022-08-02 15:54:19 |
+---------------------+

`LOCALTIME`

Usage: LOCALTIME() is a synonym for NOW().

Example:


> source=people | eval `LOCALTIME()` = LOCALTIME() | fields `LOCALTIME()`
fetched rows / total rows = 1/1
+---------------------+
| LOCALTIME()         |
|---------------------|
| 2022-08-02 15:54:19 |
+---------------------+

`MAKE_DATE`

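Usage: MAKE_DATE returns a date value based on the given year, month, and day values. All arguments are rounded to integers.

Specifications: 1. MAKE_DATE(INTEGER, INTEGER, INTEGER) -> DATE

Argument type: INTEGER, INTEGER, INTEGER

Return type: DATE

Example: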


os> source=people | eval `MAKE_DATE(1945, 5, 9)` = MAKE_DATE(1945, 5, 9) | fields `MAKE_DATE(1945, 5, 9)`
fetched rows / total rows = 1/1
+-------------------------+
| MAKE_DATE(1945, 5, 9)   |
|-------------------------|
| 1945-05-09              |
+-------------------------+

`MINUTE`

Usage: MINUTE(TIME) returns the minute component of the given time as an integer, in the range 0 to 59.

Argument type: STRING/TIME/TIMESTAMP

Return type: INTEGER

Synonyms: MINUTE_OF_HOUR

Example:


os> source=people | eval `MINUTE(TIME('01:02:03'))` =  MINUTE(TIME('01:02:03')) | fields `MINUTE(TIME('01:02:03'))`
fetched rows / total rows = 1/1
+----------------------------+
| MINUTE(TIME('01:02:03'))   |
|----------------------------|
| 2                          |
+----------------------------+

`MINUTE_OF_HOUR`

Usage: MINUTE_OF_HOUR(TIME) returns the minute component of the given time as an integer, in the range 0 to 59.

Argument type: STRING/TIME/TIMESTAMP

Return type: INTEGER

Synonyms: MINUTE

Example:


os> source=people | eval `MINUTE_OF_HOUR(TIME('01:02:03'))` =  MINUTE_OF_HOUR(TIME('01:02:03')) | fields `MINUTE_OF_HOUR(TIME('01:02:03'))`
fetched rows / total rows = 1/1
+------------------------------------+
| MINUTE_OF_HOUR(TIME('01:02:03'))   |
|------------------------------------|
| 2                                  |
+------------------------------------+

`MONTH`

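Usage: MONTH(DATE) returns the month of the given date as an integer, in the range 1 to 12 (where 1 represents January and 12 represents December).

Argument type: STRING/DATE/TIMESTAMP

Return type: INTEGER

Synonyms: MONTH_OF_YEAR

Example: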


os> source=people | eval `MONTH(DATE('2020-08-26'))` =  MONTH(DATE('2020-08-26')) | fields `MONTH(DATE('2020-08-26'))`
fetched rows / total rows = 1/1
+-----------------------------+
| MONTH(DATE('2020-08-26'))   |
|-----------------------------|
| 8                           |
+-----------------------------+

`MONTHNAME`

Usage: MONTHNAME(DATE) returns the full name of the month for the given date, such as August.

Argument type: STRING/DATE/TIMESTAMP

Return type: STRING

Example:


os> source=people | eval `MONTHNAME(DATE('2020-08-26'))` = MONTHNAME(DATE('2020-08-26')) | fields `MONTHNAME(DATE('2020-08-26'))`
fetched rows / total rows = 1/1
+---------------------------------+
| MONTHNAME(DATE('2020-08-26'))   |
|---------------------------------|
| August                          |
+---------------------------------+

`MONTH_OF_YEAR`

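Usage: MONTH_OF_YEAR(DATE) returns the month of the given date as an integer, in the range 1 to 12 (where 1 represents January and 12 represents December).

Argument type: STRING/DATE/TIMESTAMP

Return type: INTEGER

Synonyms: MONTH

Example: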


os> source=people | eval `MONTH_OF_YEAR(DATE('2020-08-26'))` =  MONTH_OF_YEAR(DATE('2020-08-26')) | fields `MONTH_OF_YEAR(DATE('2020-08-26'))`
fetched rows / total rows = 1/1
+-------------------------------------+
| MONTH_OF_YEAR(DATE('2020-08-26'))   |
|-------------------------------------|
| 8                                   |
+-------------------------------------+

`NOW`

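Usage: NOW returns the current date and time as a TIMESTAMP value in the 'YYYY-MM-DD hh:mm:ss' format. The value is expressed in the cluster time zone.

Note

NOW() returns a constant time that indicates when the statement began to execute. This differs from SYSDATE(), which returns the exact time of execution.

Return type: TIMESTAMP

Specification: NOW() -> TIMESTAMP

Example: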


os> source=people | eval `value_1` = NOW(), `value_2` = NOW() | fields `value_1`, `value_2`
fetched rows / total rows = 1/1
+---------------------+---------------------+
| value_1             | value_2             |
|---------------------+---------------------|
| 2022-08-02 15:39:05 | 2022-08-02 15:39:05 |
+---------------------+---------------------+

`QUARTER`

Usage: QUARTER(DATE) returns the quarter of the year for the given date as an integer, in the range 1 to 4.

Argument type: STRING/DATE/TIMESTAMP

Return type: INTEGER

Example:


os> source=people | eval `QUARTER(DATE('2020-08-26'))` = QUARTER(DATE('2020-08-26')) | fields `QUARTER(DATE('2020-08-26'))`
fetched rows / total rows = 1/1
+-------------------------------+
| QUARTER(DATE('2020-08-26'))   |
|-------------------------------|
| 3                             |
+-------------------------------+

`SECOND`

Usage: SECOND(TIME) returns the second component of the given time as an integer, in the range 0 to 59.

Argument type: STRING/TIME/TIMESTAMP

Return type: INTEGER

Synonyms: SECOND_OF_MINUTE

Example:


os> source=people | eval `SECOND(TIME('01:02:03'))` = SECOND(TIME('01:02:03')) | fields `SECOND(TIME('01:02:03'))`
fetched rows / total rows = 1/1
+----------------------------+
| SECOND(TIME('01:02:03'))   |
|----------------------------|
| 3                          |
+----------------------------+

`SECOND_OF_MINUTE`

Usage: SECOND_OF_MINUTE(TIME) returns the second component of the given time as an integer, in the range 0 to 59.

Argument type: STRING/TIME/TIMESTAMP

Return type: INTEGER

Synonyms: SECOND

Example:


os> source=people | eval `SECOND_OF_MINUTE(TIME('01:02:03'))` = SECOND_OF_MINUTE(TIME('01:02:03')) | fields `SECOND_OF_MINUTE(TIME('01:02:03'))`
fetched rows / total rows = 1/1
+--------------------------------------+
| SECOND_OF_MINUTE(TIME('01:02:03'))   |
|--------------------------------------|
| 3                                    |
+--------------------------------------+

`SUBDATE`

Usage: SUBDATE(DATE, DAYS) subtracts the second argument (such as DATE or DAYS) from the given date.

Argument type: DATE/TIMESTAMP, LONG

Return type map: (DATE, LONG) -> DATE

Antonyms: ADDDATE

Example:


os> source=people | eval `'2008-01-02' - 31d` = SUBDATE(DATE('2008-01-02'), 31), `'2020-08-26' - 1` = SUBDATE(DATE('2020-08-26'), 1), `ts '2020-08-26 01:01:01' - 1` = SUBDATE(TIMESTAMP('2020-08-26 01:01:01'), 1) | fields `'2008-01-02' - 31d`, `'2020-08-26' - 1`, `ts '2020-08-26 01:01:01' - 1`
fetched rows / total rows = 1/1
+----------------------+--------------------+--------------------------------+
| '2008-01-02' - 31d   | '2020-08-26' - 1   | ts '2020-08-26 01:01:01' - 1   |
|----------------------+--------------------+--------------------------------|
| 2007-12-02           | 2020-08-25         | 2020-08-25 01:01:01            |
+----------------------+--------------------+--------------------------------+

`SYSDATE`

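Usage: SYSDATE() returns the current date and time as a TIMESTAMP value in the 'YYYY-MM-DD hh:mm:ss.nnnnnn' format.

SYSDATE() returns the exact time at which it executes. This differs from NOW(), which returns a constant time indicating when the statement began to execute.

Optional argument type: INTEGER (0 to 6). Specifies the number of digits for fractional seconds in the return value.

Return type: TIMESTAMP

Example: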


os> source=people | eval `SYSDATE()` = SYSDATE() | fields `SYSDATE()`
fetched rows / total rows = 1/1
+----------------------------+
| SYSDATE()                  |
|----------------------------|
| 2022-08-02 15:39:05.123456 |
+----------------------------+

`TIMESTAMP`

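Usage: TIMESTAMP(EXPR) constructs a timestamp type with the input string expr as a timestamp.

With a single argument, TIMESTAMP(expr) constructs a timestamp from the input. If expr is a string, it is interpreted as a timestamp. For non-string arguments, the function casts expr to a timestamp using the UTC time zone. If expr is a TIME value, the function applies today's date before casting.

When used with two arguments, TIMESTAMP(expr1, expr2) adds the time expression (expr2) to the date or timestamp expression (expr1) and returns the result as a timestamp value.

Argument type: STRING/DATE/TIME/TIMESTAMP

Return type map:

(STRING/DATE/TIME/TIMESTAMP) -> TIMESTAMP

(STRING/DATE/TIME/TIMESTAMP, STRING/DATE/TIME/TIMESTAMP) -> TIMESTAMP

Example: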


os> source=people | eval `TIMESTAMP('2020-08-26 13:49:00')` = TIMESTAMP('2020-08-26 13:49:00'), `TIMESTAMP('2020-08-26 13:49:00', TIME('12:15:42'))` = TIMESTAMP('2020-08-26 13:49:00', TIME('12:15:42')) | fields `TIMESTAMP('2020-08-26 13:49:00')`, `TIMESTAMP('2020-08-26 13:49:00', TIME('12:15:42'))`
fetched rows / total rows = 1/1
+------------------------------------+------------------------------------------------------+
| TIMESTAMP('2020-08-26 13:49:00')   | TIMESTAMP('2020-08-26 13:49:00', TIME('12:15:42'))   |
|------------------------------------+------------------------------------------------------|
| 2020-08-26 13:49:00                | 2020-08-27 02:04:42                                  |
+------------------------------------+------------------------------------------------------+

`UNIX_TIMESTAMP`

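Usage: UNIX_TIMESTAMP converts a given date argument to Unix time (seconds since the epoch, which began at the start of 1970). If no argument is provided, it returns the current Unix time.

The date argument can be a DATE, a TIMESTAMP string, or a number in one of these formats: YYMMDD, YYMMDDhhmmss, YYYYMMDD, or YYYYMMDDhhmmss. If the argument includes a time component, it can optionally include fractional seconds.

If the argument is in an invalid format or falls outside the range 1970-01-01 00:00:00 to 3001-01-18 23:59:59.999999 (0 to 32536771199.999999 in epoch time), the function returns NULL.

The function accepts DATE, TIMESTAMP, or DOUBLE as argument types, or no argument. It always returns a DOUBLE value representing the Unix timestamp.

For the reverse conversion, use the FROM_UNIXTIME function.

Argument type: <NONE>/DOUBLE/DATE/TIMESTAMP

Return type: DOUBLE

Example: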


os> source=people | eval `UNIX_TIMESTAMP(double)` = UNIX_TIMESTAMP(20771122143845), `UNIX_TIMESTAMP(timestamp)` = UNIX_TIMESTAMP(TIMESTAMP('1996-11-15 17:05:42')) | fields `UNIX_TIMESTAMP(double)`, `UNIX_TIMESTAMP(timestamp)`
fetched rows / total rows = 1/1
+--------------------------+-----------------------------+
| UNIX_TIMESTAMP(double)   | UNIX_TIMESTAMP(timestamp)   |
|--------------------------+-----------------------------|
| 3404817525.0             | 848077542.0                 |
+--------------------------+-----------------------------+
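Both example values can be recomputed with the Python standard library. The sketch below is a hypothetical stand-in for UNIX_TIMESTAMP that accepts either a 'YYYY-MM-DD hh:mm:ss' string or the numeric YYYYMMDDhhmmss form (passed as a string of digits), and it assumes the input is interpreted in UTC.

```python
import calendar
from datetime import datetime


def unix_timestamp(value: str) -> float:
    """Convert a timestamp string to seconds since the epoch (UTC assumed)."""
    # All-digit input is treated as the compact YYYYMMDDhhmmss form.
    fmt = "%Y%m%d%H%M%S" if value.isdigit() else "%Y-%m-%d %H:%M:%S"
    parsed = datetime.strptime(value, fmt)
    return float(calendar.timegm(parsed.timetuple()))


from_double = unix_timestamp("20771122143845")
from_timestamp = unix_timestamp("1996-11-15 17:05:42")
print(from_double, from_timestamp)
```

`calendar.timegm` is used instead of `time.mktime` so that the result does not depend on the local time zone of the machine running the sketch.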

`WEEK`

Usage: WEEK(DATE) returns the week number for a given date.

Argument type: DATE/TIMESTAMP/STRING

Return type: INTEGER

Synonyms: WEEK_OF_YEAR

Example:


os> source=people | eval `WEEK(DATE('2008-02-20'))` = WEEK(DATE('2008-02-20')) | fields `WEEK(DATE('2008-02-20'))`
fetched rows / total rows = 1/1
+----------------------------+
| WEEK(DATE('2008-02-20'))   |
|----------------------------|
| 8                          |
+----------------------------+

`WEEKDAY`

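Usage: WEEKDAY(DATE) returns the weekday index for a date (0 = Monday, 1 = Tuesday, ..., 6 = Sunday).

It is similar to the dayofweek function, but returns a different index for each day.

Argument type: STRING/DATE/TIME/TIMESTAMP

Return type: INTEGER

Example: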


os> source=people | eval `weekday(DATE('2020-08-26'))` = weekday(DATE('2020-08-26')) | eval `weekday(DATE('2020-08-27'))` = weekday(DATE('2020-08-27')) | fields `weekday(DATE('2020-08-26'))`, `weekday(DATE('2020-08-27'))`
fetched rows / total rows = 1/1
+-------------------------------+-------------------------------+
| weekday(DATE('2020-08-26'))   | weekday(DATE('2020-08-27'))   |
|-------------------------------+-------------------------------|
| 2                             | 3                             |
+-------------------------------+-------------------------------+
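Because WEEKDAY and DAYOFWEEK use different numbering schemes, it can help to see both derived from the same date. The Python sketch below uses hypothetical helper names that mirror the PPL functions; Python's own `date.weekday()` happens to use the same Monday = 0 convention as WEEKDAY.

```python
from datetime import date


def weekday(d: date) -> int:
    """0 = Monday ... 6 = Sunday (same convention as PPL WEEKDAY)."""
    return d.weekday()


def dayofweek(d: date) -> int:
    """1 = Sunday, 2 = Monday ... 7 = Saturday (PPL DAYOFWEEK convention)."""
    return d.isoweekday() % 7 + 1


def dayofyear(d: date) -> int:
    """1 .. 366 (PPL DAYOFYEAR convention)."""
    return d.timetuple().tm_yday


d = date(2020, 8, 26)  # a Wednesday, as in the examples in this section
print(weekday(d), dayofweek(d), dayofyear(d))
```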

`WEEK_OF_YEAR`

Usage: WEEK_OF_YEAR(DATE) returns the week number for a given date.

Argument type: DATE/TIMESTAMP/STRING

Return type: INTEGER

Synonyms: WEEK

Example:


os> source=people | eval `WEEK_OF_YEAR(DATE('2008-02-20'))` = WEEK_OF_YEAR(DATE('2008-02-20')) | fields `WEEK_OF_YEAR(DATE('2008-02-20'))`
fetched rows / total rows = 1/1
+------------------------------------+
| WEEK_OF_YEAR(DATE('2008-02-20'))   |
|------------------------------------|
| 8                                  |
+------------------------------------+

`YEAR`

Usage: YEAR(DATE) returns the year for a date, in the range 1000 to 9999, or 0 for the "zero" date.

Argument type: STRING/DATE/TIMESTAMP

Return type: INTEGER

Example:


os> source=people | eval `YEAR(DATE('2020-08-26'))` = YEAR(DATE('2020-08-26')) | fields `YEAR(DATE('2020-08-26'))`
fetched rows / total rows = 1/1
+----------------------------+
| YEAR(DATE('2020-08-26'))   |
|----------------------------|
| 2020                       |
+----------------------------+

`DATE_ADD`

Usage: DATE_ADD(date, INTERVAL expr unit) adds the specified interval to the given date.

Argument type: DATE, INTERVAL

Return type: DATE

Antonyms: DATE_SUB

Example:


os> source=people | eval `'2020-08-26' + 1d` = DATE_ADD(DATE('2020-08-26'), INTERVAL 1 DAY) | fields `'2020-08-26' + 1d`
fetched rows / total rows = 1/1
+---------------------+
| '2020-08-26' + 1d   |
|---------------------|
| 2020-08-27          |
+---------------------+

`DATE_SUB`

Usage: DATE_SUB(date, INTERVAL expr unit) subtracts the interval expr from the date.

Argument type: DATE, INTERVAL

Return type: DATE

Antonyms: DATE_ADD

Example:


os> source=people | eval `'2008-01-02' - 31d` = DATE_SUB(DATE('2008-01-02'), INTERVAL 31 DAY) | fields `'2008-01-02' - 31d`
fetched rows / total rows = 1/1
+---------------------+
| '2008-01-02' - 31d  |
|---------------------|
| 2007-12-02          |
+---------------------+

`TIMESTAMPADD`

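Usage: Returns a TIMESTAMP value after adding a specified time interval to a given date.

Arguments:

interval: INTERVAL (SECOND, MINUTE, HOUR, DAY, WEEK, MONTH, QUARTER, YEAR)

integer: INTEGER

date: DATE, TIMESTAMP, or STRING

If you provide a STRING as the date argument, format it as a valid TIMESTAMP. The function automatically converts a DATE argument to a TIMESTAMP.

Example: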


os> source=people | eval `TIMESTAMPADD(DAY, 17, '2000-01-01 00:00:00')` = TIMESTAMPADD(DAY, 17, '2000-01-01 00:00:00') | eval `TIMESTAMPADD(QUARTER, -1, '2000-01-01 00:00:00')` = TIMESTAMPADD(QUARTER, -1, '2000-01-01 00:00:00') | fields `TIMESTAMPADD(DAY, 17, '2000-01-01 00:00:00')`, `TIMESTAMPADD(QUARTER, -1, '2000-01-01 00:00:00')`
fetched rows / total rows = 1/1
+----------------------------------------------+--------------------------------------------------+
| TIMESTAMPADD(DAY, 17, '2000-01-01 00:00:00') | TIMESTAMPADD(QUARTER, -1, '2000-01-01 00:00:00') |
|----------------------------------------------+--------------------------------------------------|
| 2000-01-18 00:00:00                          | 1999-10-01 00:00:00                              |
+----------------------------------------------+--------------------------------------------------+

`TIMESTAMPDIFF`

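Usage: TIMESTAMPDIFF(interval, start, end) returns the difference between the start and end date/times in the specified interval units.

Arguments:

interval: INTERVAL (SECOND, MINUTE, HOUR, DAY, WEEK, MONTH, QUARTER, YEAR)
start: DATE, TIMESTAMP, or STRING
end: DATE, TIMESTAMP, or STRING

The function automatically converts arguments to TIMESTAMP when appropriate. Format STRING arguments as valid TIMESTAMPs.

Example: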


os> source=people | eval `TIMESTAMPDIFF(YEAR, '1997-01-01 00:00:00', '2001-03-06 00:00:00')` = TIMESTAMPDIFF(YEAR, '1997-01-01 00:00:00', '2001-03-06 00:00:00') | eval `TIMESTAMPDIFF(SECOND, timestamp('1997-01-01 00:00:23'), timestamp('1997-01-01 00:00:00'))` = TIMESTAMPDIFF(SECOND, timestamp('1997-01-01 00:00:23'), timestamp('1997-01-01 00:00:00')) | fields `TIMESTAMPDIFF(YEAR, '1997-01-01 00:00:00', '2001-03-06 00:00:00')`, `TIMESTAMPDIFF(SECOND, timestamp('1997-01-01 00:00:23'), timestamp('1997-01-01 00:00:00'))`
fetched rows / total rows = 1/1
+-------------------------------------------------------------------+-------------------------------------------------------------------------------------------+
| TIMESTAMPDIFF(YEAR, '1997-01-01 00:00:00', '2001-03-06 00:00:00') | TIMESTAMPDIFF(SECOND, timestamp('1997-01-01 00:00:23'), timestamp('1997-01-01 00:00:00')) |
|-------------------------------------------------------------------+-------------------------------------------------------------------------------------------|
| 4                                                                 | -23                                                                                       |
+-------------------------------------------------------------------+-------------------------------------------------------------------------------------------+
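The sign convention (end minus start) and the whole-unit counting in the example above can be illustrated with a small Python sketch. This is an analogy only, not PPL and not the service's implementation; the helper names `diff_seconds` and `diff_years` are hypothetical, and the year calculation assumes that end is not earlier than start.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

def diff_seconds(start: str, end: str) -> int:
    """Return end minus start in seconds; negative when end is earlier."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return int(delta.total_seconds())

def diff_years(start: str, end: str) -> int:
    """Return whole years from start to end (assumes end >= start)."""
    s = datetime.strptime(start, FMT)
    e = datetime.strptime(end, FMT)
    years = e.year - s.year
    # Drop the last year if the anniversary has not been reached yet.
    if (e.month, e.day, e.time()) < (s.month, s.day, s.time()):
        years -= 1
    return years

print(diff_years("1997-01-01 00:00:00", "2001-03-06 00:00:00"))    # 4
print(diff_seconds("1997-01-01 00:00:23", "1997-01-01 00:00:00"))  # -23
```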

`UTC_TIMESTAMP`

Usage: UTC_TIMESTAMP returns the current UTC timestamp as a value in 'YYYY-MM-DD hh:mm:ss' format.

Return type: TIMESTAMP

Specification: UTC_TIMESTAMP() -> TIMESTAMP

Example:


> source=people | eval `UTC_TIMESTAMP()` = UTC_TIMESTAMP() | fields `UTC_TIMESTAMP()`
fetched rows / total rows = 1/1
+---------------------+
| UTC_TIMESTAMP()     |
|---------------------|
| 2022-10-03 17:54:28 |
+---------------------+

`CURRENT_TIMEZONE`

Usage: CURRENT_TIMEZONE returns the current local time zone.

Return type: STRING

Example:


> source=people | eval `CURRENT_TIMEZONE()` = CURRENT_TIMEZONE() | fields `CURRENT_TIMEZONE()`
fetched rows / total rows = 1/1
+------------------------+
| CURRENT_TIMEZONE()     |
|------------------------|
| America/Chicago        |
+------------------------+

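PPL expressions

Note

To see which Amazon data source integrations support this PPL function, see Functions.

Expressions, particularly value expressions, return a scalar value. Expressions have different types and forms. For example, there are literal values as atom expressions, and arithmetic, predicate, and function expressions built on top of them. You can use expressions in different clauses, such as using arithmetic expressions in the Filter and Stats commands.

Operators

An arithmetic expression is an expression formed by numeric literals and binary (two-operand) arithmetic operators, as follows:

+: Add.
-: Subtract.
*: Multiply.
/: Divide (for integers, the result is an integer with the fractional part discarded).
%: Modulo (use with integers only; the result is the remainder of the division).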

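Precedence

Use parentheses to control the precedence of arithmetic operators. Otherwise, operators of higher precedence are performed first.

Type conversion

Implicit type conversion is performed when looking up operator signatures. For example, an integer + a real number matches the signature +(double,double), which results in a real number. This rule also applies to function calls.

Example for different types of arithmetic expressions: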


os> source=accounts | where age > (25 + 5) | fields age ;
fetched rows / total rows = 3/3
+-------+
| age   |
|-------|
| 32    |
| 36    |
| 33    |
+-------+

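Predicate operators

A predicate operator is an expression that evaluates to true or false. MISSING and NULL value comparisons follow these rules:

A MISSING value equals only a MISSING value and is less than other values.
A NULL value equals a NULL value, is greater than a MISSING value, but is less than all other values.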
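The ordering rules above can be pictured with a short Python sketch. It is an analogy only: `MISSING` is a hypothetical sentinel, `None` stands in for NULL, and the `rank` helper is an illustrative name, not part of PPL.

```python
MISSING = object()  # hypothetical sentinel for a field that is absent
NULL = None         # stands in for a PPL NULL value

def rank(value):
    """Sort key: MISSING first, then NULL, then all ordinary values."""
    if value is MISSING:
        return (0, 0)
    if value is NULL:
        return (1, 0)
    return (2, value)

values = [36, NULL, 28, MISSING, 33]
ordered = sorted(values, key=rank)
# ordered is [MISSING, None, 28, 33, 36]
```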

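Operators

Predicate operators
Name	Description
`>`	Greater than operator
`>=`	Greater than or equal operator
`<`	Less than operator
`!=`	Not equal operator
`<=`	Less than or equal operator
`=`	Equal operator
`LIKE`	Simple pattern matching
`IN`	Value list membership test
`AND`	AND operator
`OR`	OR operator
`XOR`	XOR operator
`NOT`	NOT operator (logical negation)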

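You can compare datetimes. When comparing different datetime types (for example, DATE and TIME), both are converted to DATETIME. The following rules apply to the conversion:

TIME applies to today's date.
DATE is interpreted at midnight.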
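The two conversion rules above can be sketched in Python as follows. This is an illustration of the stated rules only, not the engine's code; `as_datetime` and its `today` parameter are hypothetical names, with `today` added so that the result is reproducible.

```python
from datetime import date, datetime, time

def as_datetime(value, today=None):
    """Widen a DATE or TIME to a full datetime before comparing."""
    if isinstance(value, datetime):   # check first: datetime is a date subclass
        return value
    if isinstance(value, date):
        # DATE is interpreted at midnight.
        return datetime.combine(value, time.min)
    if isinstance(value, time):
        # TIME applies to today's date.
        return datetime.combine(today or date.today(), value)
    raise TypeError("expected a date, time, or datetime")

day = date(2024, 5, 1)
print(as_datetime(day) < as_datetime(time(9, 30), today=day))  # True
```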

Basic predicate operators

Example for comparison operators:


os> source=accounts | where age > 33 | fields age ;
fetched rows / total rows = 1/1
+-------+
| age   |
|-------|
| 36    |
+-------+

`IN`

Example of the IN operator testing a field against a list of values:


os> source=accounts | where age in (32, 33) | fields age ;
fetched rows / total rows = 2/2
+-------+
| age   |
|-------|
| 32    |
| 33    |
+-------+

`OR`

Example of the OR operator:


os> source=accounts | where age = 32 OR age = 33 | fields age ;
fetched rows / total rows = 2/2
+-------+
| age   |
|-------|
| 32    |
| 33    |
+-------+

`NOT`

Example of the NOT operator:


os> source=accounts | where age not in (32, 33) | fields age ;
fetched rows / total rows = 2/2
+-------+
| age   |
|-------|
| 36    |
| 28    |
+-------+

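PPL IP address functions

Note

To see which Amazon data source integrations support this PPL function, see Functions.

`CIDRMATCH`

Usage: CIDRMATCH(ip, cidr) checks whether the specified IP address is within the given CIDR range.

Argument type:

STRING, STRING
Return type: BOOLEAN

Example: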


os> source=ips | where cidrmatch(ip, '***********/24') | fields ip
fetched rows / total rows = 1/1
+--------------+
| ip           |
|--------------|
| ***********  |
+--------------+

os> source=ipsv6 | where cidrmatch(ip, '2003:db8::/32') | fields ip
fetched rows / total rows = 1/1
+-----------------------------------------+
| ip                                      |
|-----------------------------------------|
| 2003:0db8:****:****:****:****:****:0000 |
+-----------------------------------------+

Note

ip can be an IPv4 or an IPv6 address.
cidr can be an IPv4 or an IPv6 block.
ip and cidr must be either both IPv4 or both IPv6.
ip and cidr must both be valid and non-empty/non-null.
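The constraints in the note can be mirrored with Python's standard `ipaddress` module. This is a minimal sketch of the matching idea, not how PPL implements CIDRMATCH; raising `ValueError` for invalid or mixed-version input is an assumption, and the addresses are documentation-range placeholders.

```python
import ipaddress

def cidrmatch(ip: str, cidr: str) -> bool:
    """Return True if ip falls inside the cidr block."""
    if not ip or not cidr:
        raise ValueError("ip and cidr must be non-empty")
    address = ipaddress.ip_address(ip)                  # ValueError if invalid
    network = ipaddress.ip_network(cidr, strict=False)  # ValueError if invalid
    if address.version != network.version:
        raise ValueError("ip and cidr must both be IPv4 or both be IPv6")
    return address in network

print(cidrmatch("192.0.2.10", "192.0.2.0/24"))    # True
print(cidrmatch("198.51.100.7", "192.0.2.0/24"))  # False
```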

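PPL JSON functions

Note

To see which Amazon data source integrations support this PPL function, see Functions.

`JSON`

Usage: json(value) evaluates whether a string can be parsed as JSON. The function returns the original string if it is valid JSON, or null if it is invalid.

Argument type: STRING

Return type: STRING/NULL. A STRING expression of a valid JSON object format.

Example: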


os> source=people | eval `valid_json` = json('[1,2,3,{"f1":1,"f2":[5,6]},4]') | fields valid_json
fetched rows / total rows = 1/1
+---------------------------------+
| valid_json                      |
+---------------------------------+
| [1,2,3,{"f1":1,"f2":[5,6]},4]   |
+---------------------------------+

os> source=people | eval `invalid_json` = json('{"invalid": "json"') | fields invalid_json
fetched rows / total rows = 1/1
+----------------+
| invalid_json   |
+----------------+
| null           |
+----------------+
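The return-the-input-or-null behavior described above can be approximated in Python. This is a hedged sketch: `ppl_json` is a hypothetical name, and Python's `json` parser may accept or reject a few edge cases differently from the PPL engine.

```python
import json

def ppl_json(value):
    """Return value unchanged if it parses as JSON, otherwise None."""
    try:
        json.loads(value)
    except (TypeError, ValueError):  # JSONDecodeError is a ValueError
        return None
    return value

print(ppl_json('[1,2,3,{"f1":1,"f2":[5,6]},4]'))  # the original string
print(ppl_json('{"invalid": "json"'))             # None
```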

`JSON_OBJECT`

Usage: json_object(<key>, <value>[, <key>, <value>]...) returns a JSON object built from key-value pairs.

Argument type:

A <key> must be STRING.
A <value> can be any data type.

Return type: JSON_OBJECT. A StructType expression of a valid JSON object.

Example:


os> source=people | eval result = json_object('key', 123.45) | fields result
fetched rows / total rows = 1/1
+------------------+
| result           |
+------------------+
| {"key":123.45}   |
+------------------+

os> source=people | eval result = json_object('outer', json_object('inner', 123.45)) | fields result
fetched rows / total rows = 1/1
+------------------------------+
| result                       |
+------------------------------+
| {"outer":{"inner":123.45}}   |
+------------------------------+

`JSON_ARRAY`

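Usage: json_array(<value>...) creates a JSON array from a list of values.

Argument type: A <value> can be any kind of value, such as a string, number, or Boolean.

Return type: ARRAY. An array of any supported data type that represents a valid JSON array.

Example: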


os> source=people | eval `json_array` = json_array(1, 2, 0, -1, 1.1, -0.11)
fetched rows / total rows = 1/1
+------------------------------+
| json_array                   |
+------------------------------+
| [1.0,2.0,0.0,-1.0,1.1,-0.11] |
+------------------------------+

os> source=people | eval `json_array_object` = json_object("array", json_array(1, 2, 0, -1, 1.1, -0.11))
fetched rows / total rows = 1/1
+----------------------------------------+
| json_array_object                      |
+----------------------------------------+
| {"array":[1.0,2.0,0.0,-1.0,1.1,-0.11]} |
+----------------------------------------+

`TO_JSON_STRING`

Usage: to_json_string(jsonObject) returns a JSON string containing the given JSON object value.

Argument type: JSON_OBJECT

Return type: STRING

Example:


os> source=people | eval `json_string` = to_json_string(json_array(1, 2, 0, -1, 1.1, -0.11)) | fields json_string
fetched rows / total rows = 1/1
+--------------------------------+
| json_string                    |
+--------------------------------+
| [1.0,2.0,0.0,-1.0,1.1,-0.11]   |
+--------------------------------+

os> source=people | eval `json_string` = to_json_string(json_object('key', 123.45)) | fields json_string
fetched rows / total rows = 1/1
+-----------------+
| json_string     |
+-----------------+
| {"key":123.45}  |
+-----------------+

`ARRAY_LENGTH`

Usage: array_length(jsonArray) returns the number of elements in the outermost array.

Argument type: ARRAY. An ARRAY or JSON_ARRAY object.

Return type: INTEGER

Example:


os> source=people | eval `json_array` = array_length(json_array(1,2,3,4)), `empty_array` = array_length(json_array())
fetched rows / total rows = 1/1
+--------------+---------------+
| json_array   | empty_array   |
+--------------+---------------+
| 4            | 0             |
+--------------+---------------+

`JSON_EXTRACT`

Usage: json_extract(jsonStr, path) extracts a JSON object from a JSON string based on the specified JSON path. The function returns null if the input JSON string is invalid.

Argument type: STRING, STRING

Return type: STRING

A STRING expression of a valid JSON object format.
NULL is returned if the JSON is invalid.

Example:


os> source=people | eval `json_extract('{"a":"b"}', '$.a')` = json_extract('{"a":"b"}', '$.a')
fetched rows / total rows = 1/1
+----------------------------------+
| json_extract('{"a":"b"}', '$.a') |
+----------------------------------+
| b                                |
+----------------------------------+

os> source=people | eval `json_extract('{"a":[{"b":1},{"b":2}]}', '$.a[1].b')` = json_extract('{"a":[{"b":1},{"b":2}]}', '$.a[1].b')
fetched rows / total rows = 1/1
+-----------------------------------------------------------+
| json_extract('{"a":[{"b":1.0},{"b":2.0}]}', '$.a[1].b')   |
+-----------------------------------------------------------+
| 2.0                                                       |
+-----------------------------------------------------------+

os> source=people | eval `json_extract('{"a":[{"b":1},{"b":2}]}', '$.a[*].b')` = json_extract('{"a":[{"b":1},{"b":2}]}', '$.a[*].b')
fetched rows / total rows = 1/1
+-----------------------------------------------------------+
| json_extract('{"a":[{"b":1.0},{"b":2.0}]}', '$.a[*].b')   |
+-----------------------------------------------------------+
| [1.0,2.0]                                                 |
+-----------------------------------------------------------+

os> source=people | eval `invalid_json` = json_extract('{"invalid": "json"')
fetched rows / total rows = 1/1
+----------------+
| invalid_json   |
+----------------+
| null           |
+----------------+

`JSON_KEYS`

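Usage: json_keys(jsonStr) returns all the keys of the outermost JSON object as an array.

Argument type: STRING. A STRING expression of a valid JSON object format.

Return type: ARRAY[STRING]. The function returns NULL for any other valid JSON string, an empty string, or invalid JSON.

Example: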


os> source=people | eval `keys` = json_keys('{"f1":"abc","f2":{"f3":"a","f4":"b"}}')
fetched rows / total rows = 1/1
+------------+
| keys       |
+------------+
| [f1, f2]   |
+------------+

os> source=people | eval `keys` = json_keys('[1,2,3,{"f1":1,"f2":[5,6]},4]')
fetched rows / total rows = 1/1
+--------+
| keys   |
+--------+
| null   |
+--------+
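The two cases above (an object returns its top-level keys; anything else returns NULL) can be sketched in Python. The function name mirrors the PPL one for readability only; this is an illustration, not the engine's implementation.

```python
import json

def json_keys(json_str):
    """Return the top-level keys of a JSON object, or None otherwise."""
    try:
        parsed = json.loads(json_str)
    except (TypeError, ValueError):
        return None            # empty or invalid JSON
    if not isinstance(parsed, dict):
        return None            # valid JSON, but not an object
    return list(parsed.keys())

print(json_keys('{"f1":"abc","f2":{"f3":"a","f4":"b"}}'))  # ['f1', 'f2']
print(json_keys('[1,2,3,{"f1":1,"f2":[5,6]},4]'))          # None
```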

`JSON_VALID`

Usage: json_valid(jsonStr) evaluates whether a JSON string uses valid JSON syntax and returns TRUE or FALSE.

Argument type: STRING

Return type: BOOLEAN

Example:


os> source=people | eval `valid_json` = json_valid('[1,2,3,4]'), `invalid_json` = json_valid('{"invalid": "json"') | fields `valid_json`, `invalid_json`
fetched rows / total rows = 1/1
+--------------+----------------+
| valid_json   | invalid_json   |
+--------------+----------------+
| True         | False          |
+--------------+----------------+

os> source=accounts | where json_valid('[1,2,3,4]') and isnull(email) | fields account_number, email
fetched rows / total rows = 1/1
+------------------+---------+
| account_number   | email   |
|------------------+---------|
| 13               | null    |
+------------------+---------+

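PPL Lambda functions

Note

To see which Amazon data source integrations support this PPL function, see Functions.

`EXISTS`

Usage: exists(array, lambda) evaluates whether a Lambda predicate holds for one or more elements in the array.

Argument type: ARRAY, LAMBDA

Return type: BOOLEAN. Returns TRUE if at least one element in the array satisfies the Lambda predicate; otherwise returns FALSE.

Example:


os> source=people | eval array = json_array(1, -1, 2), result = exists(array, x -> x > 0) | fields result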
fetched rows / total rows = 1/1
+-----------+
| result    |
+-----------+
| true      |
+-----------+

os> source=people | eval array = json_array(-1, -3, -2), result = exists(array, x -> x > 0) | fields result
fetched rows / total rows = 1/1
+-----------+
| result    |
+-----------+
| false     |
+-----------+

`FILTER`

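Usage: filter(array, lambda) filters the input array using the given Lambda function.

Argument type: ARRAY, LAMBDA

Return type: ARRAY. An ARRAY that contains all elements in the input array that satisfy the Lambda predicate.

Example:


os> source=people | eval array = json_array(1, -1, 2), result = filter(array, x -> x > 0) | fields result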
fetched rows / total rows = 1/1
+-----------+
| result    |
+-----------+
| [1, 2]    |
+-----------+

os> source=people | eval array = json_array(-1, -3, -2), result = filter(array, x -> x > 0) | fields result
fetched rows / total rows = 1/1
+-----------+
| result    |
+-----------+
| []        |
+-----------+

`TRANSFORM`

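Usage: transform(array, lambda) transforms the elements in an array using the Lambda transform function. If the Lambda function is binary (takes two arguments), the second argument is the index of the element. This is similar to map in functional programming.

Argument type: ARRAY, LAMBDA

Return type: ARRAY. An ARRAY that contains the result of applying the Lambda transform function to each element in the input array.

Example: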


os> source=people | eval array = json_array(1, 2, 3), result = transform(array, x -> x + 1) | fields result
fetched rows / total rows = 1/1
+--------------+
| result       |
+--------------+
| [2, 3, 4]    |
+--------------+

os> source=people | eval array = json_array(1, 2, 3), result = transform(array, (x, i) -> x + i) | fields result
fetched rows / total rows = 1/1
+--------------+
| result       |
+--------------+
| [1, 3, 5]    |
+--------------+

`REDUCE`

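Usage: reduce(array, start, merge_lambda, finish_lambda) reduces an array to a single value by applying Lambda functions. The function applies merge_lambda to the start value and all array elements, and then applies finish_lambda to the result.

Argument type: ARRAY, ANY, LAMBDA, LAMBDA

Return type: ANY. The final result of applying the Lambda functions to the start value and the input array.

Example:


os> source=people | eval array = json_array(1, 2, 3), result = reduce(array, 0, (acc, x) -> acc + x) | fields result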
fetched rows / total rows = 1/1
+-----------+
| result    |
+-----------+
| 6         |
+-----------+

os> source=people | eval array = json_array(1, 2, 3), result = reduce(array, 10, (acc, x) -> acc + x) | fields result
fetched rows / total rows = 1/1
+-----------+
| result    |
+-----------+
| 16        |
+-----------+

os> source=people | eval array = json_array(1, 2, 3), result = reduce(array, 0, (acc, x) -> acc + x, acc -> acc * 10) | fields result
fetched rows / total rows = 1/1
+-----------+
| result    |
+-----------+
| 60        |
+-----------+
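For readers more familiar with general-purpose languages, the four Lambda functions in this section correspond roughly to the following Python constructs. This is an analogy under the assumption that array indexes passed to a two-argument transform Lambda start at 0 (as the `[1, 3, 5]` result above suggests); `ppl_reduce` is a hypothetical helper name.

```python
from functools import reduce

array = [1, -1, 2]
exists_result = any(x > 0 for x in array)      # EXISTS
filter_result = [x for x in array if x > 0]    # FILTER

numbers = [1, 2, 3]
transform_result = [x + 1 for x in numbers]                   # TRANSFORM, unary
transform_indexed = [x + i for i, x in enumerate(numbers)]    # TRANSFORM, (x, i)

def ppl_reduce(values, start, merge, finish=lambda acc: acc):
    """Fold values into one result with merge, then apply finish once."""
    return finish(reduce(merge, values, start))

total = ppl_reduce(numbers, 0, lambda acc, x: acc + x)
scaled = ppl_reduce(numbers, 0, lambda acc, x: acc + x, lambda acc: acc * 10)
print(exists_result, filter_result, transform_indexed, total, scaled)
```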

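PPL mathematical functions

Note

To see which Amazon data source integrations support this PPL function, see Functions.

`ABS`

Usage: ABS(x) calculates the absolute value of x.

Argument type: INTEGER/LONG/FLOAT/DOUBLE

Return type: INTEGER/LONG/FLOAT/DOUBLE

Example: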


os> source=people | eval `ABS(-1)` = ABS(-1) | fields `ABS(-1)`
fetched rows / total rows = 1/1
+-----------+
| ABS(-1)   |
|-----------|
| 1         |
+-----------+

`ACOS`

Usage: ACOS(x) calculates the arc cosine of x. It returns NULL if x is not in the range -1 to 1.

Argument type: INTEGER/LONG/FLOAT/DOUBLE

Return type: DOUBLE

Example:


os> source=people | eval `ACOS(0)` = ACOS(0) | fields `ACOS(0)`
fetched rows / total rows = 1/1
+--------------------+
| ACOS(0)            |
|--------------------|
| 1.5707963267948966 |
+--------------------+

`ASIN`

Usage: ASIN(x) calculates the arc sine of x. It returns NULL if x is not in the range -1 to 1.

Argument type: INTEGER/LONG/FLOAT/DOUBLE

Return type: DOUBLE

Example:


os> source=people | eval `ASIN(0)` = ASIN(0) | fields `ASIN(0)`
fetched rows / total rows = 1/1
+-----------+
| ASIN(0)   |
|-----------|
| 0.0       |
+-----------+

`ATAN`

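Usage: ATAN(x) calculates the arc tangent of x. ATAN(y, x) calculates the arc tangent of y / x, except that the signs of both arguments determine the quadrant of the result.

Argument type: INTEGER/LONG/FLOAT/DOUBLE

Return type: DOUBLE

Example: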


os> source=people | eval `ATAN(2)` = ATAN(2), `ATAN(2, 3)` = ATAN(2, 3) | fields `ATAN(2)`, `ATAN(2, 3)`
fetched rows / total rows = 1/1
+--------------------+--------------------+
| ATAN(2)            | ATAN(2, 3)         |
|--------------------+--------------------|
| 1.1071487177940904 | 0.5880026035475675 |
+--------------------+--------------------+

`ATAN2`

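Usage: ATAN2(y, x) calculates the arc tangent of y / x, except that the signs of both arguments determine the quadrant of the result.

Argument type: INTEGER/LONG/FLOAT/DOUBLE

Return type: DOUBLE

Example: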


os> source=people | eval `ATAN2(2, 3)` = ATAN2(2, 3) | fields `ATAN2(2, 3)`
fetched rows / total rows = 1/1
+--------------------+
| ATAN2(2, 3)        |
|--------------------|
| 0.5880026035475675 |
+--------------------+
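The remark about quadrants can be seen with Python's `math` module, used here purely as an analogy for the two-argument arc tangent. The single-argument form cannot distinguish (2, 3) from (-2, -3) because the ratio is the same, whereas the two-argument form can.

```python
import math

ratio_only = math.atan(2 / 3)    # same ratio for (2, 3) and (-2, -3)
first_quadrant = math.atan2(2, 3)
third_quadrant = math.atan2(-2, -3)

# The two-argument results differ by pi because the signs of y and x
# place the angle in opposite quadrants.
print(ratio_only, first_quadrant, third_quadrant)
```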

`CBRT`

Usage: CBRT calculates the cube root of a number.

Argument type: INTEGER/LONG/FLOAT/DOUBLE

Return type: DOUBLE:

INTEGER/LONG/FLOAT/DOUBLE -> DOUBLE

Example:


opensearchsql> source=location | eval `CBRT(8)` = CBRT(8), `CBRT(9.261)` = CBRT(9.261), `CBRT(-27)` = CBRT(-27) | fields `CBRT(8)`, `CBRT(9.261)`, `CBRT(-27)`;
fetched rows / total rows = 2/2
+-----------+---------------+-------------+
| CBRT(8)   | CBRT(9.261)   | CBRT(-27)   |
|-----------+---------------+-------------|
| 2.0       | 2.1           | -3.0        |
| 2.0       | 2.1           | -3.0        |
+-----------+---------------+-------------+

`CEIL`

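Usage: An alias for the CEILING function. CEILING(T) takes the ceiling of value T.

Limitation: CEILING works as expected only when the IEEE 754 double type displays a decimal when stored.

Argument type: INTEGER/LONG/FLOAT/DOUBLE

Return type: LONG

Example: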


os> source=people | eval `CEILING(0)` = CEILING(0), `CEILING(50.00005)` = CEILING(50.00005), `CEILING(-50.00005)` = CEILING(-50.00005) | fields `CEILING(0)`, `CEILING(50.00005)`, `CEILING(-50.00005)`
fetched rows / total rows = 1/1
+--------------+---------------------+----------------------+
| CEILING(0)   | CEILING(50.00005)   | CEILING(-50.00005)   |
|--------------+---------------------+----------------------|
| 0            | 51                  | -50                  |
+--------------+---------------------+----------------------+

os> source=people | eval `CEILING(3147483647.12345)` = CEILING(3147483647.12345), `CEILING(113147483647.12345)` = CEILING(113147483647.12345), `CEILING(3147483647.00001)` = CEILING(3147483647.00001) | fields `CEILING(3147483647.12345)`, `CEILING(113147483647.12345)`, `CEILING(3147483647.00001)`
fetched rows / total rows = 1/1
+-----------------------------+-------------------------------+-----------------------------+
| CEILING(3147483647.12345)   | CEILING(113147483647.12345)   | CEILING(3147483647.00001)   |
|-----------------------------+-------------------------------+-----------------------------|
| 3147483648                  | 113147483648                  | 3147483648                  |
+-----------------------------+-------------------------------+-----------------------------+

`CONV`

Usage: CONV(x, a, b) converts the number x from base a to base b.

Argument type: x: STRING, a: INTEGER, b: INTEGER

Return type: STRING

Example:


os> source=people | eval `CONV('12', 10, 16)` = CONV('12', 10, 16), `CONV('2C', 16, 10)` = CONV('2C', 16, 10), `CONV(12, 10, 2)` = CONV(12, 10, 2), `CONV(1111, 2, 10)` = CONV(1111, 2, 10) | fields `CONV('12', 10, 16)`, `CONV('2C', 16, 10)`, `CONV(12, 10, 2)`, `CONV(1111, 2, 10)`
fetched rows / total rows = 1/1
+----------------------+----------------------+-------------------+---------------------+
| CONV('12', 10, 16)   | CONV('2C', 16, 10)   | CONV(12, 10, 2)   | CONV(1111, 2, 10)   |
|----------------------+----------------------+-------------------+---------------------|
| c                    | 44                   | 1100              | 15                  |
+----------------------+----------------------+-------------------+---------------------+
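The base conversion shown above can be reproduced with a short Python sketch. It is illustrative only and not the engine's implementation; it assumes bases between 2 and 36 and lowercase output digits, as in the example row.

```python
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"

def conv(x, from_base: int, to_base: int) -> str:
    """Convert x (string or integer digits) from from_base to to_base."""
    n = int(str(x), from_base)   # parse the digits in the source base
    if n == 0:
        return "0"
    sign = "-" if n < 0 else ""
    n = abs(n)
    out = []
    while n:
        n, remainder = divmod(n, to_base)
        out.append(DIGITS[remainder])
    return sign + "".join(reversed(out))

print(conv("12", 10, 16), conv("2C", 16, 10), conv(12, 10, 2), conv(1111, 2, 10))
# c 44 1100 15
```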

`COS`

Usage: COS(x) calculates the cosine of x, where x is given in radians.

Argument type: INTEGER/LONG/FLOAT/DOUBLE

Return type: DOUBLE

Example:


os> source=people | eval `COS(0)` = COS(0) | fields `COS(0)`
fetched rows / total rows = 1/1
+----------+
| COS(0)   |
|----------|
| 1.0      |
+----------+

`COT`

Usage: COT(x) calculates the cotangent of x. It returns an out-of-range error if x equals 0.

Argument type: INTEGER/LONG/FLOAT/DOUBLE

Return type: DOUBLE

Example:


os> source=people | eval `COT(1)` = COT(1) | fields `COT(1)`
fetched rows / total rows = 1/1
+--------------------+
| COT(1)             |
|--------------------|
| 0.6420926159343306 |
+--------------------+

`CRC32`

Usage: CRC32 calculates a cyclic redundancy check value and returns a 32-bit unsigned value.

Argument type: STRING

Return type: LONG

Example:


os> source=people | eval `CRC32('MySQL')` = CRC32('MySQL') | fields `CRC32('MySQL')`
fetched rows / total rows = 1/1
+------------------+
| CRC32('MySQL')   |
|------------------|
| 3259397556       |
+------------------+

`DEGREES`

Usage: DEGREES(x) converts x from radians to degrees.

Argument type: INTEGER/LONG/FLOAT/DOUBLE

Return type: DOUBLE

Example:


os> source=people | eval `DEGREES(1.57)` = DEGREES(1.57) | fields `DEGREES(1.57)`
fetched rows / total rows  = 1/1
+-------------------+
| DEGREES(1.57)     |
|-------------------|
| 89.95437383553924 |
+-------------------+

`E`

Usage: E() returns Euler's number.

Return type: DOUBLE

Example:


os> source=people | eval `E()` = E() | fields `E()`
fetched rows / total rows = 1/1
+-------------------+
| E()               |
|-------------------|
| 2.718281828459045 |
+-------------------+

`EXP`

Usage: EXP(x) returns e raised to the power of x.

Argument type: INTEGER/LONG/FLOAT/DOUBLE

Return type: DOUBLE

Example:


os> source=people | eval `EXP(2)` = EXP(2) | fields `EXP(2)`
fetched rows / total rows = 1/1
+------------------+
| EXP(2)           |
|------------------|
| 7.38905609893065 |
+------------------+

`FLOOR`

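Usage: FLOOR(T) takes the floor of value T.

Limitation: FLOOR works as expected only when the IEEE 754 double type displays a decimal when stored.

Argument type: INTEGER/LONG/FLOAT/DOUBLE

Return type: LONG

Example: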


os> source=people | eval `FLOOR(0)` = FLOOR(0), `FLOOR(50.00005)` = FLOOR(50.00005), `FLOOR(-50.00005)` = FLOOR(-50.00005) | fields `FLOOR(0)`, `FLOOR(50.00005)`, `FLOOR(-50.00005)`
fetched rows / total rows = 1/1
+------------+-------------------+--------------------+
| FLOOR(0)   | FLOOR(50.00005)   | FLOOR(-50.00005)   |
|------------+-------------------+--------------------|
| 0          | 50                | -51                |
+------------+-------------------+--------------------+

os> source=people | eval `FLOOR(3147483647.12345)` = FLOOR(3147483647.12345), `FLOOR(113147483647.12345)` = FLOOR(113147483647.12345), `FLOOR(3147483647.00001)` = FLOOR(3147483647.00001) | fields `FLOOR(3147483647.12345)`, `FLOOR(113147483647.12345)`, `FLOOR(3147483647.00001)`
fetched rows / total rows = 1/1
+---------------------------+-----------------------------+---------------------------+
| FLOOR(3147483647.12345)   | FLOOR(113147483647.12345)   | FLOOR(3147483647.00001)   |
|---------------------------+-----------------------------+---------------------------|
| 3147483647                | 113147483647                | 3147483647                |
+---------------------------+-----------------------------+---------------------------+

os> source=people | eval `FLOOR(282474973688888.022)` = FLOOR(282474973688888.022), `FLOOR(9223372036854775807.022)` = FLOOR(9223372036854775807.022), `FLOOR(9223372036854775807.0000001)` = FLOOR(9223372036854775807.0000001) | fields `FLOOR(282474973688888.022)`, `FLOOR(9223372036854775807.022)`, `FLOOR(9223372036854775807.0000001)`
fetched rows / total rows = 1/1
+------------------------------+----------------------------------+--------------------------------------+
| FLOOR(282474973688888.022)   | FLOOR(9223372036854775807.022)   | FLOOR(9223372036854775807.0000001)   |
|------------------------------+----------------------------------+--------------------------------------|
| 282474973688888              | 9223372036854775807              | 9223372036854775807                  |
+------------------------------+----------------------------------+--------------------------------------+

`LN`

Usage: LN(x) returns the natural logarithm of x.

Argument type: INTEGER/LONG/FLOAT/DOUBLE

Return type: DOUBLE

Example:


os> source=people | eval `LN(2)` = LN(2) | fields `LN(2)`
fetched rows / total rows = 1/1
+--------------------+
| LN(2)              |
|--------------------|
| 0.6931471805599453 |
+--------------------+

`LOG`

Usage: LOG(x) returns the natural logarithm of x, that is, the base e logarithm of x. LOG(B, x) is equivalent to log(x)/log(B).

Argument type: INTEGER/LONG/FLOAT/DOUBLE

Return type: DOUBLE

Example:


os> source=people | eval `LOG(2)` = LOG(2), `LOG(2, 8)` = LOG(2, 8) | fields `LOG(2)`, `LOG(2, 8)`
fetched rows / total rows = 1/1
+--------------------+-------------+
| LOG(2)             | LOG(2, 8)   |
|--------------------+-------------|
| 0.6931471805599453 | 3.0         |
+--------------------+-------------+
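The change-of-base identity stated in the usage note, log(B, x) = log(x)/log(B), can be checked with a short Python sketch. The `log` wrapper below is a hypothetical helper that mimics the one- and two-argument forms; note that its argument order (base first) follows the PPL usage above, not Python's `math.log`.

```python
import math

def log(*args):
    """log(x) is the natural logarithm; log(base, x) uses change of base."""
    if len(args) == 1:
        return math.log(args[0])
    base, x = args
    return math.log(x) / math.log(base)

print(log(2))     # natural logarithm of 2
print(log(2, 8))  # base-2 logarithm of 8
```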

`LOG2`

Usage: LOG2(x) is equivalent to log(x)/log(2).

Argument type: INTEGER/LONG/FLOAT/DOUBLE

Return type: DOUBLE

Example:


os> source=people | eval `LOG2(8)` = LOG2(8) | fields `LOG2(8)`
fetched rows / total rows = 1/1
+-----------+
| LOG2(8)   |
|-----------|
| 3.0       |
+-----------+

`LOG10`

Usage: LOG10(x) is equivalent to log(x)/log(10).

Argument type: INTEGER/LONG/FLOAT/DOUBLE

Return type: DOUBLE

Example:


os> source=people | eval `LOG10(100)` = LOG10(100) | fields `LOG10(100)`
fetched rows / total rows = 1/1
+--------------+
| LOG10(100)   |
|--------------|
| 2.0          |
+--------------+

`MOD`

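Usage: MOD(n, m) calculates the remainder of the number n divided by m.

Argument type: INTEGER/LONG/FLOAT/DOUBLE

Return type: The wider type between the types of n and m if m is a nonzero value. If m equals 0, the function returns NULL.

Example: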


os> source=people | eval `MOD(3, 2)` = MOD(3, 2), `MOD(3.1, 2)` = MOD(3.1, 2) | fields `MOD(3, 2)`, `MOD(3.1, 2)`
fetched rows / total rows = 1/1
+-------------+---------------+
| MOD(3, 2)   | MOD(3.1, 2)   |
|-------------+---------------|
| 1           | 1.1           |
+-------------+---------------+

`PI`

Usage: PI() returns the constant pi.

Return type: DOUBLE

Example:


os> source=people | eval `PI()` = PI() | fields `PI()`
fetched rows / total rows = 1/1
+-------------------+
| PI()              |
|-------------------|
| 3.141592653589793 |
+-------------------+

`POW`

Usage: POW(x, y) calculates the value of x raised to the power of y. Bad inputs return a NULL result.

Argument type: INTEGER/LONG/FLOAT/DOUBLE

Return type: DOUBLE

Synonyms: POWER(_, _)

Example:


os> source=people | eval `POW(3, 2)` = POW(3, 2), `POW(-3, 2)` = POW(-3, 2), `POW(3, -2)` = POW(3, -2) | fields `POW(3, 2)`, `POW(-3, 2)`, `POW(3, -2)`
fetched rows / total rows = 1/1
+-------------+--------------+--------------------+
| POW(3, 2)   | POW(-3, 2)   | POW(3, -2)         |
|-------------+--------------+--------------------|
| 9.0         | 9.0          | 0.1111111111111111 |
+-------------+--------------+--------------------+

`POWER`

Usage: POWER(x, y) calculates the value of x raised to the power of y. Bad inputs return a NULL result.

Argument type: INTEGER/LONG/FLOAT/DOUBLE

Return type: DOUBLE

Synonyms: POW(_, _)

Example:


os> source=people | eval `POWER(3, 2)` = POWER(3, 2), `POWER(-3, 2)` = POWER(-3, 2), `POWER(3, -2)` = POWER(3, -2) | fields `POWER(3, 2)`, `POWER(-3, 2)`, `POWER(3, -2)`
fetched rows / total rows = 1/1
+---------------+----------------+--------------------+
| POWER(3, 2)   | POWER(-3, 2)   | POWER(3, -2)       |
|---------------+----------------+--------------------|
| 9.0           | 9.0            | 0.1111111111111111 |
+---------------+----------------+--------------------+

`RADIANS`

Usage: RADIANS(x) converts x from degrees to radians.

Argument type: INTEGER/LONG/FLOAT/DOUBLE

Return type: DOUBLE

Example:


os> source=people | eval `RADIANS(90)` = RADIANS(90) | fields `RADIANS(90)`
fetched rows / total rows  = 1/1
+--------------------+
| RADIANS(90)        |
|--------------------|
| 1.5707963267948966 |
+--------------------+

`RAND`

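Usage: RAND()/RAND(N) returns a random floating-point value in the range 0 <= value < 1.0. If you specify integer N, the function initializes the seed before execution. One implication of this behavior is that with an identical argument N, rand(N) returns the same value each time, producing a repeatable sequence of column values.

Argument type: INTEGER

Return type: FLOAT

Example: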


os> source=people | eval `RAND(3)` = RAND(3) | fields `RAND(3)`
fetched rows / total rows = 1/1
+------------+
| RAND(3)    |
|------------|
| 0.73105735 |
+------------+
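The repeatability that comes from seeding can be demonstrated with Python's `random` module. This only illustrates the concept; Python's generator is different from the one PPL uses, so the actual numbers will not match the output above. The `rand` helper is a hypothetical name.

```python
import random

def rand(seed=None):
    """Return a float in [0.0, 1.0); a given seed always yields the same value."""
    generator = random.Random(seed)  # fresh generator, seeded if seed is given
    return generator.random()

print(rand(3) == rand(3))  # True: same seed, same value
print(0.0 <= rand() < 1.0) # True: unseeded values stay in range
```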

`ROUND`

Usage: ROUND(x, d) rounds the argument x to d decimal places. If d is not specified, it defaults to 0.

Argument type: INTEGER/LONG/FLOAT/DOUBLE

Return type map:

(INTEGER/LONG [,INTEGER]) -> LONG
(FLOAT/DOUBLE [,INTEGER]) -> LONG

Example:


os> source=people | eval `ROUND(12.34)` = ROUND(12.34), `ROUND(12.34, 1)` = ROUND(12.34, 1), `ROUND(12.34, -1)` = ROUND(12.34, -1), `ROUND(12, 1)` = ROUND(12, 1) | fields `ROUND(12.34)`, `ROUND(12.34, 1)`, `ROUND(12.34, -1)`, `ROUND(12, 1)`
fetched rows / total rows = 1/1
+----------------+-------------------+--------------------+----------------+
| ROUND(12.34)   | ROUND(12.34, 1)   | ROUND(12.34, -1)   | ROUND(12, 1)   |
|----------------+-------------------+--------------------+----------------|
| 12.0           | 12.3              | 10.0               | 12             |
+----------------+-------------------+--------------------+----------------+

`SIGN`

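Usage: SIGN returns the sign of the argument as -1, 0, or 1, depending on whether the number is negative, zero, or positive.

Argument type: INTEGER/LONG/FLOAT/DOUBLE

Return type: INTEGER

Example: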


os> source=people | eval `SIGN(1)` = SIGN(1), `SIGN(0)` = SIGN(0), `SIGN(-1.1)` = SIGN(-1.1) | fields `SIGN(1)`, `SIGN(0)`, `SIGN(-1.1)`
fetched rows / total rows = 1/1
+-----------+-----------+--------------+
| SIGN(1)   | SIGN(0)   | SIGN(-1.1)   |
|-----------+-----------+--------------|
| 1         | 0         | -1           |
+-----------+-----------+--------------+

`SIN`

Usage: SIN(x) calculates the sine of x, where x is given in radians.

Argument type: INTEGER/LONG/FLOAT/DOUBLE

Return type: DOUBLE

Example:


os> source=people | eval `SIN(0)` = SIN(0) | fields `SIN(0)`
fetched rows / total rows = 1/1
+----------+
| SIN(0)   |
|----------|
| 0.0      |
+----------+

`SQRT`

Usage: SQRT calculates the square root of a non-negative number.

Argument type: INTEGER/LONG/FLOAT/DOUBLE

Return type map:

(Non-negative) INTEGER/LONG/FLOAT/DOUBLE -> DOUBLE
(Negative) INTEGER/LONG/FLOAT/DOUBLE -> NULL

Example:


os> source=people | eval `SQRT(4)` = SQRT(4), `SQRT(4.41)` = SQRT(4.41) | fields `SQRT(4)`, `SQRT(4.41)`
fetched rows / total rows = 1/1
+-----------+--------------+
| SQRT(4)   | SQRT(4.41)   |
|-----------+--------------|
| 2.0       | 2.1          |
+-----------+--------------+

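PPL string functions

Note

To see which Amazon data source integrations support this PPL function, see Functions.

`CONCAT`

Usage: CONCAT(str1, str2, ...., str_9) concatenates up to 9 strings.

Argument type:

STRING, STRING, ...., STRING
Return type: STRING

Example: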


os> source=people | eval `CONCAT('hello', 'world')` = CONCAT('hello', 'world'), `CONCAT('hello ', 'whole ', 'world', '!')` = CONCAT('hello ', 'whole ', 'world', '!') | fields `CONCAT('hello', 'world')`, `CONCAT('hello ', 'whole ', 'world', '!')`
fetched rows / total rows = 1/1
+----------------------------+--------------------------------------------+
| CONCAT('hello', 'world')   | CONCAT('hello ', 'whole ', 'world', '!')   |
|----------------------------+--------------------------------------------|
| helloworld                 | hello whole world!                         |
+----------------------------+--------------------------------------------+

`CONCAT_WS`

Usage: CONCAT_WS(sep, str1, str2) concatenates two or more strings using the specified separator.

Argument type:

STRING, STRING, ...., STRING
Return type: STRING

Example:


os> source=people | eval `CONCAT_WS(',', 'hello', 'world')` = CONCAT_WS(',', 'hello', 'world') | fields `CONCAT_WS(',', 'hello', 'world')`
fetched rows / total rows = 1/1
+------------------------------------+
| CONCAT_WS(',', 'hello', 'world')   |
|------------------------------------|
| hello,world                        |
+------------------------------------+

`LENGTH`

Usage: LENGTH(str) returns the length of the input string, measured in bytes.

Argument type:

STRING
Return type: INTEGER

Example:


os> source=people | eval `LENGTH('helloworld')` = LENGTH('helloworld') | fields `LENGTH('helloworld')`
fetched rows / total rows = 1/1
+------------------------+
| LENGTH('helloworld')   |
|------------------------|
| 10                     |
+------------------------+

`LOWER`

Usage: LOWER(string) converts the input string to lowercase.

Argument type:

STRING
Return type: STRING

Example:


os> source=people | eval `LOWER('helloworld')` = LOWER('helloworld'), `LOWER('HELLOWORLD')` = LOWER('HELLOWORLD') | fields `LOWER('helloworld')`, `LOWER('HELLOWORLD')`
fetched rows / total rows = 1/1
+-----------------------+-----------------------+
| LOWER('helloworld')   | LOWER('HELLOWORLD')   |
|-----------------------+-----------------------|
| helloworld            | helloworld            |
+-----------------------+-----------------------+

`LTRIM`

Usage: LTRIM(str) removes leading space characters from the input string.

Argument type:

STRING
Return type: STRING

Example:


os> source=people | eval `LTRIM('   hello')` = LTRIM('   hello'), `LTRIM('hello   ')` = LTRIM('hello   ') | fields `LTRIM('   hello')`, `LTRIM('hello   ')`
fetched rows / total rows = 1/1
+---------------------+---------------------+
| LTRIM('   hello')   | LTRIM('hello   ')   |
|---------------------+---------------------|
| hello               | hello               |
+---------------------+---------------------+

`POSITION`

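Usage: POSITION(substr IN str) returns the position of the first occurrence of the substring in the string. It returns 0 if the substring is not found in the string, and NULL if any argument is NULL.

Argument type:

STRING, STRING
Return type: INTEGER

Example: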


os> source=people | eval `POSITION('world' IN 'helloworld')` = POSITION('world' IN 'helloworld'), `POSITION('invalid' IN 'helloworld')`= POSITION('invalid' IN 'helloworld')  | fields `POSITION('world' IN 'helloworld')`, `POSITION('invalid' IN 'helloworld')`
fetched rows / total rows = 1/1
+-------------------------------------+---------------------------------------+
| POSITION('world' IN 'helloworld')   | POSITION('invalid' IN 'helloworld')   |
|-------------------------------------+---------------------------------------|
| 6                                   | 0                                     |
+-------------------------------------+---------------------------------------+
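The 1-based result and the 0-for-not-found convention can be mimicked in Python, where `str.find` is 0-based and returns -1 when nothing matches. This sketch is an analogy only; `position` is a hypothetical helper and `None` stands in for NULL.

```python
def position(substr, string):
    """Return the 1-based index of substr in string, 0 if absent, None for NULL input."""
    if substr is None or string is None:
        return None
    return string.find(substr) + 1  # find() gives -1 when absent, so this yields 0

print(position("world", "helloworld"))    # 6
print(position("invalid", "helloworld"))  # 0
```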

`REVERSE`

Usage: REVERSE(str) returns the input string in reverse order.

Argument type:

STRING
Return type: STRING

Example:


os> source=people | eval `REVERSE('abcde')` = REVERSE('abcde') | fields `REVERSE('abcde')`
fetched rows / total rows = 1/1
+--------------------+
| REVERSE('abcde')   |
|--------------------|
| edcba              |
+--------------------+

`RIGHT`

Usage: RIGHT(str, len) returns the rightmost len characters of the input string. It returns NULL if any argument is NULL.

Argument type:

STRING, INTEGER
Return type: STRING

Example:


os> source=people | eval `RIGHT('helloworld', 5)` = RIGHT('helloworld', 5), `RIGHT('HELLOWORLD', 0)` = RIGHT('HELLOWORLD', 0) | fields `RIGHT('helloworld', 5)`, `RIGHT('HELLOWORLD', 0)`
fetched rows / total rows = 1/1
+--------------------------+--------------------------+
| RIGHT('helloworld', 5)   | RIGHT('HELLOWORLD', 0)   |
|--------------------------+--------------------------|
| world                    |                          |
+--------------------------+--------------------------+

`RTRIM`

Usage: RTRIM(str) removes trailing space characters from the input string.

Argument type:

STRING
Return type: STRING

Example:


os> source=people | eval `RTRIM('   hello')` = RTRIM('   hello'), `RTRIM('hello   ')` = RTRIM('hello   ') | fields `RTRIM('   hello')`, `RTRIM('hello   ')`
fetched rows / total rows = 1/1
+---------------------+---------------------+
| RTRIM('   hello')   | RTRIM('hello   ')   |
|---------------------+---------------------|
|    hello            | hello               |
+---------------------+---------------------+

`SUBSTRING`

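Usage: SUBSTRING(str, start) or SUBSTRING(str, start, length) returns a substring of the input string. If no length is specified, it returns the entire string from the start position.

Argument type:

STRING, INTEGER, INTEGER
Return type: STRING

Example: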


os> source=people | eval `SUBSTRING('helloworld', 5)` = SUBSTRING('helloworld', 5), `SUBSTRING('helloworld', 5, 3)` = SUBSTRING('helloworld', 5, 3) | fields `SUBSTRING('helloworld', 5)`, `SUBSTRING('helloworld', 5, 3)`
fetched rows / total rows = 1/1
+------------------------------+---------------------------------+
| SUBSTRING('helloworld', 5)   | SUBSTRING('helloworld', 5, 3)   |
|------------------------------+---------------------------------|
| oworld                       | owo                             |
+------------------------------+---------------------------------+

`TRIM`

Usage: trim(string) removes leading and trailing whitespace from the input string.

Argument type:

STRING
Return type: STRING

Example:


os> source=people | eval `TRIM('   hello')` = TRIM('   hello'), `TRIM('hello   ')` = TRIM('hello   ') | fields `TRIM('   hello')`, `TRIM('hello   ')`
fetched rows / total rows = 1/1
+--------------------+--------------------+
| TRIM('   hello')   | TRIM('hello   ')   |
|--------------------+--------------------|
| hello              | hello              |
+--------------------+--------------------+

`UPPER`

Usage: upper(string) converts the input string to uppercase.

Argument type:

STRING
Return type: STRING

Example:


os> source=people | eval `UPPER('helloworld')` = UPPER('helloworld'), `UPPER('HELLOWORLD')` = UPPER('HELLOWORLD') | fields `UPPER('helloworld')`, `UPPER('HELLOWORLD')`
fetched rows / total rows = 1/1
+-----------------------+-----------------------+
| UPPER('helloworld')   | UPPER('HELLOWORLD')   |
|-----------------------+-----------------------|
| HELLOWORLD            | HELLOWORLD            |
+-----------------------+-----------------------+
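
The string functions above can also be nested, in the same way that the CAST examples in the next section nest calls. The following query is an illustrative sketch and is not taken from the original reference. It reuses the `people` source from the examples above, the field names `cleaned` and `tail_reversed` are hypothetical, and its output has not been verified against a live data source.

```
os> source=people | eval `cleaned` = UPPER(TRIM('   hello   ')), `tail_reversed` = REVERSE(RIGHT('helloworld', 5)) | fields `cleaned`, `tail_reversed`
```

By the definitions above, `cleaned` should evaluate to HELLO and `tail_reversed` to dlrow.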

PPL type conversion functions

Note

To see which AWS data source integrations support this PPL function, see Functions.

`CAST`

Usage: cast(expr as dataType) casts expr to dataType and returns the value as dataType.

The following conversion rules apply:

Type conversion rules
Src/Target	STRING	NUMBER	BOOLEAN	TIMESTAMP	DATE	TIME
STRING		Note1	Note1	TIMESTAMP()	DATE()	TIME()
NUMBER	Note1		v!=0	N/A	N/A	N/A
BOOLEAN	Note1	v?1:0		N/A	N/A	N/A
TIMESTAMP	Note1	N/A	N/A		DATE()	TIME()
DATE	Note1	N/A	N/A	N/A		N/A
TIME	Note1	N/A	N/A	N/A	N/A

Cast to string example:


os> source=people | eval `cbool` = CAST(true as string), `cint` = CAST(1 as string), `cdate` = CAST(CAST('2012-08-07' as date) as string) | fields `cbool`, `cint`, `cdate`
fetched rows / total rows = 1/1
+---------+--------+------------+
| cbool   | cint   | cdate      |
|---------+--------+------------|
| true    | 1      | 2012-08-07 |
+---------+--------+------------+

Cast to number example:


os> source=people | eval `cbool` = CAST(true as int), `cstring` = CAST('1' as int) | fields `cbool`, `cstring`
fetched rows / total rows = 1/1
+---------+-----------+
| cbool   | cstring   |
|---------+-----------|
| 1       | 1         |
+---------+-----------+

Cast to date example:


os> source=people | eval `cdate` = CAST('2012-08-07' as date), `ctime` = CAST('01:01:01' as time), `ctimestamp` = CAST('2012-08-07 01:01:01' as timestamp) | fields `cdate`, `ctime`, `ctimestamp`
fetched rows / total rows = 1/1
+------------+----------+---------------------+
| cdate      | ctime    | ctimestamp          |
|------------+----------+---------------------|
| 2012-08-07 | 01:01:01 | 2012-08-07 01:01:01 |
+------------+----------+---------------------+

Chained cast example:


os> source=people | eval `cbool` = CAST(CAST(true as string) as boolean) | fields `cbool`
fetched rows / total rows = 1/1
+---------+
| cbool   |
|---------|
| True    |
+---------+
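
The examples above do not exercise every rule in the conversion table. The following query is an illustrative sketch, not part of the original reference, that targets the NUMBER to BOOLEAN (v!=0), BOOLEAN to NUMBER (v?1:0), and TIMESTAMP to DATE rules. The field names `nonzero`, `zero`, `cfalse`, and `tsdate` are hypothetical, and the output has not been verified against a live data source.

```
os> source=people | eval `nonzero` = CAST(2 as boolean), `zero` = CAST(0 as boolean), `cfalse` = CAST(false as int), `tsdate` = CAST(CAST('2012-08-07 01:01:01' as timestamp) as date) | fields `nonzero`, `zero`, `cfalse`, `tsdate`
```

If the rules in the table apply as written, `nonzero` should be true, `zero` should be false, `cfalse` should be 0, and `tsdate` should be 2012-08-07.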
