展平嵌套数组 - Amazon Athena
Amazon Web Services 文档中描述的 Amazon Web Services 服务或功能可能因区域而异。要查看适用于中国区域的差异,请参阅 中国的 Amazon Web Services 服务入门 (PDF)

本文属于机器翻译版本。若本译文内容与英语原文存在差异,则一律以英文原文为准。

展平嵌套数组

当使用嵌套数组时,您通常需要将嵌套数组元素展开到单个阵列中,或将元素展开到多个行中。

示例

要将嵌套数组的元素展平为单个值数组,请使用 flatten 函数。此查询为数组中的每个元素返回一行。

SELECT flatten(ARRAY[ ARRAY[1,2], ARRAY[3,4] ]) AS items

此查询返回:

+-----------+ | items | +-----------+ | [1,2,3,4] | +-----------+

要将数组展平为多个行,请将 CROSS JOINUNNEST 运算符结合使用,如以下示例所示:

WITH dataset AS ( SELECT 'engineering' as department, ARRAY['Sharon', 'John', 'Bob', 'Sally'] as users ) SELECT department, names FROM dataset CROSS JOIN UNNEST(users) as t(names)

此查询返回:

+----------------------+ | department | names | +----------------------+ | engineering | Sharon | +----------------------| | engineering | John | +----------------------| | engineering | Bob | +----------------------| | engineering | Sally | +----------------------+

展平一个键值对数组,将选定的键变换到列中,如以下示例所示:

WITH dataset AS ( SELECT 'engineering' as department, ARRAY[ MAP(ARRAY['first', 'last', 'age'],ARRAY['Bob', 'Smith', '40']), MAP(ARRAY['first', 'last', 'age'],ARRAY['Jane', 'Doe', '30']), MAP(ARRAY['first', 'last', 'age'],ARRAY['Billy', 'Smith', '8']) ] AS people ) SELECT names['first'] AS first_name, names['last'] AS last_name, department FROM dataset CROSS JOIN UNNEST(people) AS t(names)

此查询返回:

+--------------------------------------+ | first_name | last_name | department | +--------------------------------------+ | Bob | Smith | engineering | | Jane | Doe | engineering | | Billy | Smith | engineering | +--------------------------------------+

从员工列表中,选择具有最高组合分数的员工。可以在 FROM 子句中使用 UNNEST,而无需前面的 CROSS JOIN,因为它是默认的联接运算符,因此是隐含的。

WITH dataset AS ( SELECT ARRAY[ CAST(ROW('Sally', 'engineering', ARRAY[1,2,3,4]) AS ROW(name VARCHAR, department VARCHAR, scores ARRAY(INTEGER))), CAST(ROW('John', 'finance', ARRAY[7,8,9]) AS ROW(name VARCHAR, department VARCHAR, scores ARRAY(INTEGER))), CAST(ROW('Amy', 'devops', ARRAY[12,13,14,15]) AS ROW(name VARCHAR, department VARCHAR, scores ARRAY(INTEGER))) ] AS users ), users AS ( SELECT person, score FROM dataset, UNNEST(dataset.users) AS t(person), UNNEST(person.scores) AS t(score) ) SELECT person.name, person.department, SUM(score) AS total_score FROM users GROUP BY (person.name, person.department) ORDER BY (total_score) DESC LIMIT 1

此查询返回:

+---------------------------------+ | name | department | total_score | +---------------------------------+ | Amy | devops | 54 | +---------------------------------+

从员工列表中,选择具有最高个人分数的员工。

WITH dataset AS ( SELECT ARRAY[ CAST(ROW('Sally', 'engineering', ARRAY[1,2,3,4]) AS ROW(name VARCHAR, department VARCHAR, scores ARRAY(INTEGER))), CAST(ROW('John', 'finance', ARRAY[7,8,9]) AS ROW(name VARCHAR, department VARCHAR, scores ARRAY(INTEGER))), CAST(ROW('Amy', 'devops', ARRAY[12,13,14,15]) AS ROW(name VARCHAR, department VARCHAR, scores ARRAY(INTEGER))) ] AS users ), users AS ( SELECT person, score FROM dataset, UNNEST(dataset.users) AS t(person), UNNEST(person.scores) AS t(score) ) SELECT person.name, score FROM users ORDER BY (score) DESC LIMIT 1

此查询返回:

+--------------+ | name | score | +--------------+ | Amy | 15 | +--------------+

注意事项和限制

如果在查询中的一个或多个数组上使用 UNNEST,并且其中一个数组是 NULL,则查询不返回任何行。如果在空字符串数组上使用 UNNEST,则返回空字符串。

例如,在以下查询中,由于第二个数组为空值,因此查询不返回任何行。

SELECT col1, col2 FROM UNNEST (ARRAY ['apples','oranges','lemons']) AS t(col1) CROSS JOIN UNNEST (ARRAY []) AS t(col2)

在下一个示例中,第二个数组被修改为包含一个空字符串。对于每一行,查询都会返回 col1 中的值,并为 col2 中的值返回空字符串。要返回第一个数组中的值,需要第二个数组中的空字符串。

SELECT col1, col2 FROM UNNEST (ARRAY ['apples','oranges','lemons']) AS t(col1) CROSS JOIN UNNEST (ARRAY ['']) AS t(col2)