Search for values in JSON arrays
To determine if a specific value exists inside a JSON-encoded array, use the json_array_contains
function.
The following query lists the names of the users who are participating in "project2".
WITH dataset AS ( SELECT * FROM (VALUES (JSON '{"name": "Bob Smith", "org": "legal", "projects": ["project1"]}'), (JSON '{"name": "Susan Smith", "org": "engineering", "projects": ["project1", "project2", "project3"]}'), (JSON '{"name": "Jane Smith", "org": "finance", "projects": ["project1", "project2"]}') ) AS t (users) ) SELECT json_extract_scalar(users, '$.name') AS user FROM dataset WHERE json_array_contains(json_extract(users, '$.projects'), 'project2')
This query returns a list of users.
+-------------+
| user |
+-------------+
| Susan Smith |
+-------------+
| Jane Smith |
+-------------+
The following query example lists the names of users who have completed projects along with the total number of completed projects. It performs these actions:
-
Uses nested
SELECT
statements for clarity. -
Extracts the array of projects.
-
Converts the array to a native array of key-value pairs using
CAST
. -
Extracts each individual array element using the
UNNEST
operator. -
Filters obtained values by completed projects and counts them.
Note
When using CAST
to MAP
you can specify the key element as VARCHAR
(native String in Presto), but leave the value as JSON, because the values in the MAP
are of different types: String for the first key-value pair, and Boolean for the second.
WITH dataset AS ( SELECT * FROM (VALUES (JSON '{"name": "Bob Smith", "org": "legal", "projects": [{"name":"project1", "completed":false}]}'), (JSON '{"name": "Susan Smith", "org": "engineering", "projects": [{"name":"project2", "completed":true}, {"name":"project3", "completed":true}]}'), (JSON '{"name": "Jane Smith", "org": "finance", "projects": [{"name":"project2", "completed":true}]}') ) AS t (users) ), employees AS ( SELECT users, CAST(json_extract(users, '$.projects') AS ARRAY(MAP(VARCHAR, JSON))) AS projects_array FROM dataset ), names AS ( SELECT json_extract_scalar(users, '$.name') AS name, projects FROM employees, UNNEST (projects_array) AS t(projects) ) SELECT name, count(projects) AS completed_projects FROM names WHERE cast(element_at(projects, 'completed') AS BOOLEAN) = true GROUP BY name
This query returns the following result:
+----------------------------------+
| name | completed_projects |
+----------------------------------+
| Susan Smith | 2 |
+----------------------------------+
| Jane Smith | 1 |
+----------------------------------+