Creating Amazon Redshift scalar columns out of shredded data - Amazon Redshift
Services or capabilities described in Amazon Web Services documentation might vary by Region. To see the differences applicable to the China Regions, see Getting Started with Amazon Web Services in China (PDF).

Creating Amazon Redshift scalar columns out of shredded data

Schemaless data stored in SUPER can affect the performance of Amazon Redshift. For instance, filter predicates or join conditions as range-restricted scans can't effectively use zone maps. Users and BI tools can use materialized views as the conventional presentation of the data and increase performance of analytical queries.

The following query scans the materialized view super_mv and filters on o_orderstatus.

SELECT c.c_name, v.o_totalprice FROM customer_orders_lineitem c JOIN super_mv v ON c.c_custkey = v.c_custkey WHERE v.o_orderstatus = 'F';

Inspect stl_scan to verify that Amazon Redshift can't effectively use zone maps on the range-restricted scan over o_orderstatus.

SELECT slice, is_rrscan FROM stl_scan WHERE query = pg_last_query_id() AND perm_table_name LIKE '%super_mv%'; slice | is_rrscan -------+----------- 0 | f 1 | f 5 | f 4 | f 2 | f 3 | f (6 rows)

The following example adapts the materialized view super_mv to create scalar columns out of the shredded data. In this case, Amazon Redshift casts o_orderstatus from SUPER to VARCHAR. In addition, specify o_orderstatus as the sort key for super_mv.

CREATE MATERIALIZED VIEW super_mv distkey(c_custkey) sortkey(c_custkey, o_orderstatus) AS ( SELECT c_custkey, o.o_orderstatus::VARCHAR AS o_orderstatus, o.o_totalprice, o_idx FROM customer_orders_lineitem c, c.c_orders o AT o_idx );

After re-running the query, verify that Amazon Redshift can now use zone maps.

SELECT v.o_totalprice FROM super_mv v WHERE v.o_orderstatus = 'F';

You can verify that the range-restricted scan now uses zone maps as follows.

SELECT slice, is_rrscan FROM stl_scan WHERE query = pg_last_query_id() AND perm_table_name LIKE '%super_mv%'; slice | is_rrscan -------+----------- 0 | t 1 | t 2 | t 3 | t 4 | t 5 | t (6 rows)