CREATE MATERIALIZED VIEW - Amazon Redshift
Services or capabilities described in Amazon Web Services documentation might vary by Region. To see the differences applicable to the China Regions, see Getting Started with Amazon Web Services in China (PDF).

CREATE MATERIALIZED VIEW

Creates a materialized view based on one or more Amazon Redshift tables. You can also base materialized views on external tables created using Spectrum or federated query. For information about Spectrum, see Querying external data using Amazon Redshift Spectrum. For information about federated query, see Querying data with federated queries in Amazon Redshift.

Syntax

CREATE MATERIALIZED VIEW mv_name
[ BACKUP { YES | NO } ]
[ table_attributes ]
[ AUTO REFRESH { YES | NO } ]
AS query 

Parameters

BACKUP

A clause that specifies whether the materialized view is included in automated and manual cluster snapshots, which are stored in Amazon S3.

The default value for BACKUP is YES.

You can specify BACKUP NO to save processing time when creating snapshots and restoring from snapshots, and to reduce the amount of storage required in Amazon S3.

Note

The BACKUP NO setting has no effect on automatic replication of data to other nodes within the cluster, so tables with BACKUP NO specified are restored in a node failure.

table_attributes

A clause that specifies how the data in the materialized view is distributed, including the following:

  • The distribution style for the materialized view, in the format DISTSTYLE { EVEN | ALL | KEY }. If you omit this clause, the distribution style is EVEN. For more information, see Distribution styles.

  • The distribution key for the materialized view, in the format DISTKEY ( distkey_identifier ). For more information, see Designating distribution styles.

  • The sort key for the materialized view, in the format SORTKEY ( column_name [, ...] ). For more information, see Working with sort keys.

AS query

A valid SELECT statement that defines the materialized view and its content. The result set from the query defines the columns and rows of the materialized view. For information about limitations when creating materialized views, see Limitations.

Furthermore, specific SQL language constructs used in the query determines whether the materialized view can be incrementally or fully refreshed. For information about the refresh method, see REFRESH MATERIALIZED VIEW. For information about the limitations for incremental refresh, see Limitations for incremental refresh.

If the query contains an SQL command that doesn't support incremental refresh, Amazon Redshift displays a message indicating that the materialized view will use a full refresh. The message may or may not be displayed, depending on the SQL client application. Check the state column of the STV_MV_INFO to see the refresh type used by a materialized view.

AUTO REFRESH

A clause that defines whether the materialized view should be automatically refreshed with latest changes from its base tables. The default value is NO. For more information, see Refreshing a materialized view.

Usage notes

To create a materialized view, you must have the following privileges:

  • CREATE privileges for a schema.

  • Table-level or column-level SELECT privilege on the base tables to create a materialized view. If you have column-level privileges on specific columns, you can create a materialized view on only those columns.

Incremental refresh for materialized views in a datashare

Amazon Redshift supports automatic and incremental refresh for materialized views in a consumer datashare when the base tables are shared. Incremental refresh is an operation where Amazon Redshift identifies changes in the base table or tables that happened after the previous refresh and updates only the corresponding records in the materialized view. This runs more quickly than a full refresh and improves workload performance. You don't have to change your materialized-view definition to take advantage of incremental refresh.

There are a couple limitations to note for taking advantage of incremental refresh with a materialized view:

  • The materialized view must reference only one database, either local or remote.

  • Incremental refresh is available only on new materialized views. Therefore, you must drop existing materialized views and recreate them for incremental refresh to occur.

For more information about creating materialized views in a datashare, see Working with views in Amazon Redshift data sharing, which contains several query examples.

DDL updates to materialized views or base tables

When using materialized views in Amazon Redshift, follow these usage notes for data definition language (DDL) updates to materialized views or base tables.

  • You can add columns to a base table without affecting any materialized views that reference the base table.

  • Some operations can leave the materialized view in a state that can't be refreshed at all. Examples are operations such as renaming or dropping a column, changing the type of a column, and changing the name of a schema. Such materialized views can be queried but can't be refreshed. In this case, you must drop and recreate the materialized view.

  • In general, you can't alter a materialized view's definition (its SQL statement).

  • You can't rename a materialized view.

Limitations

You can't define a materialized view that references or includes any of the following:

  • Standard views, or system tables and views.

  • Temporary tables.

  • User-defined functions.

  • The ORDER BY, LIMIT, or OFFSET clause.

  • late-binding references to base tables. In other words, any base tables or related columns referenced in the defining SQL query of the materialized view must exist and must be valid.

  • Leader node-only functions: CURRENT_SCHEMA, CURRENT_SCHEMAS, HAS_DATABASE_PRIVILEGE, HAS_SCHEMA_PRIVILEGE, HAS_TABLE_PRIVILEGE.

  • You can't use the AUTO REFRESH YES option when the materialized view definition includes mutable functions or external schemas. You also can't use it when you define a materialized view on another materialized view.

  • You don't have to manually run ANALYZE on materialized views. This happens currently only via AUTO ANALYZE. For more information, see Analyzing tables.

Examples

The following example creates a materialized view from three base tables that are joined and aggregated. Each row represents a category with the number of tickets sold. When you query the tickets_mv materialized view, you directly access the precomputed data in the tickets_mv materialized view.

CREATE MATERIALIZED VIEW tickets_mv AS select catgroup, sum(qtysold) as sold from category c, event e, sales s where c.catid = e.catid and e.eventid = s.eventid group by catgroup;

The following example creates a materialized view similar to the previous example and uses the aggregate function MAX().

CREATE MATERIALIZED VIEW tickets_mv_max AS select catgroup, max(qtysold) as sold from category c, event e, sales s where c.catid = e.catid and e.eventid = s.eventid group by catgroup; SELECT name, state FROM STV_MV_INFO;

The following example uses a UNION ALL clause to join the Amazon Redshift public_sales table and the Redshift Spectrum spectrum.sales table to create a material view mv_sales_vw. For information about the CREATE EXTERNAL TABLE command for Amazon Redshift Spectrum, see CREATE EXTERNAL TABLE. The Redshift Spectrum external table references the data on Amazon S3.

CREATE MATERIALIZED VIEW mv_sales_vw as select salesid, qtysold, pricepaid, commission, saletime from public.sales union all select salesid, qtysold, pricepaid, commission, saletime from spectrum.sales

The following example creates a materialized view mv_fq based on a federated query external table. For information about federated query, see CREATE EXTERNAL SCHEMA.

CREATE MATERIALIZED VIEW mv_fq as select firstname, lastname from apg.mv_fq_example; select firstname, lastname from mv_fq; firstname | lastname -----------+---------- John | Day Jane | Doe (2 rows)

The following example shows the definition of a materialized view.

SELECT pg_catalog.pg_get_viewdef('mv_sales_vw'::regclass::oid, true); pg_get_viewdef --------------------------------------------------- create materialized view mv_sales_vw as select a from t;

The following sample shows how to set AUTO REFRESH in the materialized view definition and also specifies a DISTSTYLE. First, create a simple base table.

CREATE TABLE baseball_table (ball int, bat int);

Then, create a materialized view.

CREATE MATERIALIZED VIEW mv_baseball DISTSTYLE ALL AUTO REFRESH YES AS SELECT ball AS baseball FROM baseball_table;

Now you can query the mv_baseball materialized view. To check if AUTO REFRESH is turned on for a materialized view, see STV_MV_INFO.

The following sample creates a materialized view that references a source table in another database. It assumes that the database containing the source table, database_A, is in the same cluster or workgroup as your materialized view, which you create in database_B. (You can substitute your own databases for the sample.) First, create a table in database_A called cities, with a cityname column. Make the column's data type a VARCHAR. After you create the source table, run the following command in database_B to create a materialized view whose source is your cities table. Make sure to specify the source table's database and schema in the FROM clause:

CREATE MATERIALIZED VIEW cities_mv AS SELECT  cityname FROM    database_A.public.cities;

Query the materialized view you created. The query retrieves records whose original source is the cities table in database_A:

select * from cities_mv;

When you run the SELECT statement, cities_mv returns the records. Records are refreshed from the source table only when a REFRESH statement is run. Also, note that you can't update records directly in the materialized view. For information about refreshing the data in a materialized view, see REFRESH MATERIALIZED VIEW.

For details about materialized view overview and SQL commands used to refresh and drop materialized views, see the following topics: