Creating views in the Amazon Glue Data Catalog (preview) - Amazon Redshift
Services or capabilities described in Amazon Web Services documentation might vary by Region. To see the differences applicable to the China Regions, see Getting Started with Amazon Web Services in China (PDF).

Creating views in the Amazon Glue Data Catalog (preview)

This is prerelease documentation views in Data Catalog for Amazon Redshift, which is in preview release. The documentation and the feature are both subject to change. We recommend that you use this feature only with test clusters, and not in production environments. For preview terms and conditions, see Beta and Previews in Amazon Service Terms.

You can create an Amazon Redshift cluster in Preview to test new features of Amazon Redshift. You can't use those features in production or move your Preview cluster to a production cluster or a cluster on another track. For preview terms and conditions, see Beta and Previews in Amazon Service Terms.

To create a cluster in Preview
  1. Sign in to the Amazon Web Services Management Console and open the Amazon Redshift console at https://console.amazonaws.cn/redshiftv2/.

  2. On the navigation menu, choose Provisioned clusters dashboard, and choose Clusters. The clusters for your account in the current Amazon Web Services Region are listed. A subset of properties of each cluster is displayed in columns in the list.

  3. A banner displays on the Clusters list page that introduces preview. Choose the button Create preview cluster to open the create cluster page.

  4. Enter properties for your cluster. Choose the Preview track that contains the features you want to test. We recommend entering a name for the cluster that indicates that it is on a preview track. Choose options for your cluster, including options labeled as -preview, for the features you want to test. For general information about creating clusters, see Creating a cluster in the Amazon Redshift Management Guide.

  5. Choose Create cluster to create a cluster in preview.

    Note

    The preview_2023 track is the most recent preview track available. This track supports creating clusters with RA3 node types only. Node type DC2 and any older node type is not supported.

  6. When your preview cluster is available, use your SQL client to load and query data.

The preview feature Data Catalog views is available only in the following Regions.

  • US East (Ohio) (us-east-2)

  • US East (N. Virginia) (us-east-1)

  • US West (N. California) (us-west-1)

  • Asia Pacific (Tokyo) (ap-northeast-1)

  • Europe (Ireland) (eu-west-1)

  • Europe (Stockholm) (eu-north-1)

You can also create a preview workgroup to test Data Catalog views. You can't use those features in production or move your workgroup to another workgroup. For preview terms and conditions, see Beta and Previews in Amazon Service Terms. For instructions on how to create a preview workgroup, see Creating a preview workgroup.

By creating views in the Amazon Glue Data Catalog, you can create a single common view schema and metadata object to use across engines such as Amazon Athena and Amazon EMR Spark. Doing so lets you use the same views across your data lakes and data warehouses to fit your use cases. Views in the Data Catalog are special in that they are categorized as definer views, where access permissions are defined by the user who created the view instead of the user querying the view. The following are some use cases and benefits of creating views in the Data Catalog:

  • Create a view that restricts data access based on the permissions the user needs. For example, you can use views in the Data Catalog to prevent employees who don’t work in the HR department from seeing personally identifiable information (PII).

  • Make sure that users can’t access incomplete records. By applying certain filters onto your view in the Data Catalog, you make sure that data records inside a view in the Data Catalog are always complete.

  • Data Catalog views have an included security benefit of making sure that the query definition used to create the view must complete to create the view. This security benefit means that views in the Data Catalog are not susceptible to SQL commands from malicious players.

  • Views in the Data Catalog support the same advantages as normal views, such as letting users access a view without making the underlying table available to users.

To create a view in the Data Catalog, you must have a Spectrum external table, an object that’s contained within a Lake Formation-managed datashare, or an Apache Iceberg table.

Definitions of Data Catalog views are stored in the Amazon Glue Data Catalog. Use Amazon Lake Formation to grant access through resource grants, column grants, or tag-based access controls. For more information about granting and revoking access in Lake Formation, see Granting and revoking permissions on Data Catalog resources.

Prerequisites

Before you can create a view in the Data Catalog, make sure that you have the following prerequisites completed:

  • Make sure that your IAM role has the following trust policy.

    { "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Principal": { "Service": [ "glue.amazonaws.com", "lakeformation.amazonaws.com" ] }, "Action": "sts:AssumeRole" } ] }
  • You also need the following pass role policy.

    { "Version": "2012-10-17", "Statement": [ { "Sid": "Stmt1", "Action": [ "iam:PassRole" ], "Effect": "Allow", "Resource": "*", "Condition": { "StringEquals": { "iam:PassedToService": [ "glue.amazonaws.com", "lakeformation.amazonaws.com" ] } } } ] }
  • Finally, you also need the following permissions.

    • Glue:GetDatabase

    • Glue:GetDatabases

    • Glue:CreateTable

    • Glue:GetTable

    • Glue:UpdateTable

    • Glue:DeleteTable

    • Glue:GetTables

    • Glue:SearchTables

    • Glue:BatchGetPartition

    • Glue:GetPartitions

    • Glue:GetPartition

    • Glue:GetTableVersion

    • Glue:GetTableVersions

End-to-end example

Start by creating an external schema based on your Data Catalog database.

CREATE EXTERNAL SCHEMA IF NOT EXISTS external_schema FROM DATA CATALOG DATABASE 'external_data_catalog_db' IAM_ROLE 'arn:aws:iam::123456789012:role/sample-role';

You can now create a Data Catalog view.

CREATE EXTERNAL PROTECTED VIEW external_schema.remote_view AS SELECT * FROM external_schema.remote_table;

You can then start querying your view.

SELECT * FROM external_schema.remote_view;

For more information about the SQL commands related to views in the Data Catalog, see CREATE EXTERNAL VIEW, ALTER EXTERNAL VIEW, and DROP EXTERNAL VIEW.

Considerations and limitations

The following are considerations and limitations that apply to views created in the Data Catalog.

  • You can’t create a Data Catalog view that is based off of another view.

  • You can only have 10 base tables in a Data Catalog view.

  • The definer of the view must have full SELECT GRANTABLE permissions on the base tables.

  • Views can only contain Lake Formation objects and built-ins. The following objects are not permitted inside of a view.

    • System tables

    • User-defined functions (UDFs)

    • Redshift tables, views, materialized views, and late binding views that aren’t in a Lake Formation managed data share.

  • Views can’t contain nested Redshift Spectrum tables.

  • You can only query views by using two-dot notation. Querying Lake Formationviews from an externally mounted database is not supported.

  • The ARN of a Lake Formation table referenced in a Redshift view must be fewer than 127 characters long.

  • Amazon Glue representations of the base objects of a view must be in the same Amazon Web Services account and Region as the view.