What is a datashare? - Amazon Redshift

What is a datashare?

A datashare is the unit of sharing data in Amazon Redshift. Use datashares to share data within the same Amazon Web Services account or across different Amazon Web Services accounts. You can also share data for read purposes across different Amazon Redshift clusters.

Each datashare is associated with a specific database in your Amazon Redshift cluster.

A producer cluster administrator can create datashares and add datashare objects to them to share data with other clusters. These datashares are referred to as outbound shares. A consumer cluster administrator can receive datashares from other clusters. These datashares are referred to as inbound shares. For details on producers and consumers, see Datashare producers and consumers.

Datashare objects are objects from specific databases on a cluster that producer cluster administrators can add to datashares to be shared with data consumers. Datashare objects are read-only for data consumers. Examples of datashare objects are tables, views, and user-defined functions. You can add datashare objects to a datashare when you create it, or at any later time by editing it.

Data sharing continues to work when clusters are resized or when the producer cluster is paused.

There are different types of datashares.

Standard datashares

With standard datashares, you can share data across provisioned clusters, serverless workgroups, Availability Zones, Amazon Web Services accounts, and Amazon Web Services Regions. You can share between cluster types as well as between provisioned clusters and Amazon Redshift Serverless.

To share data, note the following provisioned cluster, serverless namespace, and Amazon Web Services account identifiers:

  • Provisioned cluster namespaces identify Amazon Redshift provisioned clusters. A namespace globally unique identifier (GUID) is automatically created when a provisioned cluster is created and is attached to the cluster. A namespace Amazon Resource Name (ARN) is in the arn:{partition}:redshift:{region}:{account-id}:namespace:{namespace-guid} format. You can see the namespace of a provisioned cluster on the cluster details page on the Amazon Redshift console.

    In the data sharing workflow, the namespace GUID value and the cluster namespace ARN are used to share data with clusters in the Amazon Web Services account. You can also find the namespace for the current cluster by using the current_namespace function.

  • Serverless namespaces identify Amazon Redshift Serverless instances. A namespace globally unique identifier (GUID) is automatically created when an Amazon Redshift Serverless instance is created and is attached to the instance. A serverless namespace ARN is in the arn:{partition}:redshift-serverless:{region}:{account-id}:namespace/{namespace-guid} format.

  • Amazon Web Services accounts can be consumers for datashares and are each represented by a 12-digit Amazon Web Services account ID.
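
The two ARN formats above differ in the service name and in the separator before the GUID (":" for provisioned clusters, "/" for serverless). The following Python sketch assembles either form and checks the account ID length. It is an illustration only: the function name namespace_arn is hypothetical, the sample partition, Region, account ID, and GUID are placeholders, and the 8-4-4-4-12 hexadecimal GUID shape is an assumption that this page does not state.

```python
import re

# Assumption: namespace GUIDs use the common 8-4-4-4-12 hexadecimal layout.
GUID_RE = re.compile(r"[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}")
# An Amazon Web Services account ID is exactly 12 decimal digits.
ACCOUNT_RE = re.compile(r"[0-9]{12}")


def namespace_arn(partition, region, account_id, namespace_guid, serverless=False):
    """Build a namespace ARN in one of the two formats quoted above.

    Provisioned: arn:{partition}:redshift:{region}:{account-id}:namespace:{namespace-guid}
    Serverless:  arn:{partition}:redshift-serverless:{region}:{account-id}:namespace/{namespace-guid}
    """
    if not ACCOUNT_RE.fullmatch(account_id):
        raise ValueError("account ID must be exactly 12 digits")
    if not GUID_RE.fullmatch(namespace_guid):
        raise ValueError("namespace GUID is not in 8-4-4-4-12 hexadecimal form")
    if serverless:
        # Serverless ARNs separate the GUID with a slash.
        return (f"arn:{partition}:redshift-serverless:{region}:"
                f"{account_id}:namespace/{namespace_guid}")
    # Provisioned cluster ARNs separate the GUID with a colon.
    return f"arn:{partition}:redshift:{region}:{account_id}:namespace:{namespace_guid}"


if __name__ == "__main__":
    # Placeholder values for illustration only.
    guid = "a1b2c3d4-5678-90ab-cdef-1234567890ab"
    print(namespace_arn("aws-cn", "cn-north-1", "123456789012", guid))
    print(namespace_arn("aws-cn", "cn-north-1", "123456789012", guid, serverless=True))
```

On a running cluster, the GUID itself can be read with the current_namespace function mentioned above (for example, SELECT current_namespace;) rather than typed by hand.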

For standard datashares, consider the following:

  • When a producer cluster is deleted, Amazon Redshift deletes the datashares created by the producer cluster. When a producer cluster is backed up and restored, the datashares it created persist on the restored cluster. However, datashare permissions granted to other clusters are no longer valid on the restored cluster, so re-grant usage permissions on the datashares to the desired consumer clusters. The consumer database on the consumer cluster still points to the datashare from the original cluster from which the snapshot was taken. To query the shared data from the restored cluster, the consumer cluster administrator either creates a different database or drops and recreates an existing consumer database so that it uses the datashare from the newly restored cluster.

  • When a consumer cluster is deleted and restored from a snapshot, the access previously granted to this cluster is no longer valid or visible. If access to datashares is still required on the restored consumer cluster, the producer cluster administrator must grant usage of the datashares to the restored consumer cluster again. The consumer cluster administrator must drop any stale consumer databases created from the inactive datashares, and then recreate the consumer database from the datashare after the producer re-grants the permissions. Because the namespace GUID of a restored cluster differs from that of the original cluster, re-grant datashare permissions whenever the consumer or producer cluster is restored from backup.
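
The recovery sequence for a restored consumer cluster can be sketched as an ordered list of SQL statements, each labeled with the administrator who would run it. This is a minimal illustration, not a prescribed procedure: the function name restored_consumer_steps and all sample names and GUIDs are hypothetical, and the values are interpolated without validation, so only trusted input should be passed.

```python
def restored_consumer_steps(datashare, consumer_db, producer_guid, restored_consumer_guid):
    """Return (actor, SQL) pairs in an order consistent with the text above.

    The restored consumer cluster has a new namespace GUID, so the producer
    grants usage to that new GUID before the consumer database is recreated.
    """
    return [
        # 1. Producer re-grants usage to the restored cluster's new namespace.
        ("producer",
         f"GRANT USAGE ON DATASHARE {datashare} TO NAMESPACE '{restored_consumer_guid}';"),
        # 2. Consumer drops the stale database that pointed at the inactive share.
        ("consumer", f"DROP DATABASE {consumer_db};"),
        # 3. Consumer recreates the database from the producer's datashare.
        ("consumer",
         f"CREATE DATABASE {consumer_db} FROM DATASHARE {datashare} "
         f"OF NAMESPACE '{producer_guid}';"),
    ]


if __name__ == "__main__":
    # Placeholder names and GUIDs for illustration only.
    steps = restored_consumer_steps(
        "salesshare",
        "sales_db",
        "11111111-2222-3333-4444-555555555555",
        "66666666-7777-8888-9999-000000000000",
    )
    for actor, sql in steps:
        print(f"[{actor}] {sql}")
```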

Datashare producers and consumers

Data producers (also known as data sharing producers or datashare producers) are clusters that you want to share data from. Producer cluster administrators and database owners can create datashares using the CREATE DATASHARE command. You can add objects such as schemas, tables, views, and SQL user-defined functions (UDFs) from a database that you want the producer cluster to share with consumer clusters.
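
As a rough illustration of the producer side, the Python sketch below assembles the SQL that a producer administrator might run: CREATE DATASHARE, ALTER DATASHARE ... ADD for a schema and its tables, and GRANT USAGE ON DATASHARE to a consumer namespace. The function names, the sample object names, and the GUID are hypothetical. Restricting identifiers to simple lowercase names is a simplification made here so that the statements can be emitted without quoting.

```python
import re

# Simplification for this sketch: accept only plain lowercase identifiers.
IDENT_RE = re.compile(r"[a-z_][a-z0-9_]*")


def check_ident(name):
    """Return the name unchanged if it is a simple identifier, else raise."""
    if not IDENT_RE.fullmatch(name):
        raise ValueError(f"unsupported identifier: {name!r}")
    return name


def producer_statements(datashare, schema, tables, consumer_namespace_guid):
    """Return, in order, SQL statements that create and populate a datashare."""
    share = check_ident(datashare)
    sch = check_ident(schema)
    stmts = [
        f"CREATE DATASHARE {share};",
        # Add the schema to the datashare along with the tables it contains.
        f"ALTER DATASHARE {share} ADD SCHEMA {sch};",
    ]
    for table in tables:
        stmts.append(f"ALTER DATASHARE {share} ADD TABLE {sch}.{check_ident(table)};")
    # Grant the consumer namespace usage on the datashare.
    stmts.append(
        f"GRANT USAGE ON DATASHARE {share} TO NAMESPACE '{consumer_namespace_guid}';"
    )
    return stmts


if __name__ == "__main__":
    # Placeholder names and GUID for illustration only.
    for stmt in producer_statements(
        "salesshare", "public", ["orders", "customers"],
        "66666666-7777-8888-9999-000000000000",
    ):
        print(stmt)
```

Generating the statements as plain strings keeps the sketch independent of any particular database driver. The resulting SQL can be pasted into a query editor or passed to whichever client is already in use.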

Data producers (also known as providers on Amazon Web Services Data Exchange) for Amazon Web Services Data Exchange datashares can license data through Amazon Web Services Data Exchange. Approved providers can add Amazon Web Services Data Exchange datashares to Amazon Web Services Data Exchange products.

When a customer subscribes to a product with Amazon Web Services Data Exchange datashares, Amazon Web Services Data Exchange automatically adds the customer as a data consumer on all Amazon Web Services Data Exchange datashares included with the product. It removes customers from Amazon Web Services Data Exchange datashares when their subscription ends. It also automatically manages billing, invoicing, payment collection, and payment distribution for paid products with Amazon Web Services Data Exchange datashares. For more information, see Amazon Web Services Data Exchange datashares. To register as an Amazon Web Services Data Exchange data provider, see Getting started as a provider.

Data consumers (also known as data sharing consumers or datashare consumers) are clusters that receive datashares from producer clusters.

Amazon Redshift clusters that share data can be in the same or different Amazon Web Services accounts or different Amazon Web Services Regions, so you can share data across organizations and collaborate with other parties. Consumer cluster administrators receive the datashares that they are granted usage for and review the contents of each datashare. To consume shared data, the consumer cluster administrator creates an Amazon Redshift database from the datashare. The administrator then assigns permissions for the database to users and roles in the consumer cluster. After permissions are granted, users and roles can list the shared objects as part of the standard metadata queries, along with the local data on the consumer cluster. They can start querying immediately.
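
The consumer workflow described above (review the datashare, create a database from it, then grant access to users and roles) might look like the following sketch, which again only assembles SQL text. The function name consumer_statements and every sample name, account ID, and GUID are hypothetical, and the values are interpolated without validation, so only trusted input should be passed.

```python
def consumer_statements(datashare, producer_guid, local_db, grantees, producer_account=None):
    """Return, in order, SQL a consumer administrator might run.

    grantees holds strings such as "bob" (a user) or "ROLE analyst" (a role).
    producer_account is set only when the producer is in another account.
    """
    source = f"NAMESPACE '{producer_guid}'"
    if producer_account is not None:
        # Cross-account shares are identified by account ID plus namespace.
        source = f"ACCOUNT '{producer_account}' {source}"
    stmts = [
        # List the datashares visible to this cluster, then inspect one of them.
        "SHOW DATASHARES;",
        f"DESC DATASHARE {datashare} OF {source};",
        # Create a local database that points at the shared objects.
        f"CREATE DATABASE {local_db} FROM DATASHARE {datashare} OF {source};",
    ]
    # Give users and roles access to the new database.
    stmts += [f"GRANT USAGE ON DATABASE {local_db} TO {g};" for g in grantees]
    return stmts


if __name__ == "__main__":
    # Placeholder values for illustration only.
    for stmt in consumer_statements(
        "salesshare",
        "11111111-2222-3333-4444-555555555555",
        "sales_db",
        ["bob", "ROLE analyst"],
        producer_account="123456789012",
    ):
        print(stmt)
```

After the grants are in place, users can query shared objects with three-part notation, for example SELECT * FROM sales_db.public.orders; where orders stands in for a shared table.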

If you are a consumer with an active Amazon Web Services Data Exchange subscription (also known as a subscriber on Amazon Web Services Data Exchange), you can find, subscribe to, and query granular, up-to-date data in Amazon Redshift without the need to extract, transform, and load the data. For more information, see Amazon Web Services Data Exchange datashares.