Overview of data sharing in Amazon Redshift

With data sharing, you can securely and easily share live data across Amazon Redshift clusters for read purposes.

For information about how to get started working with data sharing and manage datashares using the Amazon Web Services Management Console, see Managing data sharing tasks.

Data sharing use cases for Amazon Redshift

Amazon Redshift data sharing is especially useful for these use cases:

  • Supporting different kinds of business-critical workloads – Use a central extract, transform, and load (ETL) cluster that shares data with multiple business intelligence (BI) or analytic clusters. This approach provides read workload isolation and chargeback for individual workloads. You can size and scale your individual workload compute according to the workload-specific requirements of price and performance.

  • Enabling cross-group collaboration – Enable seamless collaboration across teams and business groups for broader analytics, data science, and cross-product impact analysis.

  • Delivering data as a service – Share data as a service across your organization.

  • Sharing data between environments – Share data among development, test, and production environments. You can improve team agility by sharing data at different levels of granularity.

  • Licensing access to data in Amazon Redshift – List Amazon Redshift data sets in the Amazon Web Services Data Exchange catalog that customers can find, subscribe to, and query in minutes.

Sharing data at different levels in Amazon Redshift

With Amazon Redshift, you can share data at different levels: databases, schemas, tables, views (including regular, late-binding, and materialized views), and SQL user-defined functions (UDFs). You can create multiple datashares for a given database. A datashare can contain objects from multiple schemas in the database in which the datashare is created.

This flexibility gives you fine-grained access control, which you can tailor for the different users and businesses that need access to Amazon Redshift data.
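
As a rough illustration of a datashare that spans multiple schemas, the following Python sketch generates the producer-side SQL statements that define a datashare and grant it to one consumer. The helper name build_datashare_sql, the share, schema, and table names, and the namespace value are all hypothetical placeholders. The sketch only builds the statement text and does not connect to a cluster.

```python
def build_datashare_sql(share, objects, consumer_namespace):
    """Return producer-side SQL statements for one datashare.

    `objects` maps a schema name to the tables to share from it.
    All names are illustrative; identifiers are not quoted or validated.
    """
    statements = [f"CREATE DATASHARE {share};"]
    for schema, tables in objects.items():
        # Add the schema first, then the individual tables inside it.
        statements.append(f"ALTER DATASHARE {share} ADD SCHEMA {schema};")
        for table in tables:
            statements.append(
                f"ALTER DATASHARE {share} ADD TABLE {schema}.{table};"
            )
    # Grant the datashare to a consumer, identified by its namespace.
    statements.append(
        f"GRANT USAGE ON DATASHARE {share} TO NAMESPACE '{consumer_namespace}';"
    )
    return statements


# Hypothetical example: one datashare covering two schemas.
statements = build_datashare_sql(
    share="sales_share",
    objects={"sales": ["orders", "customers"], "finance": ["invoices"]},
    consumer_namespace="00000000-0000-0000-0000-000000000000",
)
print("\n".join(statements))
```

Calling the helper again with a different share name and a narrower set of tables would model a second datashare on the same database, which is one way to give different consumers different levels of access.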

Managing data consistency in Amazon Redshift

Amazon Redshift provides transactional consistency on all producer and consumer clusters and shares up-to-date and consistent views of the data with all consumers.

You can continuously update data on the producer cluster. All queries within a single transaction on a consumer cluster read the same state of the shared data. A consumer transaction does not see changes from a producer transaction that commits after the consumer transaction began. After a data change is committed on the producer cluster, new transactions on the consumer cluster can immediately query the updated data.
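
The visibility rule described above can be sketched with a small in-memory model. This is a toy illustration of the described behavior only, not how Amazon Redshift implements consistency. The class names Producer and ConsumerTransaction and the revenue key are hypothetical.

```python
class Producer:
    """Toy model: keeps every committed version of the shared data."""

    def __init__(self, initial):
        self._versions = [dict(initial)]  # version 0 is the initial state

    def commit(self, changes):
        """Commit a change set as a new version and return its number."""
        self._versions.append({**self._versions[-1], **changes})
        return len(self._versions) - 1

    def latest_version(self):
        return len(self._versions) - 1

    def read(self, version, key):
        return self._versions[version][key]


class ConsumerTransaction:
    """Pins the latest committed producer version when it begins."""

    def __init__(self, producer):
        self._producer = producer
        self._snapshot = producer.latest_version()

    def query(self, key):
        # Every query in this transaction reads the same pinned version.
        return self._producer.read(self._snapshot, key)


producer = Producer({"revenue": 100})
txn_a = ConsumerTransaction(producer)  # begins before the update
producer.commit({"revenue": 250})      # producer commits a change
txn_b = ConsumerTransaction(producer)  # begins after the commit

print(txn_a.query("revenue"))  # 100: the later commit is not visible
print(txn_b.query("revenue"))  # 250: a new transaction sees the update
```

In this model, txn_a keeps returning the value it saw at its start no matter how many commits follow, while any transaction opened after a commit reads the updated value.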

This strong consistency removes the risk of lower-fidelity business reports that contain invalid results while data is being shared. It is especially important for financial analysis and for cases where the results are used to prepare datasets for training machine learning models.