Getting started with Amazon Redshift Spectrum

In this tutorial, you learn how to use Amazon Redshift Spectrum to query data directly from files on Amazon S3. If you already have a cluster and a SQL client, you can complete this tutorial with minimal setup.

Note

Redshift Spectrum queries incur additional charges. The cost of running the sample queries in this tutorial is nominal. For more information about pricing, see Amazon Redshift Spectrum pricing.

Prerequisites

To use Redshift Spectrum, you need an Amazon Redshift cluster and a SQL client that's connected to your cluster so that you can run SQL commands. The cluster and the data files in Amazon S3 must be in the same Amazon Web Services Region.

For information about how to create an Amazon Redshift cluster, see Amazon Redshift provisioned clusters in the Amazon Redshift Getting Started Guide. For information about ways to connect to a cluster, see Connecting to Amazon Redshift data warehouses in the Amazon Redshift Getting Started Guide.

In some of the examples that follow, the sample data is in the US East (N. Virginia) Region (us-east-1), so you need a cluster that is also in us-east-1. Alternatively, you can copy the data objects from the following Amazon S3 buckets and folders to a bucket of your own in the Amazon Web Services Region where your cluster is located:

  • s3://redshift-downloads/tickit/spectrum/customers/*

  • s3://redshift-downloads/tickit/spectrum/sales_partition/*

  • s3://redshift-downloads/tickit/spectrum/sales/*

  • s3://redshift-downloads/tickit/spectrum/salesevent/*

Run an Amazon S3 command similar to the following to copy the sample data from the US East (N. Virginia) Region to your Amazon Web Services Region. Before running the command, create your bucket, and create folders in it that match the destination path in your copy command. The output of the copy command confirms that the files were copied to bucket-name in your Region.

aws s3 cp s3://redshift-downloads/tickit/spectrum/ s3://bucket-name/tickit/spectrum/ --copy-props none --recursive
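If you prefer to copy only the four folders listed earlier rather than everything under tickit/spectrum/, the copy can be scripted one folder at a time. The following is a minimal sketch, not part of the original tutorial. The function name copy_sample_data and the DRY_RUN variable are hypothetical helpers introduced here for illustration, and bucket-name is a placeholder for your own bucket. By default the script only prints the commands so that you can review them before running anything.

```shell
#!/bin/sh
# Sketch: build one "aws s3 cp" command per sample folder.
# copy_sample_data and DRY_RUN are illustrative names, not AWS features.
# DRY_RUN=1 (the default) prints the commands; DRY_RUN=0 runs them.

copy_sample_data() {
  dest_bucket="$1"   # your bucket, in the same Region as your cluster
  for prefix in customers sales_partition sales salesevent; do
    src="s3://redshift-downloads/tickit/spectrum/${prefix}/"
    dst="s3://${dest_bucket}/tickit/spectrum/${prefix}/"
    if [ "${DRY_RUN:-1}" = "1" ]; then
      # Print the command for review instead of running it.
      echo "aws s3 cp ${src} ${dst} --copy-props none --recursive"
    else
      aws s3 cp "${src}" "${dst}" --copy-props none --recursive
    fi
  done
}

# Prints four commands, one per folder. Replace bucket-name with your bucket.
copy_sample_data "bucket-name"
```

To perform the copy, set DRY_RUN=0 in your shell before calling the function. This assumes that the AWS CLI is installed and configured with credentials that can write to your bucket.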

Getting started with Redshift Spectrum using Amazon CloudFormation

As an alternative to the following steps, you can access the Redshift Spectrum DataLake Amazon CloudFormation template to create a stack with an Amazon S3 bucket that you can query. For more information, see Launch your Amazon CloudFormation stack and then query your data in Amazon S3.

Getting started with Redshift Spectrum step by step

To get started using Amazon Redshift Spectrum, follow these steps: