Tutorial: Querying nested data with Amazon Redshift Spectrum - Amazon Redshift
Services or capabilities described in Amazon Web Services documentation might vary by Region. To see the differences applicable to the China Regions, see Getting Started with Amazon Web Services in China.

Tutorial: Querying nested data with Amazon Redshift Spectrum


Amazon Redshift Spectrum supports querying nested data in Parquet, ORC, JSON, and Ion file formats. Redshift Spectrum accesses the data using external tables. You can create external tables that use the complex data types struct, array, and map.

For example, suppose that your data file contains the following data in Amazon S3 in a folder named customers. Although there isn't a single root element, each JSON object in this sample data represents a row in a table.

{"id": 1, "name": {"given": "John", "family": "Smith"}, "phones": ["123-457789"], "orders": [{"shipdate": "2018-03-01T11:59:59.000Z", "price": 100.50}, {"shipdate": "2018-03-01T09:10:00.000Z", "price": 99.12}] } {"id": 2, "name": {"given": "Jenny", "family": "Doe"}, "phones": ["858-8675309", "415-9876543"], "orders": [] } {"id": 3, "name": {"given": "Andy", "family": "Jones"}, "phones": [], "orders": [{"shipdate": "2018-03-02T08:02:15.000Z", "price": 13.50}] }

You can use Redshift Spectrum to query this data. The following tutorial shows you how to do so.

For tutorial prerequisites, steps, and nested data use cases, see the following topics:


If you are not using Redshift Spectrum yet, follow the steps in the Getting started with Amazon Redshift Spectrum tutorial before continuing.