COPY from columnar data formats - Amazon Redshift

COPY from columnar data formats

COPY can load data from Amazon S3 in the following columnar formats:

  • ORC

  • Parquet

For examples of using COPY from columnar data formats, see COPY examples.
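
As a minimal sketch of what such a statement looks like, the following Python helper assembles a COPY command for an ORC or Parquet source. The helper name build_columnar_copy, the table name listing, the bucket path, and the role ARN are all hypothetical placeholders, and the helper only formats text: it does not connect to a database or escape its inputs, so it should only be given trusted values.

```python
SUPPORTED_FORMATS = ("ORC", "PARQUET")


def build_columnar_copy(table: str, s3_uri: str, iam_role_arn: str, fmt: str) -> str:
    """Return the text of a COPY statement for a columnar source in Amazon S3.

    Illustrative only: the inputs are interpolated as-is, so pass trusted values.
    """
    fmt_upper = fmt.upper()
    if fmt_upper not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported columnar format: {fmt!r}")
    if not s3_uri.startswith("s3://"):
        raise ValueError("source must be an s3:// URI")
    return (
        f"COPY {table}\n"
        f"FROM '{s3_uri}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        f"FORMAT AS {fmt_upper};"
    )


if __name__ == "__main__":
    # All values below are placeholders, not real resources.
    print(
        build_columnar_copy(
            table="listing",
            s3_uri="s3://amzn-s3-demo-bucket/data/listings/parquet/",
            iam_role_arn="arn:aws:iam::123456789012:role/MyRedshiftRole",
            fmt="parquet",
        )
    )
```

The generated text can be run from any SQL client connected to the cluster.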

COPY supports columnar formatted data with the following restrictions:

  • The Amazon S3 bucket must be in the same Amazon Region as the Amazon Redshift database.

  • To access your Amazon S3 data through a VPC endpoint, set up access using IAM policies and IAM roles as described in Using Amazon Redshift Spectrum with Enhanced VPC Routing in the Amazon Redshift Management Guide.

  • COPY doesn't automatically apply compression encodings.

  • Only the following COPY parameters are supported:

  • If COPY encounters an error while loading, the command fails. ACCEPTANYDATE and MAXERROR aren't supported for columnar data formats.

  • Error messages are sent to the SQL client. Some errors are logged in STL_LOAD_ERRORS and STL_ERROR.

  • COPY inserts values into the target table's columns in the same order as the columns occur in the columnar data files. The number of columns in the target table and the number of columns in the data file must match.

  • If the file you specify for the COPY operation includes one of the following extensions, COPY decompresses the data automatically; no additional parameters are needed:

    • .gz

    • .snappy

    • .bz2

  • COPY from the Parquet and ORC file formats relies on Redshift Spectrum and on bucket access. To use COPY for these formats, make sure that no IAM policies block the use of presigned URLs. For more information, see Using Amazon Redshift Spectrum with Enhanced VPC Routing.
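
Because a columnar COPY fails on the first error, it can help to check some of these restrictions on the client before issuing the command. The following Python sketch covers three of them: the extensions that are decompressed automatically, the matching column count, and the two unsupported parameters named above. The function names are hypothetical, the column lists are assumed to come from your own metadata lookup, and reading "includes one of the following extensions" as "any dot-separated suffix of the object name" is an interpretation, not documented behavior.

```python
from pathlib import PurePosixPath

AUTO_DECOMPRESSED_EXTENSIONS = (".gz", ".snappy", ".bz2")
UNSUPPORTED_PARAMETERS = ("ACCEPTANYDATE", "MAXERROR")


def auto_decompressed_extension(object_key):
    """Return the first listed compression extension found in the object name, or None.

    Assumption: any suffix of the file name counts, so "part.snappy.parquet" matches.
    """
    for suffix in PurePosixPath(object_key.lower()).suffixes:
        if suffix in AUTO_DECOMPRESSED_EXTENSIONS:
            return suffix
    return None


def preflight_problems(table_columns, file_columns, copy_parameters):
    """Collect violations of the column-count and parameter restrictions."""
    problems = []
    if len(table_columns) != len(file_columns):
        problems.append(
            f"column count mismatch: table has {len(table_columns)}, "
            f"file has {len(file_columns)}"
        )
    for parameter in copy_parameters:
        tokens = parameter.split()
        # Compare only the keyword, so "MAXERROR 5" is recognized as MAXERROR.
        if tokens and tokens[0].upper() in UNSUPPORTED_PARAMETERS:
            problems.append(f"{tokens[0].upper()} isn't supported for columnar formats")
    return problems
```

An empty list from preflight_problems means only that these client-side checks passed. COPY still maps columns by position, so matching counts do not guarantee that the column order or data types line up.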