


# COPY


**Note**  
Client-side encryption for COPY and UNLOAD commands will no longer be open to new customers starting April 30, 2025. If you used client-side encryption with COPY and UNLOAD commands in the 12 months before April 30, 2025, you can continue to use client-side encryption with COPY or UNLOAD commands until April 30, 2026. After April 30, 2026, you won't be able to use client-side encryption for COPY and UNLOAD. We recommend that you switch to server-side encryption for COPY and UNLOAD as soon as possible. If you're already using server-side encryption for COPY and UNLOAD, there's no change, and you can continue to use it without altering your queries. For more information about encryption for COPY and UNLOAD, see the [ENCRYPTED](copy-parameters-data-source-s3.md#copy-encrypted) parameter.

Loads data into a table from data files or from an Amazon DynamoDB table. The files can be located in an Amazon Simple Storage Service (Amazon S3) bucket, an Amazon EMR cluster, or a remote host that is accessed using a Secure Shell (SSH) connection.

**Note**  
Amazon Redshift Spectrum external tables are read-only. You can't COPY to an external table.

The COPY command appends the input data as additional rows to the table.

The maximum size of a single input row from any source is 4 MB.

**Topics**
+ [Required permissions](#r_COPY-permissions)
+ [COPY syntax](#r_COPY-syntax)
+ [Required parameters](#r_COPY-syntax-required-parameters)
+ [Optional parameters](#r_COPY-syntax-overview-optional-parameters)
+ [Usage notes and additional resources for the COPY command](#r_COPY-using-the-copy-command)
+ [COPY command examples](#r_COPY-using-the-copy-command-examples)
+ [COPY JOB](r_COPY-JOB.md)
+ [COPY with TEMPLATE](r_COPY-WITH-TEMPLATE.md)
+ [COPY parameter reference](r_COPY-parameters.md)
+ [Usage notes](r_COPY_usage_notes.md)
+ [COPY examples](r_COPY_command_examples.md)

## Required permissions

To use the COPY command, you must have [INSERT](r_GRANT.md#grant-insert) privilege for the Amazon Redshift table.

## COPY syntax

```
COPY table-name 
[ column-list ]
FROM data_source
authorization
[ [ FORMAT ] [ AS ] data_format ] 
[ parameter [ argument ] [, ... ] ]
```

You can perform a COPY operation with as few as three parameters: a table name, a data source, and authorization to access the data. 

Amazon Redshift extends the functionality of the COPY command to enable you to load data in several data formats from multiple data sources, control access to load data, manage data transformations, and manage the load operation. 

The following sections present the required COPY command parameters, grouping the optional parameters by function. They also describe each parameter and explain how various options work together. You can go directly to a parameter description by using the alphabetical parameter list. 

## Required parameters


The COPY command requires three elements: 
+ [Table Name](#r_COPY-syntax-overview-table-name)
+ [Data Source](#r_COPY-syntax-overview-data-source)
+ [Authorization](#r_COPY-syntax-overview-credentials)

The simplest COPY command uses the following format. 

```
COPY table-name 
FROM data-source
authorization;
```

The following example creates a table named CATDEMO. The COPY example that follows then loads the table with sample data from a data file in Amazon S3 named `category_pipe.txt`. 

```
create table catdemo(catid smallint, catgroup varchar(10), catname varchar(10), catdesc varchar(50));
```

In the following example, the data source for the COPY command is a data file named `category_pipe.txt` in the `tickit` folder of an Amazon S3 bucket named `redshift-downloads`. The COPY command is authorized to access the Amazon S3 bucket through an Amazon Identity and Access Management (IAM) role. If your cluster has an existing IAM role with permission to access Amazon S3 attached, you can substitute your role's Amazon Resource Name (ARN) in the following COPY command and run it.

```
copy catdemo
from 's3://redshift-downloads/tickit/category_pipe.txt'
iam_role 'arn:aws:iam::<aws-account-id>:role/<role-name>'
region 'us-east-1';
```

For complete instructions on how to use COPY commands to load sample data, including instructions for loading data from other Amazon regions, see [Load Sample Data from Amazon S3](https://docs.amazonaws.cn/redshift/latest/gsg/rs-gsg-create-sample-db.html) in the Amazon Redshift Getting Started Guide.

*table-name*  <a name="r_COPY-syntax-overview-table-name"></a>
The name of the target table for the COPY command. The table must already exist in the database. The table can be temporary or persistent. The COPY command appends the new input data to any existing rows in the table.

FROM *data-source*  <a name="r_COPY-syntax-overview-data-source"></a>
The location of the source data to be loaded into the target table. A manifest file can be specified with some data sources.   
The most commonly used data repository is an Amazon S3 bucket. You can also load from data files located in an Amazon EMR cluster, an Amazon EC2 instance, or a remote host that your cluster can access using an SSH connection, or you can load directly from a DynamoDB table.   
+ [COPY from Amazon S3](copy-parameters-data-source-s3.md)
+ [COPY from Amazon EMR](copy-parameters-data-source-emr.md) 
+ [COPY from remote host (SSH)](copy-parameters-data-source-ssh.md)
+ [COPY from Amazon DynamoDB](copy-parameters-data-source-dynamodb.md)
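
As an illustration of a non-S3 source, a hypothetical COPY from a DynamoDB table (the table and role names here are placeholders) supplies a READRATIO to limit how much of the table's provisioned read throughput the load consumes:

```
copy favoritemovies
from 'dynamodb://ProductCatalog'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
readratio 50;
```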

Authorization  <a name="r_COPY-syntax-overview-credentials"></a>
A clause that indicates the method that your cluster uses for authentication and authorization to access other Amazon resources. The COPY command needs authorization to access data in another Amazon resource, including in Amazon S3, Amazon EMR, Amazon DynamoDB, and Amazon EC2. You can provide that authorization by referencing an IAM role that is attached to your cluster or by providing the access key ID and secret access key for an IAM user.   
+ [Authorization parameters](copy-parameters-authorization.md) 
+ [Role-based access control](copy-usage_notes-access-permissions.md#copy-usage_notes-access-role-based) 
+ [Key-based access control](copy-usage_notes-access-permissions.md#copy-usage_notes-access-key-based) 
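
To illustrate the two approaches, the following sketch shows the same load authorized first with an IAM role and then with key-based credentials; the table, bucket, role, and key values are placeholders:

```
-- Role-based access control (recommended)
copy catdemo
from 's3://amzn-s3-demo-bucket/data/'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole';

-- Key-based access control
copy catdemo
from 's3://amzn-s3-demo-bucket/data/'
access_key_id '<access-key-id>'
secret_access_key '<secret-access-key>';
```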

## Optional parameters


You can optionally specify how COPY maps field data to columns in the target table, define source data attributes to enable the COPY command to correctly read and parse the source data, and manage which operations the COPY command performs during the load process. 
+ [Column mapping options](copy-parameters-column-mapping.md)
+ [Data format parameters](#r_COPY-syntax-overview-data-format)
+ [Data conversion parameters](#r_COPY-syntax-overview-data-conversion)
+ [Data load operations](#r_COPY-syntax-overview-data-load)

### Column mapping


By default, COPY inserts field values into the target table's columns in the same order as the fields occur in the data files. If the default column order will not work, you can specify a column list or use JSONPath expressions to map source data fields to the target columns. 
+ [Column List](copy-parameters-column-mapping.md#copy-column-list)
+ [JSONPaths File](copy-parameters-column-mapping.md#copy-column-mapping-jsonpaths)
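
For example, if the source file contains only the category ID and name, a column list restricts the load to those two columns; the file name here is hypothetical:

```
copy catdemo (catid, catname)
from 's3://amzn-s3-demo-bucket/category_subset.txt'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole';
```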

### Data format parameters


You can load data from text files in fixed-width, character-delimited, comma-separated values (CSV), or JSON format, or from Avro files.

By default, the COPY command expects the source data to be in character-delimited UTF-8 text files. The default delimiter is a pipe character ( | ). If the source data is in another format, use the following parameters to specify the data format.
+ [FORMAT](copy-parameters-data-format.md#copy-format)
+ [CSV](copy-parameters-data-format.md#copy-csv)
+ [DELIMITER](copy-parameters-data-format.md#copy-delimiter) 
+ [FIXEDWIDTH](copy-parameters-data-format.md#copy-fixedwidth) 
+ [SHAPEFILE](copy-parameters-data-format.md#copy-shapefile) 
+ [AVRO](copy-parameters-data-format.md#copy-avro) 
+ [JSON format for COPY](copy-parameters-data-format.md#copy-json) 
+ [ENCRYPTED](copy-parameters-data-source-s3.md#copy-encrypted) 
+ [BZIP2](copy-parameters-file-compression.md#copy-bzip2) 
+ [GZIP](copy-parameters-file-compression.md#copy-gzip) 
+ [LZOP](copy-parameters-file-compression.md#copy-lzop) 
+ [PARQUET](copy-parameters-data-format.md#copy-parquet) 
+ [ORC](copy-parameters-data-format.md#copy-orc) 
+ [ZSTD](copy-parameters-file-compression.md#copy-zstd) 
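
As a sketch, a load of a gzip-compressed CSV file combines a format parameter with a compression parameter; the file and role names are placeholders:

```
copy catdemo
from 's3://amzn-s3-demo-bucket/category.csv.gz'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
csv
gzip;
```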

### Data conversion parameters


As it loads the table, COPY attempts to implicitly convert the strings in the source data to the data type of the target column. If you need to specify a conversion that is different from the default behavior, or if the default conversion results in errors, you can manage data conversions by specifying the following parameters.
+ [ACCEPTANYDATE](copy-parameters-data-conversion.md#copy-acceptanydate) 
+ [ACCEPTINVCHARS](copy-parameters-data-conversion.md#copy-acceptinvchars) 
+ [BLANKSASNULL](copy-parameters-data-conversion.md#copy-blanksasnull) 
+ [DATEFORMAT](copy-parameters-data-conversion.md#copy-dateformat) 
+ [EMPTYASNULL](copy-parameters-data-conversion.md#copy-emptyasnull) 
+ [ENCODING](copy-parameters-data-conversion.md#copy-encoding) 
+ [ESCAPE](copy-parameters-data-conversion.md#copy-escape) 
+ [EXPLICIT_IDS](copy-parameters-data-conversion.md#copy-explicit-ids) 
+ [FILLRECORD](copy-parameters-data-conversion.md#copy-fillrecord) 
+ [IGNOREBLANKLINES](copy-parameters-data-conversion.md#copy-ignoreblanklines) 
+ [IGNOREHEADER](copy-parameters-data-conversion.md#copy-ignoreheader) 
+ [NULL AS](copy-parameters-data-conversion.md#copy-null-as) 
+ [REMOVEQUOTES](copy-parameters-data-conversion.md#copy-removequotes) 
+ [ROUNDEC](copy-parameters-data-conversion.md#copy-roundec) 
+ [TIMEFORMAT](copy-parameters-data-conversion.md#copy-timeformat) 
+ [TRIMBLANKS](copy-parameters-data-conversion.md#copy-trimblanks) 
+ [TRUNCATECOLUMNS](copy-parameters-data-conversion.md#copy-truncatecolumns) 
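
For example, a load might combine several conversion parameters to accept flexible date strings, treat blank or empty fields as NULL, and truncate oversized values; the file and role names are placeholders:

```
copy catdemo
from 's3://amzn-s3-demo-bucket/category_pipe.txt'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
dateformat 'auto'
blanksasnull
emptyasnull
truncatecolumns;
```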

### Data load operations


Manage the default behavior of the load operation for troubleshooting or to reduce load times by specifying the following parameters. 
+ [COMPROWS](copy-parameters-data-load.md#copy-comprows) 
+ [COMPUPDATE](copy-parameters-data-load.md#copy-compupdate) 
+ [IGNOREALLERRORS](copy-parameters-data-load.md#copy-ignoreallerrors) 
+ [MAXERROR](copy-parameters-data-load.md#copy-maxerror) 
+ [NOLOAD](copy-parameters-data-load.md#copy-noload) 
+ [STATUPDATE](copy-parameters-data-load.md#copy-statupdate) 
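
As an illustration, NOLOAD checks that the files parse correctly without loading any rows, and MAXERROR tolerates a bounded number of errors; the file and role names are placeholders:

```
copy catdemo
from 's3://amzn-s3-demo-bucket/category_pipe.txt'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
noload
maxerror 10;
```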

## Usage notes and additional resources for the COPY command


For more information about how to use the COPY command, see the following topics: 
+ [Usage notes](r_COPY_usage_notes.md)
+ [Tutorial: Loading data from Amazon S3](tutorial-loading-data.md)
+ [Amazon Redshift best practices for loading data](c_loading-data-best-practices.md)
+ [Loading tables with the COPY command](t_Loading_tables_with_the_COPY_command.md)
  + [Loading data from Amazon S3](t_Loading-data-from-S3.md)
  + [Loading data from Amazon EMR](loading-data-from-emr.md)
  + [Loading data from remote hosts](loading-data-from-remote-hosts.md) 
  + [Loading data from an Amazon DynamoDB table](t_Loading-data-from-dynamodb.md)
+ [Troubleshooting data loads](t_Troubleshooting_load_errors.md)

## COPY command examples


For more examples that show how to COPY from various sources, in disparate formats, and with different COPY options, see [COPY examples](r_COPY_command_examples.md).

# COPY JOB

For information about using this command, see [Create an S3 event integration to automatically copy files from Amazon S3 buckets](loading-data-copy-job.md).

Manages COPY commands that load data into a table. The COPY JOB command is an extension of the COPY command and automates data loading from Amazon S3 buckets. When you create a COPY job, Amazon Redshift detects when new Amazon S3 files are created in a specified path, and then loads them automatically without your intervention. The same parameters that are used in the original COPY command are used when loading the data. Amazon Redshift keeps track of the loaded files (based on filename) to verify that they are loaded only one time.

**Note**  
For information about the COPY command, including usage, parameters, and permissions, see [COPY](r_COPY.md).

## Required permission

To use the COPY JOB command, you must have one of the following permissions in addition to all of the required permissions to use COPY:
+ Superuser
+  All of the following: 
  +  The relevant CREATE, ALTER, or DROP scoped permission for COPY JOBS in the database you want to COPY to. 
  +  USAGE permission for the schema you want to COPY to, or USAGE scoped permission for schemas in the database you want to COPY to. 
  +  INSERT permission for the table you want to COPY to, or INSERT scoped permission for tables in the schema or database you want to COPY to. 

The IAM role specified with the COPY command must have permission to access the data to load. For more information, see [IAM permissions for COPY, UNLOAD, and CREATE LIBRARY](copy-usage_notes-access-permissions.md#copy-usage_notes-iam-permissions).
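
As a minimal sketch, the schema and table permissions above can be granted with standard GRANT statements; the user and object names here are hypothetical, and the scoped COPY JOB permissions are granted separately:

```
-- hypothetical user and objects; grants USAGE on the schema
-- and INSERT on the target table for COPY JOB loads
grant usage on schema public to load_user;
grant insert on table public.target_table to load_user;
```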

## Syntax

Create a copy job. The parameters of the COPY command are saved with the copy job.

You can't run COPY JOB CREATE within the scope of a transaction block.

```
COPY copy-command JOB CREATE job-name
[AUTO ON | OFF]
```

Change the configuration of a copy job.

```
COPY JOB ALTER job-name
[AUTO ON | OFF]
```

Run a copy job. The stored COPY command parameters are used.

```
COPY JOB RUN job-name
```

List all copy jobs.

```
COPY JOB LIST
```

Show the details of a copy job.

```
COPY JOB SHOW job-name
```

Delete a copy job.

You can't run COPY JOB DROP within the scope of a transaction block.

```
COPY JOB DROP job-name
```

## Parameters

*copy-command*  
A COPY command that loads data from Amazon S3 to Amazon Redshift. The clause contains COPY parameters that define the Amazon S3 bucket, target table, IAM role, and other parameters used when loading data. All COPY command parameters for an Amazon S3 data load are supported except:  
+ The COPY JOB does not ingest preexisting files in the folder pointed to by the COPY command. Only files created after the COPY JOB creation timestamp are ingested.
+ You cannot specify a COPY command with the MAXERROR or IGNOREALLERRORS options.
+ You cannot specify a manifest file. COPY JOB requires a designated Amazon S3 location to monitor for newly created files.
+ You cannot specify a COPY command with authorization types like Access and Secret keys. Only COPY commands that use the `IAM_ROLE` parameter for authorization are supported. For more information, see [Authorization parameters](copy-parameters-authorization.md).
+ The COPY JOB doesn't support the default IAM role associated with the cluster. You must specify the `IAM_ROLE` in the COPY command. 
For more information, see [COPY from Amazon S3](copy-parameters-data-source-s3.md).

*job-name*  
The name of the job used to reference the COPY job. The *job-name* can't contain a hyphen (‐).

[AUTO ON | OFF]  
Clause that indicates whether Amazon S3 data is automatically loaded into Amazon Redshift tables.  
+ When `ON`, Amazon Redshift monitors the source Amazon S3 path for newly created files, and if found, a COPY command is run with the COPY parameters in the job definition. This is the default.
+ When `OFF`, Amazon Redshift does not run the COPY JOB automatically.
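
For example, you can switch an existing job to manual mode and then run it on demand; the job name is a placeholder:

```
COPY JOB ALTER my_copy_job_name AUTO OFF;

COPY JOB RUN my_copy_job_name;
```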

## Usage notes

The options of the COPY command aren't validated until run time. For example, an invalid `IAM_ROLE` or an Amazon S3 data source results in runtime errors when the COPY JOB starts.

If the cluster is paused, COPY JOBS are not run.

To query the files loaded by a COPY command and any load errors, see [STL_LOAD_COMMITS](r_STL_LOAD_COMMITS.md), [STL_LOAD_ERRORS](r_STL_LOAD_ERRORS.md), and [STL_LOADERROR_DETAIL](r_STL_LOADERROR_DETAIL.md). For more information, see [Verifying that the data loaded correctly](verifying-that-data-loaded-correctly.md).

COPY JOBs are not supported on zero-ETL databases because those databases operate in read-only mode.

## Examples

The following example shows creating a COPY JOB to load data from an Amazon S3 bucket. 

```
COPY public.target_table
FROM 's3://amzn-s3-demo-bucket/staging-folder'
IAM_ROLE 'arn:aws:iam::123456789012:role/MyLoadRoleName' 
JOB CREATE my_copy_job_name
AUTO ON;
```

# COPY with TEMPLATE

You can use Redshift templates with COPY commands to simplify command syntax and ensure consistency across data loading operations. Instead of specifying the same formatting parameters repeatedly, you define them once in a template and reference the template in your COPY commands. When you use a template, the COPY command combines the parameters from the template with any parameters specified directly in the command. If the same parameter appears in both the template and the command, the command parameter takes precedence. For more information, see [CREATE TEMPLATE](r_CREATE_TEMPLATE.md). 

Templates for the COPY command can be created with:
+ [Data format parameters](copy-parameters-data-format.md)
+ [File compression parameters](copy-parameters-file-compression.md)
+ [Data conversion parameters](copy-parameters-data-conversion.md)
+ [Data load operations](copy-parameters-data-load.md)

For a complete list of supported parameters, see the [COPY](r_COPY.md) command.

## Required permission

To use a template in a COPY command, you must have:
+ All required permissions to execute the COPY command (see [Required permissions](r_COPY.md#r_COPY-permissions) )
+ One of the following template permissions:
  + Superuser privileges
  + USAGE privilege on the template and USAGE privilege on the schema containing the template

## Syntax

```
COPY target_table FROM 's3://...'
authorization
[ option, ...]
USING TEMPLATE [database_name.][schema_name.]template_name;
```

## Parameters

*database_name*  
(Optional) The name of the database where the template exists. If not specified, the current database is used.

*schema_name*  
(Optional) The name of the schema where the template exists. If not specified, the template is searched for in the current search path.

*template_name*  
The name of the template to use in COPY. 

## Usage notes
+ Command-specific parameters (source, destination, authorization) must still be specified in the COPY command.
+ Templates cannot contain manifest file specifications for COPY commands.

## Examples

The following examples show how to create a template and use it in COPY commands:

```
CREATE TEMPLATE public.test_template FOR COPY AS
CSV DELIMITER '|' IGNOREHEADER 1 MAXERROR 100;

COPY public.target_table
FROM 's3://amzn-s3-demo-bucket/staging-folder'
IAM_ROLE 'arn:aws:iam::123456789012:role/MyLoadRoleName'
USING TEMPLATE public.test_template;
```

When a parameter exists in both the template and the command, the command parameter takes precedence. In this example, if the template `public.test_template` contains `DELIMITER '|'` but the COPY command specifies `DELIMITER ','`, the comma delimiter (`,`) from the command will be used instead of the pipe delimiter (`|`) from the template. 

```
COPY public.target_table
FROM 's3://amzn-s3-demo-bucket/staging-folder'
IAM_ROLE 'arn:aws:iam::123456789012:role/MyLoadRoleName'
DELIMITER ','
USING TEMPLATE public.test_template;
```

# COPY parameter reference

COPY has many parameters that can be used in many situations. However, not all parameters are supported in every situation. For example, only a limited set of parameters is supported when you load from ORC or Parquet files. For more information, see [COPY from columnar data formats](copy-usage_notes-copy-from-columnar.md).

**Topics**
+ [Data sources](copy-parameters-data-source.md)
+ [Authorization parameters](copy-parameters-authorization.md)
+ [Column mapping options](copy-parameters-column-mapping.md)
+ [Data format parameters](copy-parameters-data-format.md)
+ [File compression parameters](copy-parameters-file-compression.md)
+ [Data conversion parameters](copy-parameters-data-conversion.md)
+ [Data load operations](copy-parameters-data-load.md)
+ [Alphabetical parameter list](r_COPY-alphabetical-parm-list.md)

# Data sources


You can load data from text files in an Amazon S3 bucket, in an Amazon EMR cluster, or on a remote host that your cluster can access using an SSH connection. You can also load data directly from a DynamoDB table. 

The maximum size of a single input row from any source is 4 MB. 

To export data from a table to a set of files in Amazon S3, use the [UNLOAD](r_UNLOAD.md) command. 

**Topics**
+ [COPY from Amazon S3](copy-parameters-data-source-s3.md)
+ [COPY from Amazon EMR](copy-parameters-data-source-emr.md)
+ [COPY from remote host (SSH)](copy-parameters-data-source-ssh.md)
+ [COPY from Amazon DynamoDB](copy-parameters-data-source-dynamodb.md)

# COPY from Amazon S3


To load data from files located in one or more S3 buckets, use the FROM clause to indicate how COPY locates the files in Amazon S3. You can provide the object path to the data files as part of the FROM clause, or you can provide the location of a manifest file that contains a list of Amazon S3 object paths. COPY from Amazon S3 uses an HTTPS connection. Ensure that the S3 IP ranges are added to your allow list. To learn more about the required S3 IP ranges, see [ Network isolation](https://docs.amazonaws.cn//redshift/latest/mgmt/security-network-isolation.html#network-isolation).

**Important**  
If the Amazon S3 buckets that hold the data files don't reside in the same Amazon Region as your cluster, you must use the [REGION](#copy-region) parameter to specify the Region in which the data is located. 

**Topics**
+ [Syntax](#copy-parameters-data-source-s3-syntax)
+ [Examples](#copy-parameters-data-source-s3-examples)
+ [Optional parameters](#copy-parameters-data-source-s3-optional-parms)
+ [Unsupported parameters](#copy-parameters-data-source-s3-unsupported-parms)

## Syntax


```
FROM { 's3://objectpath' | 's3://manifest_file' }
authorization
| MANIFEST
| ENCRYPTED
| REGION [AS] 'aws-region'
| optional-parameters
```

## Examples


The following example uses an object path to load data from Amazon S3. 

```
copy customer
from 's3://amzn-s3-demo-bucket/customer' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole';
```

The following example uses a manifest file to load data from Amazon S3. 

```
copy customer
from 's3://amzn-s3-demo-bucket/cust.manifest' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
manifest;
```

### Parameters


FROM  <a name="copy-parameters-from"></a>
The source of the data to be loaded. For more information about the encoding of the Amazon S3 file, see [Data conversion parameters](copy-parameters-data-conversion.md).

's3://*copy_from_s3_objectpath*'   <a name="copy-s3-objectpath"></a>
Specifies the path to the Amazon S3 objects that contain the data—for example, `'s3://amzn-s3-demo-bucket/custdata.txt'`. The *s3://copy_from_s3_objectpath* parameter can reference a single file or a set of objects or folders that have the same key prefix. For example, the name `custdata.txt` is a key prefix that refers to a number of physical files: `custdata.txt`, `custdata.txt.1`, `custdata.txt.2`, `custdata.txt.bak`, and so on. The key prefix can also reference a number of folders. For example, `'s3://amzn-s3-demo-bucket/custfolder'` refers to the folders `custfolder`, `custfolder_1`, `custfolder_2`, and so on. If a key prefix references multiple folders, all of the files in the folders are loaded. If a key prefix matches a file as well as a folder, such as `custfolder.log`, COPY attempts to load the file also. If a key prefix might result in COPY attempting to load unwanted files, use a manifest file. For more information, see ['s3://copy_from_s3_manifest_file'](#copy-manifest-file), following.   
If the S3 bucket that holds the data files doesn't reside in the same Amazon Region as your cluster, you must use the [REGION](#copy-region) parameter to specify the Region in which the data is located.
For more information, see [Loading data from Amazon S3](t_Loading-data-from-S3.md).

's3://*copy_from_s3_manifest_file*'   <a name="copy-manifest-file"></a>
Specifies the Amazon S3 object key for a manifest file that lists the data files to be loaded. The *'s3://copy_from_s3_manifest_file'* argument must explicitly reference a single file—for example, `'s3://amzn-s3-demo-bucket/manifest.txt'`. It can't reference a key prefix.  
The manifest is a text file in JSON format that lists the URL of each file that is to be loaded from Amazon S3. The URL includes the bucket name and full object path for the file. The files that are specified in the manifest can be in different buckets, but all the buckets must be in the same Amazon Region as the Amazon Redshift cluster. If a file is listed twice, the file is loaded twice. The following example shows the JSON for a manifest that loads three files.   

```
{
  "entries": [
    {"url":"s3://amzn-s3-demo-bucket1/custdata.1","mandatory":true},
    {"url":"s3://amzn-s3-demo-bucket1/custdata.2","mandatory":true},
    {"url":"s3://amzn-s3-demo-bucket2/custdata.1","mandatory":false}
  ]
}
```
The double quotation mark characters are required, and must be simple quotation marks (0x22), not slanted or "smart" quotation marks. Each entry in the manifest can optionally include a `mandatory` flag. If `mandatory` is set to `true`, COPY terminates if it doesn't find the file for that entry; otherwise, COPY will continue. The default value for `mandatory` is `false`.   
When loading from data files in ORC or Parquet format, a `meta` field is required, as shown in the following example.  

```
{  
   "entries":[  
      {  
         "url":"s3://amzn-s3-demo-bucket1/orc/2013-10-04-custdata",
         "mandatory":true,
         "meta":{  
            "content_length":99
         }
      },
      {  
         "url":"s3://amzn-s3-demo-bucket2/orc/2013-10-05-custdata",
         "mandatory":true,
         "meta":{  
            "content_length":99
         }
      }
   ]
}
```
The manifest file must not be encrypted or compressed, even if the ENCRYPTED, GZIP, LZOP, BZIP2, or ZSTD options are specified. COPY returns an error if the specified manifest file isn't found or the manifest file isn't properly formed.   
If a manifest file is used, the MANIFEST parameter must be specified with the COPY command. If the MANIFEST parameter isn't specified, COPY assumes that the file specified with FROM is a data file.   
For more information, see [Loading data from Amazon S3](t_Loading-data-from-S3.md).

*authorization*  
The COPY command needs authorization to access data in another Amazon resource, including in Amazon S3, Amazon EMR, Amazon DynamoDB, and Amazon EC2. You can provide that authorization by referencing an Amazon Identity and Access Management (IAM) role that is attached to your cluster (role-based access control) or by providing the access credentials for a user (key-based access control). For increased security and flexibility, we recommend using IAM role-based access control. For more information, see [Authorization parameters](copy-parameters-authorization.md).

MANIFEST  <a name="copy-manifest"></a>
Specifies that a manifest is used to identify the data files to be loaded from Amazon S3. If the MANIFEST parameter is used, COPY loads data from the files listed in the manifest referenced by *'s3://copy_from_s3_manifest_file'*. If the manifest file isn't found, or isn't properly formed, COPY fails. For more information, see [Using a manifest to specify data files](loading-data-files-using-manifest.md).

ENCRYPTED  <a name="copy-encrypted"></a>
A clause that specifies that the input files on Amazon S3 are encrypted using client-side encryption with customer managed keys. For more information, see [Loading encrypted data files from Amazon S3](c_loading-encrypted-files.md). Don't specify ENCRYPTED if the input files are encrypted using Amazon S3 server-side encryption (SSE-KMS or SSE-S3). COPY reads server-side encrypted files automatically.  
If you specify the ENCRYPTED parameter, you must also specify the [MASTER_SYMMETRIC_KEY](#copy-master-symmetric-key) parameter or include the **master_symmetric_key** value in the [Using the CREDENTIALS parameter](copy-parameters-authorization.md#copy-credentials) string.  
If the encrypted files are in compressed format, add the GZIP, LZOP, BZIP2, or ZSTD parameter.  
Manifest files and JSONPaths files must not be encrypted, even if the ENCRYPTED option is specified.

MASTER_SYMMETRIC_KEY '*root_key*'  <a name="copy-master-symmetric-key"></a>
The root symmetric key that was used to encrypt data files on Amazon S3. If MASTER_SYMMETRIC_KEY is specified, the [ENCRYPTED](#copy-encrypted) parameter must also be specified. MASTER_SYMMETRIC_KEY can't be used with the CREDENTIALS parameter. For more information, see [Loading encrypted data files from Amazon S3](c_loading-encrypted-files.md).  
If the encrypted files are in compressed format, add the GZIP, LZOP, BZIP2, or ZSTD parameter.
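
Putting ENCRYPTED and MASTER_SYMMETRIC_KEY together, a hypothetical load of gzip-compressed, client-side-encrypted files might look like the following; the bucket, role, and key values are placeholders (and note the client-side encryption deprecation described earlier in this topic):

```
copy customer
from 's3://amzn-s3-demo-bucket/encrypted/customer.gz'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
master_symmetric_key '<root-key>'
encrypted
gzip;
```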

REGION [AS] '*aws-region*'  <a name="copy-region"></a>
Specifies the Amazon Region where the source data is located. REGION is required for COPY from an Amazon S3 bucket or a DynamoDB table when the Amazon resource that contains the data isn't in the same Region as the Amazon Redshift cluster.   
The value for *aws-region* must match a Region listed in the [Amazon Redshift regions and endpoints](https://docs.amazonaws.cn/general/latest/gr/rande.html#redshift_region) table.  
If the REGION parameter is specified, all resources, including a manifest file or multiple Amazon S3 buckets, must be located in the specified Region.   
Transferring data across Regions incurs additional charges against the Amazon S3 bucket or the DynamoDB table that contains the data. For more information about pricing, see **Data Transfer OUT From Amazon S3 To Another Amazon Region** on the [Amazon S3 Pricing](http://www.amazonaws.cn/s3/pricing/) page and **Data Transfer OUT** on the [Amazon DynamoDB Pricing](http://www.amazonaws.cn/dynamodb/pricing/) page. 
By default, COPY assumes that the data is located in the same Region as the Amazon Redshift cluster. 

## Optional parameters


You can optionally specify the following parameters with COPY from Amazon S3: 
+ [Column mapping options](copy-parameters-column-mapping.md)
+ [Data format parameters](copy-parameters-data-format.md#copy-data-format-parameters)
+ [Data conversion parameters](copy-parameters-data-conversion.md)
+ [Data load operations](copy-parameters-data-load.md)

## Unsupported parameters


You can't use the following parameters with COPY from Amazon S3: 
+ SSH
+ READRATIO

# COPY from Amazon EMR


You can use the COPY command to load data in parallel from an Amazon EMR cluster configured to write text files to the cluster's Hadoop Distributed File System (HDFS) in the form of fixed-width files, character-delimited files, CSV files, JSON-formatted files, or Avro files.

**Topics**
+ [Syntax](#copy-parameters-data-source-emr-syntax)
+ [Example](#copy-parameters-data-source-emr-example)
+ [Parameters](#copy-parameters-data-source-emr-parameters)
+ [Supported parameters](#copy-parameters-data-source-emr-optional-parms)
+ [Unsupported parameters](#copy-parameters-data-source-emr-unsupported-parms)

## Syntax


```
FROM 'emr://emr_cluster_id/hdfs_filepath'  
authorization
[ optional_parameters ]
```

## Example


The following example loads data from an Amazon EMR cluster. 

```
copy sales
from 'emr://j-SAMPLE2B500FC/myoutput/part-*' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole';
```

## Parameters


FROM  
The source of the data to be loaded. 

 'emr://*emr_cluster_id*/*hdfs_file_path*'  <a name="copy-emr"></a>
The unique identifier for the Amazon EMR cluster and the HDFS file path that references the data files for the COPY command. The HDFS data file names must not contain the wildcard characters asterisk ( * ) and question mark ( ? ).   
The Amazon EMR cluster must continue running until the COPY operation completes. If any of the HDFS data files are changed or deleted before the COPY operation completes, you might have unexpected results, or the COPY operation might fail. 
You can use the wildcard characters asterisk ( * ) and question mark ( ? ) as part of the *hdfs_file_path* argument to specify multiple files to be loaded. For example, `'emr://j-SAMPLE2B500FC/myoutput/part*'` identifies the files `part-0000`, `part-0001`, and so on. If the file path doesn't contain wildcard characters, it is treated as a string literal. If you specify only a folder name, COPY attempts to load all files in the folder.   
If you use wildcard characters or use only the folder name, verify that no unwanted files will be loaded. For example, some processes might write a log file to the output folder.
For more information, see [Loading data from Amazon EMR](loading-data-from-emr.md).

*authorization*  
The COPY command needs authorization to access data in another Amazon resource, including in Amazon S3, Amazon EMR, Amazon DynamoDB, and Amazon EC2. You can provide that authorization by referencing an Amazon Identity and Access Management (IAM) role that is attached to your cluster (role-based access control) or by providing the access credentials for a user (key-based access control). For increased security and flexibility, we recommend using IAM role-based access control. For more information, see [Authorization parameters](copy-parameters-authorization.md).

## Supported parameters


You can optionally specify the following parameters with COPY from Amazon EMR: 
+ [Column mapping options](copy-parameters-column-mapping.md)
+ [Data format parameters](copy-parameters-data-format.md#copy-data-format-parameters)
+ [Data conversion parameters](copy-parameters-data-conversion.md)
+ [Data load operations](copy-parameters-data-load.md)

## Unsupported parameters


You can't use the following parameters with COPY from Amazon EMR: 
+ ENCRYPTED
+ MANIFEST
+ REGION
+ READRATIO
+ SSH

# COPY from remote host (SSH)


You can use the COPY command to load data in parallel from one or more remote hosts, such as Amazon Elastic Compute Cloud (Amazon EC2) instances or other computers. COPY connects to the remote hosts using Secure Shell (SSH) and runs commands on the remote hosts to generate text output. The remote host can be an EC2 Linux instance or another Unix or Linux computer configured to accept SSH connections. Amazon Redshift can connect to multiple hosts, and can open multiple SSH connections to each host. Amazon Redshift sends a unique command through each connection to generate text output to the host's standard output, which Amazon Redshift then reads as it does a text file.

Use the FROM clause to specify the Amazon S3 object key for the manifest file that provides the information COPY uses to open SSH connections and run the remote commands. 

**Topics**
+ [

## Syntax
](#copy-parameters-data-source-ssh-syntax)
+ [

## Examples
](#copy-parameters-data-source-ssh-examples)
+ [

## Parameters
](#copy-parameters-data-source-ssh-parameters)
+ [

## Optional parameters
](#copy-parameters-data-source-ssh-optional-parms)
+ [

## Unsupported parameters
](#copy-parameters-data-source-ssh-unsupported-parms)

**Important**  
 If the S3 bucket that holds the manifest file doesn't reside in the same Amazon Region as your cluster, you must use the REGION parameter to specify the Region in which the bucket is located. 

## Syntax


```
FROM 's3://ssh_manifest_file'
authorization
SSH
[ optional-parameters ]
```

## Examples


The following example uses a manifest file to load data from a remote host using SSH. 

```
copy sales
from 's3://amzn-s3-demo-bucket/ssh_manifest' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
ssh;
```

## Parameters


FROM  
The source of the data to be loaded. 

's3://*copy_from_ssh_manifest_file*'  <a name="copy-ssh-manifest"></a>
The COPY command can connect to multiple hosts using SSH, and can create multiple SSH connections to each host. COPY runs a command through each host connection, and then loads the output from the commands in parallel into the table. The *s3://copy_from_ssh_manifest_file* argument specifies the Amazon S3 object key for the manifest file that provides the information COPY uses to open SSH connections and run the remote commands.  
The *s3://copy_from_ssh_manifest_file* argument must explicitly reference a single file; it can't be a key prefix. The following shows an example:  

```
's3://amzn-s3-demo-bucket/ssh_manifest.txt'
```
The manifest file is a text file in JSON format that Amazon Redshift uses to connect to the host. The manifest file specifies the SSH host endpoints and the commands that will be run on the hosts to return data to Amazon Redshift. Optionally, you can include the host public key, the login user name, and a mandatory flag for each entry. The following example shows a manifest file that creates two SSH connections:   

```
{ 
    "entries": [ 
	    {"endpoint":"<ssh_endpoint_or_IP>", 
           "command": "<remote_command>",
           "mandatory":true, 
           "publickey": "<public_key>", 
           "username": "<host_user_name>"}, 
	    {"endpoint":"<ssh_endpoint_or_IP>", 
           "command": "<remote_command>",
           "mandatory":true, 
           "publickey": "<public_key>", 
           "username": "<host_user_name>"} 
     ] 
}
```
The manifest file contains one `"entries"` construct for each SSH connection. You can have multiple connections to a single host or multiple connections to multiple hosts. The double quotation mark characters are required as shown, both for the field names and the values. The quotation mark characters must be simple quotation marks (0x22), not slanted or "smart" quotation marks. The only value that doesn't need double quotation mark characters is the Boolean value `true` or `false` for the `"mandatory"` field.   
The following list describes the fields in the manifest file.     
endpoint  <a name="copy-ssh-manifest-endpoint"></a>
The URL address or IP address of the host—for example, `"ec2-111-222-333.compute-1.amazonaws.com"`, or `"198.51.100.0"`.   
command  <a name="copy-ssh-manifest-command"></a>
The command to be run by the host to generate text output or binary output in gzip, lzop, bzip2, or zstd format. The command can be any command that the user *host_user_name* has permission to run. The command can be as simple as printing a file, or it can query a database or launch a script. The output (text file or gzip, lzop, bzip2, or zstd binary file) must be in a form that the Amazon Redshift COPY command can ingest. For more information, see [Preparing your input data](t_preparing-input-data.md).  
publickey  <a name="copy-ssh-manifest-publickey"></a>
(Optional) The public key of the host. If provided, Amazon Redshift will use the public key to identify the host. If the public key isn't provided, Amazon Redshift will not attempt host identification. For example, if the remote host's public key is `ssh-rsa AbcCbaxxx…Example root@amazon.com`, type the following text in the public key field: `"AbcCbaxxx…Example"`  
mandatory  <a name="copy-ssh-manifest-mandatory"></a>
(Optional) A clause that indicates whether the COPY command should fail if the connection attempt fails. The default is `false`. If Amazon Redshift doesn't successfully make at least one connection, the COPY command fails.  
username  <a name="copy-ssh-manifest-username"></a>
(Optional) The user name that will be used to log on to the host system and run the remote command. The user login name must be the same as the login that was used to add the Amazon Redshift cluster's public key to the host's authorized keys file. The default username is `redshift`.
For more information about creating a manifest file, see [Loading data process](loading-data-from-remote-hosts.md#load-from-host-process).  
To COPY from a remote host, the SSH parameter must be specified with the COPY command. If the SSH parameter isn't specified, COPY assumes that the file specified with FROM is a data file and will fail.   
If you use automatic compression, the COPY command performs two data read operations, which means it will run the remote command twice. The first read operation is to provide a data sample for compression analysis, then the second read operation actually loads the data. If executing the remote command twice might cause a problem, you should disable automatic compression. To disable automatic compression, run the COPY command with the COMPUPDATE parameter set to OFF. For more information, see [Loading tables with automatic compression](c_Loading_tables_auto_compress.md).  
For detailed procedures for using COPY from SSH, see [Loading data from remote hosts](loading-data-from-remote-hosts.md).
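As a hedge against the double remote read described above, you can turn off automatic compression. The following is a sketch only; the bucket name and role ARN are placeholders:

```
copy sales
from 's3://amzn-s3-demo-bucket/ssh_manifest'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
ssh
compupdate off;
```

With COMPUPDATE OFF, COPY reads the remote command output only once and applies the column encodings already defined on the table.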

*authorization*  
The COPY command needs authorization to access data in another Amazon resource, including in Amazon S3, Amazon EMR, Amazon DynamoDB, and Amazon EC2. You can provide that authorization by referencing an Amazon Identity and Access Management (IAM) role that is attached to your cluster (role-based access control) or by providing the access credentials for a user (key-based access control). For increased security and flexibility, we recommend using IAM role-based access control. For more information, see [Authorization parameters](copy-parameters-authorization.md).

SSH  <a name="copy-ssh"></a>
A clause that specifies that data is to be loaded from a remote host using the SSH protocol. If you specify SSH, you must also provide a manifest file using the [s3://copy_from_ssh_manifest_file](#copy-ssh-manifest) argument.   
If you are using SSH to copy from a host using a private IP address in a remote VPC, the VPC must have enhanced VPC routing enabled. For more information about Enhanced VPC routing, see [Amazon Redshift Enhanced VPC Routing](https://docs.aws.amazon.com/redshift/latest/mgmt/enhanced-vpc-routing.html).

## Optional parameters


You can optionally specify the following parameters with COPY from SSH: 
+ [Column mapping options](copy-parameters-column-mapping.md)
+ [Data format parameters](copy-parameters-data-format.md#copy-data-format-parameters)
+ [Data conversion parameters](copy-parameters-data-conversion.md)
+ [Data load operations](copy-parameters-data-load.md)

## Unsupported parameters


You can't use the following parameters with COPY from SSH: 
+ ENCRYPTED
+ MANIFEST
+ READRATIO

# COPY from Amazon DynamoDB


To load data from an existing DynamoDB table, use the FROM clause to specify the DynamoDB table name.

**Topics**
+ [

## Syntax
](#copy-parameters-data-source-dynamodb-syntax)
+ [

## Examples
](#copy-parameters-data-source-dynamodb-examples)
+ [

## Optional parameters
](#copy-parameters-data-source-dynamodb-optional-parms)
+ [

## Unsupported parameters
](#copy-parameters-data-source-dynamodb-unsupported-parms)

**Important**  
If the DynamoDB table doesn't reside in the same Region as your Amazon Redshift cluster, you must use the REGION parameter to specify the Region in which the data is located. 

## Syntax


```
FROM 'dynamodb://table-name' 
authorization
READRATIO ratio
[ REGION [AS] 'aws_region' ]
[ optional-parameters ]
```

## Examples


The following example loads data from a DynamoDB table. 

```
copy favoritemovies from 'dynamodb://ProductCatalog'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
readratio 50;
```

### Parameters


FROM  
The source of the data to be loaded. 

'dynamodb://*table-name*'  <a name="copy-dynamodb"></a>
The name of the DynamoDB table that contains the data, for example `'dynamodb://ProductCatalog'`. For details about how DynamoDB attributes are mapped to Amazon Redshift columns, see [Loading data from an Amazon DynamoDB table](t_Loading-data-from-dynamodb.md).  
A DynamoDB table name is unique to an Amazon account, which is identified by the Amazon access credentials.

*authorization*  
The COPY command needs authorization to access data in another Amazon resource, including in Amazon S3, Amazon EMR, DynamoDB, and Amazon EC2. You can provide that authorization by referencing an Amazon Identity and Access Management (IAM) role that is attached to your cluster (role-based access control) or by providing the access credentials for a user (key-based access control). For increased security and flexibility, we recommend using IAM role-based access control. For more information, see [Authorization parameters](copy-parameters-authorization.md).

READRATIO [AS] *ratio*  <a name="copy-readratio"></a>
The percentage of the DynamoDB table's provisioned throughput to use for the data load. READRATIO is required for COPY from DynamoDB. It can't be used with COPY from Amazon S3. We highly recommend setting the ratio to a value less than the average unused provisioned throughput. Valid values are integers 1–200.  
Setting READRATIO to 100 or higher enables Amazon Redshift to consume the entirety of the DynamoDB table's provisioned throughput, which seriously degrades the performance of concurrent read operations against the same table during the COPY session. Write traffic is unaffected. Values higher than 100 are allowed to troubleshoot rare scenarios when Amazon Redshift fails to fulfill the provisioned throughput of the table. If you load data from DynamoDB to Amazon Redshift on an ongoing basis, consider organizing your DynamoDB tables as a time series to separate live traffic from the COPY operation.

## Optional parameters


You can optionally specify the following parameters with COPY from Amazon DynamoDB: 
+ [Column mapping options](copy-parameters-column-mapping.md)
+ The following data conversion parameters are supported:
  + [ACCEPTANYDATE](copy-parameters-data-conversion.md#copy-acceptanydate) 
  + [BLANKSASNULL](copy-parameters-data-conversion.md#copy-blanksasnull) 
  + [DATEFORMAT](copy-parameters-data-conversion.md#copy-dateformat) 
  + [EMPTYASNULL](copy-parameters-data-conversion.md#copy-emptyasnull) 
  + [ROUNDEC](copy-parameters-data-conversion.md#copy-roundec) 
  + [TIMEFORMAT](copy-parameters-data-conversion.md#copy-timeformat) 
  + [TRIMBLANKS](copy-parameters-data-conversion.md#copy-trimblanks) 
  + [TRUNCATECOLUMNS](copy-parameters-data-conversion.md#copy-truncatecolumns) 
+ [Data load operations](copy-parameters-data-load.md)

## Unsupported parameters


You can't use the following parameters with COPY from DynamoDB: 
+ All data format parameters
+ ESCAPE
+ FILLRECORD
+ IGNOREBLANKLINES
+ IGNOREHEADER
+ NULL
+ REMOVEQUOTES
+ ACCEPTINVCHARS
+ MANIFEST
+ ENCRYPTED

# Authorization parameters

The COPY command needs authorization to access data in another Amazon resource, including in Amazon S3, Amazon EMR, Amazon DynamoDB, and Amazon EC2. You can provide that authorization by referencing an [Amazon Identity and Access Management (IAM) role](https://docs.amazonaws.cn/IAM/latest/UserGuide/id_roles.html) that is attached to your cluster (*role-based access control*). You can encrypt your load data on Amazon S3. 

The following topics provide more details and examples of authentication options:
+ [IAM permissions for COPY, UNLOAD, and CREATE LIBRARY](copy-usage_notes-access-permissions.md#copy-usage_notes-iam-permissions)
+ [Role-based access control](copy-usage_notes-access-permissions.md#copy-usage_notes-access-role-based)
+ [Key-based access control](copy-usage_notes-access-permissions.md#copy-usage_notes-access-key-based)

Use one of the following to provide authorization for the COPY command: 
+ [Using the IAM_ROLE parameter](#copy-iam-role)
+ [Using the ACCESS_KEY_ID and SECRET_ACCESS_KEY parameters](#copy-access-key-id)
+ [Using the CREDENTIALS parameter](#copy-credentials)

## Using the IAM_ROLE parameter

### IAM_ROLE


Use the default keyword to have Amazon Redshift use the IAM role that is set as default and associated with the cluster when the COPY command runs. 

Use the Amazon Resource Name (ARN) for an IAM role that your cluster uses for authentication and authorization. If you specify IAM_ROLE, you can't use ACCESS_KEY_ID and SECRET_ACCESS_KEY, SESSION_TOKEN, or CREDENTIALS.

The following shows the syntax for the IAM_ROLE parameter. 

```
IAM_ROLE { default | 'arn:aws:iam::<Amazon Web Services account-id>:role/<role-name>' }
```

For more information, see [Role-based access control](copy-usage_notes-access-permissions.md#copy-usage_notes-access-role-based). 
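For example, the following sketch uses the `default` keyword so that COPY runs with the cluster's default associated IAM role; the table and bucket names are placeholders:

```
copy sales
from 's3://amzn-s3-demo-bucket/data/sales.txt'
iam_role default;
```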

## Using the ACCESS_KEY_ID and SECRET_ACCESS_KEY parameters

### ACCESS_KEY_ID, SECRET_ACCESS_KEY


This authorization method is not recommended. 

**Note**  
Instead of providing access credentials as plain text, we strongly recommend using role-based authentication by specifying the IAM_ROLE parameter. For more information, see [Role-based access control](copy-usage_notes-access-permissions.md#copy-usage_notes-access-role-based). 

### SESSION_TOKEN


The session token for use with temporary access credentials. When SESSION_TOKEN is specified, you must also use ACCESS_KEY_ID and SECRET_ACCESS_KEY to provide temporary access key credentials. If you specify SESSION_TOKEN, you can't use IAM_ROLE or CREDENTIALS. For more information, see [Temporary security credentials](copy-usage_notes-access-permissions.md#r_copy-temporary-security-credentials) in the IAM User Guide.

**Note**  
Instead of creating temporary security credentials, we strongly recommend using role-based authentication. When you authorize using an IAM role, Amazon Redshift automatically creates temporary user credentials for each session. For more information, see [Role-based access control](copy-usage_notes-access-permissions.md#copy-usage_notes-access-role-based). 

The following shows the syntax for the SESSION_TOKEN parameter with the ACCESS_KEY_ID and SECRET_ACCESS_KEY parameters. 

```
ACCESS_KEY_ID '<access-key-id>'
SECRET_ACCESS_KEY '<secret-access-key>'
SESSION_TOKEN '<temporary-token>';
```

If you specify SESSION_TOKEN, you can't use CREDENTIALS or IAM_ROLE. 

## Using the CREDENTIALS parameter

### CREDENTIALS


A clause that indicates the method your cluster will use when accessing other Amazon resources that contain data files or manifest files. You can't use the CREDENTIALS parameter with IAM_ROLE or ACCESS_KEY_ID and SECRET_ACCESS_KEY.

The following shows the syntax for the CREDENTIALS parameter.

```
[WITH] CREDENTIALS [AS] 'credentials-args'
```

**Note**  
For increased flexibility, we recommend using the [IAM_ROLE](#copy-iam-role-iam) parameter instead of the CREDENTIALS parameter.

Optionally, if the [ENCRYPTED](copy-parameters-data-source-s3.md#copy-encrypted) parameter is used, the *credentials-args* string also provides the encryption key.

The *credentials-args* string is case-sensitive and must not contain spaces.

The keywords WITH and AS are optional and are ignored.

You can specify either [role-based access control](copy-usage_notes-access-permissions.md#copy-usage_notes-access-role-based.phrase) or [key-based access control](copy-usage_notes-access-permissions.md#copy-usage_notes-access-key-based.phrase). In either case, the IAM role or user must have the permissions required to access the specified Amazon resources. For more information, see [IAM permissions for COPY, UNLOAD, and CREATE LIBRARY](copy-usage_notes-access-permissions.md#copy-usage_notes-iam-permissions). 

**Note**  
To safeguard your Amazon credentials and protect sensitive data, we strongly recommend using role-based access control. 

To specify role-based access control, provide the *credentials-args* string in the following format.

```
'aws_iam_role=arn:aws:iam::<aws-account-id>:role/<role-name>'
```

To use temporary token credentials, you must provide the temporary access key ID, the temporary secret access key, and the temporary token. The *credentials-args* string is in the following format. 

```
CREDENTIALS
'aws_access_key_id=<temporary-access-key-id>;aws_secret_access_key=<temporary-secret-access-key>;token=<temporary-token>'
```

A COPY command using key-based access control with temporary credentials would resemble the following sample statement: 

```
COPY customer FROM 's3://amzn-s3-demo-bucket/mydata' 
CREDENTIALS
'aws_access_key_id=<temporary-access-key-id>;aws_secret_access_key=<temporary-secret-access-key>;token=<temporary-token>'
```

 For more information, see [Temporary security credentials](copy-usage_notes-access-permissions.md#r_copy-temporary-security-credentials).

If the [ENCRYPTED](copy-parameters-data-source-s3.md#copy-encrypted) parameter is used, the *credentials-args* string is in the following format, where *<root-key>* is the value of the root key that was used to encrypt the files.

```
CREDENTIALS
'<credentials-args>;master_symmetric_key=<root-key>'
```

A COPY command using role-based access control with an encryption key would resemble the following sample statement:

```
COPY customer FROM 's3://amzn-s3-demo-bucket/mydata' 
CREDENTIALS 
'aws_iam_role=arn:aws:iam::<account-id>:role/<role-name>;master_symmetric_key=<root-key>'
```

# Column mapping options


By default, COPY inserts values into the target table's columns in the same order as fields occur in the data files. If the default column order will not work, you can specify a column list or use JSONPath expressions to map source data fields to the target columns. 
+ [Column List](#copy-column-list)
+ [JSONPaths File](#copy-column-mapping-jsonpaths)

## Column list


You can specify a comma-separated list of column names to load source data fields into specific target columns. The columns can be in any order in the COPY statement, but when loading from flat files, such as in an Amazon S3 bucket, their order must match the order of the source data. 

When loading from an Amazon DynamoDB table, order doesn't matter. The COPY command matches attribute names in the items retrieved from the DynamoDB table to column names in the Amazon Redshift table. For more information, see [Loading data from an Amazon DynamoDB table](t_Loading-data-from-dynamodb.md).

 The format for a column list is as follows.

```
COPY tablename (column1 [,column2, ...]) 
```

If a column in the target table is omitted from the column list, then COPY loads the target column's [DEFAULT](r_CREATE_TABLE_NEW.md#create-table-default) expression.

If the target column doesn't have a default, then COPY attempts to load NULL.

If COPY attempts to assign NULL to a column that is defined as NOT NULL, the COPY command fails. 

If an [IDENTITY](r_CREATE_TABLE_NEW.md#identity-clause) column is included in the column list, then [EXPLICIT_IDS](copy-parameters-data-conversion.md#copy-explicit-ids) must also be specified; if an IDENTITY column is omitted, then EXPLICIT_IDS can't be specified. If no column list is specified, the command behaves as if a complete, in-order column list was specified, with IDENTITY columns omitted if EXPLICIT_IDS was also not specified.

If a column is defined with GENERATED BY DEFAULT AS IDENTITY, then it can be copied. Values are generated or updated with values that you supply. The EXPLICIT_IDS option isn't required. COPY doesn't update the identity high watermark. For more information, see [GENERATED BY DEFAULT AS IDENTITY](r_CREATE_TABLE_NEW.md#identity-generated-bydefault-clause). 
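As an illustration of the column-list behavior described above, the following hypothetical command loads only two columns; the table, column, and bucket names are placeholders. Columns omitted from the list receive their DEFAULT expressions, or NULL if no default is defined:

```
copy venue (venueid, venuename)
from 's3://amzn-s3-demo-bucket/data/venue.txt'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole';
```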

## JSONPaths file


When loading from data files in JSON or Avro format, COPY automatically maps the data elements in the JSON or Avro source data to the columns in the target table. It does so by matching field names in the JSON or Avro source data to column names in the target table or column list.

In some cases, your column names and field names don't match, or you need to map to deeper levels in the data hierarchy. In these cases, you can use a JSONPaths file to explicitly map JSON or Avro data elements to columns. 

For more information, see [JSONPaths file](copy-parameters-data-format.md#copy-json-jsonpaths). 

# Data format parameters


By default, the COPY command expects the source data to be character-delimited UTF-8 text. The default delimiter is a pipe character ( | ). If the source data is in another format, use the following parameters to specify the data format: 
+ [FORMAT](#copy-format)
+ [CSV](#copy-csv)
+ [DELIMITER](#copy-delimiter) 
+ [FIXEDWIDTH](#copy-fixedwidth) 
+ [SHAPEFILE](#copy-shapefile) 
+ [AVRO](#copy-avro) 
+ [JSON format for COPY](#copy-json) 
+ [PARQUET](#copy-parquet) 
+ [ORC](#copy-orc) 

In addition to the standard data formats, COPY supports the following columnar data formats for COPY from Amazon S3: 
+ [ORC](#copy-orc) 
+ [PARQUET](#copy-parquet) 

COPY from columnar format is supported with certain restrictions. For more information, see [COPY from columnar data formats](copy-usage_notes-copy-from-columnar.md). <a name="copy-data-format-parameters"></a>

FORMAT [AS]  <a name="copy-format"></a>
(Optional) Identifies data format keywords. The FORMAT arguments are described following.

CSV [ QUOTE [AS] *'quote_character'* ]  <a name="copy-csv"></a>
Enables use of CSV format in the input data. To automatically escape delimiters, newline characters, and carriage returns, enclose the field in the character specified by the QUOTE parameter. The default quotation mark character is a double quotation mark ( " ). When the quotation mark character is used within a field, escape the character with an additional quotation mark character. For example, if the quotation mark character is a double quotation mark, to insert the string `A "quoted" word` the input file should include the string `"A ""quoted"" word"`. When the CSV parameter is used, the default delimiter is a comma ( , ). You can specify a different delimiter by using the DELIMITER parameter.   
When a field is enclosed in quotation marks, white space between the delimiters and the quotation mark characters is ignored. If the delimiter is a white space character, such as a tab, the delimiter isn't treated as white space.  
CSV can't be used with FIXEDWIDTH, REMOVEQUOTES, or ESCAPE.     
QUOTE [AS] *'quote_character'*  <a name="copy-csv-quote"></a>
Optional. Specifies the character to be used as the quotation mark character when using the CSV parameter. The default is a double quotation mark ( " ). If you use the QUOTE parameter to define a quotation mark character other than double quotation mark, you don't need to escape double quotation marks within the field. The QUOTE parameter can be used only with the CSV parameter. The AS keyword is optional.
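For example, the following sketch loads CSV data that uses a percent sign as the quotation mark character; the bucket name and role ARN are placeholders:

```
copy category
from 's3://amzn-s3-demo-bucket/data/category_csv.txt'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
csv quote as '%';
```

Because QUOTE redefines the quotation mark character, literal double quotation marks inside fields don't need to be escaped in this case.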

DELIMITER [AS] ['*delimiter_char*']   <a name="copy-delimiter"></a>
Specifies characters that are used to separate fields in the input file, such as a pipe character ( `|` ), a comma ( `,` ), a tab ( `\t` ), or multiple characters such as `|~|`. Non-printable characters are supported. Characters can also be represented in octal as their UTF-8 code units. For octal, use the format '\ddd', where 'd' is an octal digit (0–7). The default delimiter is a pipe character ( `|` ), unless the CSV parameter is used, in which case the default delimiter is a comma ( `,` ). The AS keyword is optional. DELIMITER can't be used with FIXEDWIDTH.
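For example, the following sketch loads a tab-delimited file; the bucket name and role ARN are placeholders:

```
copy sales
from 's3://amzn-s3-demo-bucket/data/sales_tab.txt'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
delimiter '\t';
```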

FIXEDWIDTH '*fixedwidth_spec*'   <a name="copy-fixedwidth"></a>
Loads the data from a file where each column width is a fixed length, rather than columns being separated by a delimiter. The *fixedwidth_spec* is a string that specifies a user-defined column label and column width. The column label can be either a text string or an integer, depending on what the user chooses. The column label has no relation to the column name. The order of the label/width pairs must match the order of the table columns exactly. FIXEDWIDTH can't be used with CSV or DELIMITER. In Amazon Redshift, the length of CHAR and VARCHAR columns is expressed in bytes, so be sure that the column width that you specify accommodates the binary length of multibyte characters when preparing the file to be loaded. For more information, see [Character types](r_Character_types.md).   
The format for *fixedwidth_spec* is shown following:   

```
'colLabel1:colWidth1,colLabel2:colWidth2, ...'
```
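For example, the following sketch loads a fixed-width file into a five-column table; the bucket name, role ARN, and the labels and widths in the spec are placeholders:

```
copy venue
from 's3://amzn-s3-demo-bucket/data/venue_fw.txt'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
fixedwidth 'venueid:3,venuename:25,venuecity:12,venuestate:2,venueseats:6';
```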

SHAPEFILE [ SIMPLIFY [AUTO] [*'tolerance'*] ]  <a name="copy-shapefile"></a>
Enables use of SHAPEFILE format in the input data. By default, the first column of the shapefile is either a `GEOMETRY` or `IDENTITY` column. All subsequent columns follow the order specified in the shapefile.  
You can't use SHAPEFILE with FIXEDWIDTH, REMOVEQUOTES, or ESCAPE.   
To use `GEOGRAPHY` objects with `COPY FROM SHAPEFILE`, first ingest into a `GEOMETRY` column, and then cast the objects to `GEOGRAPHY` objects.    
SIMPLIFY [*tolerance*]  <a name="copy-shapefile-simplify"></a>
(Optional) Simplifies all geometries during the ingestion process using the Ramer-Douglas-Peucker algorithm and the given tolerance.   
SIMPLIFY AUTO [*tolerance*]  <a name="copy-shapefile-simplify"></a>
(Optional) Simplifies only geometries that are larger than the maximum geometry size. This simplification uses the Ramer-Douglas-Peucker algorithm and the automatically calculated tolerance if this doesn't exceed the specified tolerance. This algorithm calculates the size to store objects within the tolerance specified. The *tolerance* value is optional.
For examples of loading shapefiles, see [Loading a shapefile into Amazon Redshift](r_COPY_command_examples.md#copy-example-spatial-copy-shapefile).

AVRO [AS] '*avro_option*'  <a name="copy-avro"></a>
Specifies that the source data is in Avro format.   
Avro format is supported for COPY from these services and protocols:  
+ Amazon S3 
+ Amazon EMR 
+ Remote hosts (SSH) 
Avro isn't supported for COPY from DynamoDB.   
Avro is a data serialization protocol. An Avro source file includes a schema that defines the structure of the data. The Avro schema type must be `record`. COPY accepts Avro files created using the default uncompressed codec as well as the `deflate` and `snappy` compression codecs. For more information about Avro, go to [Apache Avro](https://avro.apache.org/).   
Valid values for *avro_option* are as follows:  
+ `'auto'`
+ `'auto ignorecase'`
+ `'s3://jsonpaths_file'` 
The default is `'auto'`.  
COPY automatically maps the data elements in the Avro source data to the columns in the target table. It does so by matching field names in the Avro schema to column names in the target table. The matching is case-sensitive for `'auto'` and isn't case-sensitive for `'auto ignorecase'`.   
Column names in Amazon Redshift tables are always lowercase, so when you use the `'auto'` option, matching field names must also be lowercase. If the field names aren't all lowercase, you can use the `'auto ignorecase'` option. With the default `'auto'` argument, COPY recognizes only the first level of fields, or *outer fields*, in the structure.   
To explicitly map column names to Avro field names, you can use a [JSONPaths file](#copy-json-jsonpaths).   
By default, COPY attempts to match all columns in the target table to Avro field names. To load a subset of the columns, you can optionally specify a column list. If a column in the target table is omitted from the column list, COPY loads the target column's [DEFAULT](r_CREATE_TABLE_NEW.md#create-table-default) expression. If the target column doesn't have a default, COPY attempts to load NULL. If a column is included in the column list and COPY doesn't find a matching field in the Avro data, COPY attempts to load NULL to the column.   
If COPY attempts to assign NULL to a column that is defined as NOT NULL, the COPY command fails.   
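For illustration, a minimal COPY command that loads Avro data using the default `'auto'` mapping might look like the following. The table name, bucket path, and credentials are placeholders.

```
COPY person
FROM 's3://amzn-s3-demo-bucket/data/person.avro'
CREDENTIALS 'credentials-args'
FORMAT AS AVRO 'auto';
```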
<a name="copy-avro-schema"></a>**Avro Schema**  
An Avro source data file includes a schema that defines the structure of the data. COPY reads the schema that is part of the Avro source data file to map data elements to target table columns. The following example shows an Avro schema.   

```
{
    "name": "person",
    "type": "record",
    "fields": [
        {"name": "id", "type": "int"},
        {"name": "guid", "type": "string"},
        {"name": "name", "type": "string"},
        {"name": "address", "type": "string"}]
}
```
The Avro schema is defined using JSON format. The top-level JSON object contains three name-value pairs with the names, or *keys*, `"name"`, `"type"`, and `"fields"`.   
The `"fields"` key pairs with an array of objects that define the name and data type of each field in the data structure. By default, COPY automatically matches the field names to column names. Column names are always lowercase, so matching field names must also be lowercase, unless you specify the `'auto ignorecase'` option. Any field names that don't match a column name are ignored. Order doesn't matter. In the previous example, COPY maps to the column names `id`, `guid`, `name`, and `address`.   
With the default `'auto'` argument, COPY matches only the first-level objects to columns. To map to deeper levels in the schema, or if field names and column names don't match, use a JSONPaths file to define the mapping. For more information, see [JSONPaths file](#copy-json-jsonpaths).   
If the value associated with a key is a complex Avro data type such as byte, array, record, map, or link, COPY loads the value as a string. Here, the string is the JSON representation of the data. COPY loads Avro enum data types as strings, where the content is the name of the type. For an example, see [COPY from JSON format](copy-usage_notes-copy-from-json.md).  
The maximum size of the Avro file header, which includes the schema and file metadata, is 1 MB.     
The maximum size of a single Avro data block is 4 MB. This is distinct from the maximum row size. If the maximum size of a single Avro data block is exceeded, even if the resulting row size is less than the 4 MB row-size limit, the COPY command fails.   
In calculating row size, Amazon Redshift internally counts pipe characters ( \| ) twice. If your input data contains a very large number of pipe characters, it is possible for row size to exceed 4 MB even if the data block is less than 4 MB.

JSON [AS] '*json\_option*'  <a name="copy-json"></a>
The source data is in JSON format.   
JSON format is supported for COPY from these services and protocols:  
+ Amazon S3 
+ Amazon EMR 
+ Remote hosts (SSH) 
JSON isn't supported for COPY from DynamoDB.   
Valid values for *json\_option* are as follows:  
+ `'auto'`
+ `'auto ignorecase'`
+ `'s3://jsonpaths_file'` 
+ `'noshred'` 
The default is `'auto'`.  
By default, COPY attempts to match all columns in the target table to JSON field name keys. To load a subset of the columns, you can optionally specify a column list. If the JSON field name keys aren't all lowercase, you can use the `'auto ignorecase'` option or a [JSONPaths file](#copy-json-jsonpaths) to explicitly map column names to JSON field name keys.  
If a column in the target table is omitted from the column list, then COPY loads the target column's [DEFAULT](r_CREATE_TABLE_NEW.md#create-table-default) expression. If the target column doesn't have a default, COPY attempts to load NULL. If a column is included in the column list and COPY doesn't find a matching field in the JSON data, COPY attempts to load NULL to the column.   
If COPY attempts to assign NULL to a column that is defined as NOT NULL, the COPY command fails.   
COPY maps the data elements in the JSON source data to the columns in the target table. It does so by matching *object keys*, or names, in the source name-value pairs to the names of columns in the target table.   
Refer to the following details about each *json\_option* value:    
'auto'  <a name="copy-json-auto"></a>
With this option, matching is case-sensitive. Column names in Amazon Redshift tables are always lowercase, so when you use the `'auto'` option, matching JSON field names must also be lowercase.  
'auto ignorecase'  <a name="copy-json-auto-ignorecase"></a>
With this option, the matching isn't case-sensitive. Column names in Amazon Redshift tables are always lowercase, so when you use the `'auto ignorecase'` option, the corresponding JSON field names can be lowercase, uppercase, or mixed-case.   
's3://*jsonpaths\_file*'  <a name="copy-json-pathfile"></a>
With this option, COPY uses the named JSONPaths file to map the data elements in the JSON source data to the columns in the target table. The *`s3://jsonpaths_file`* argument must be an Amazon S3 object key that explicitly references a single file. An example is `'s3://amzn-s3-demo-bucket/jsonpaths.txt'`. The argument can't be a key prefix. For more information about using a JSONPaths file, see [JSONPaths file](#copy-json-jsonpaths).  
In some cases, the file specified by `jsonpaths_file` has the same prefix as the path specified by `copy_from_s3_objectpath` for the data files. If so, COPY reads the JSONPaths file as a data file and returns errors. For example, suppose that your data files use the object path `s3://amzn-s3-demo-bucket/my_data.json` and your JSONPaths file is `s3://amzn-s3-demo-bucket/my_data.jsonpaths`. In this case, COPY attempts to load `my_data.jsonpaths` as a data file.  
'noshred'  <a name="copy-json-noshred"></a>
With this option, Amazon Redshift doesn't shred the attributes of JSON structures into multiple columns while loading a JSON document.
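As a sketch, the *json_option* value is supplied directly in the COPY command. In the following example, the table name, bucket path, and credentials are placeholders.

```
COPY category
FROM 's3://amzn-s3-demo-bucket/data/category.json'
CREDENTIALS 'credentials-args'
FORMAT AS JSON 'auto ignorecase';
```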

## JSON data file


The JSON data file contains a set of either objects or arrays. COPY loads each JSON object or array into one row in the target table. Each object or array corresponding to a row must be a stand-alone, root-level structure; that is, it must not be a member of another JSON structure.

A JSON *object* begins and ends with braces  ( \{ \} ) and contains an unordered collection of name-value pairs. Each paired name and value are separated by a colon, and the pairs are separated by commas. By default, the *object key*, or name, in the name-value pairs must match the name of the corresponding column in the table. Column names in Amazon Redshift tables are always lowercase, so matching JSON field name keys must also be lowercase. If your column names and JSON keys don't match, use a [JSONPaths file](#copy-json-jsonpaths) to explicitly map columns to keys. 

Order in a JSON object doesn't matter. Any names that don't match a column name are ignored. The following shows the structure of a simple JSON object.

```
{
  "column1": "value1",
  "column2": value2,
  "notacolumn" : "ignore this value"
}
```

A JSON *array* begins and ends with brackets ( [  ] ), and contains an ordered collection of values separated by commas. If your data files use arrays, you must specify a JSONPaths file to match the values to columns. The following shows the structure of a simple JSON array. 

```
["value1", value2]
```

The JSON must be well-formed. For example, the objects or arrays can't be separated by commas or any other characters except white space. Strings must be enclosed in double quotation mark characters. Quote characters must be simple quotation marks (0x22), not slanted or "smart" quotation marks.

The maximum size of a single JSON object or array, including braces or brackets, is 4 MB. This is distinct from the maximum row size. If the maximum size of a single JSON object or array is exceeded, even if the resulting row size is less than the 4 MB row-size limit, the COPY command fails. 

In calculating row size, Amazon Redshift internally counts pipe characters ( \| ) twice. If your input data contains a very large number of pipe characters, it is possible for row size to exceed 4 MB even if the object size is less than 4 MB.

COPY loads `\n` as a newline character and loads `\t` as a tab character. To load a backslash, escape it with a backslash ( `\\` ).

COPY searches the specified JSON source for a well-formed, valid JSON object or array. If COPY encounters any non–white-space characters before locating a usable JSON structure, or between valid JSON objects or arrays, COPY returns an error for each instance. These errors count toward the MAXERROR error count. When the error count equals or exceeds MAXERROR, COPY fails. 

For each error, Amazon Redshift records a row in the STL\_LOAD\_ERRORS system table. The LINE\_NUMBER column records the last line of the JSON object that caused the error. 

If IGNOREHEADER is specified, COPY ignores the specified number of lines in the JSON data. Newline characters in the JSON data are always counted for IGNOREHEADER calculations. 

COPY loads empty strings as empty fields by default. If EMPTYASNULL is specified, COPY loads empty strings for CHAR and VARCHAR fields as NULL. Empty strings for other data types, such as INT, are always loaded with NULL. 
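For example, a command combining these behaviors might be sketched as follows, loading JSON while treating empty strings as NULL and tolerating up to 10 bad records. The table name, bucket path, and credentials are placeholders.

```
COPY event
FROM 's3://amzn-s3-demo-bucket/data/events.json'
CREDENTIALS 'credentials-args'
FORMAT AS JSON 'auto'
EMPTYASNULL
MAXERROR 10;
```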

The following options aren't supported with JSON: 
+ CSV
+ DELIMITER 
+ ESCAPE
+ FILLRECORD 
+ FIXEDWIDTH
+ IGNOREBLANKLINES
+ NULL AS
+ READRATIO
+ REMOVEQUOTES 

For more information, see [COPY from JSON format](copy-usage_notes-copy-from-json.md). For more information about JSON data structures, go to [www.json.org](https://www.json.org/). 

## JSONPaths file


If you are loading from JSON-formatted or Avro source data, by default COPY maps the first-level data elements in the source data to the columns in the target table. It does so by matching each name, or object key, in a name-value pair to the name of a column in the target table. 

If your column names and object keys don't match, or to map to deeper levels in the data hierarchy, you can use a JSONPaths file to explicitly map JSON or Avro data elements to columns. The JSONPaths file maps JSON data elements to columns by matching the column order in the target table or column list.

The JSONPaths file must contain only a single JSON object (not an array). The JSON object is a name-value pair. The *object key*, which is the name in the name-value pair, must be `"jsonpaths"`. The *value* in the name-value pair is an array of *JSONPath expressions*. Each JSONPath expression references a single element in the JSON data hierarchy or Avro schema, similarly to how an XPath expression refers to elements in an XML document. For more information, see [JSONPath expressions](#copy-json-jsonpath-expressions).

To use a JSONPaths file, add the JSON or AVRO keyword to the COPY command. Specify the S3 bucket name and object path of the JSONPaths file using the following format.

```
COPY tablename 
FROM 'data_source' 
CREDENTIALS 'credentials-args' 
FORMAT AS { AVRO | JSON } 's3://jsonpaths_file';
```

The `s3://jsonpaths_file` value must be an Amazon S3 object key that explicitly references a single file, such as `'s3://amzn-s3-demo-bucket/jsonpaths.txt'`. It can't be a key prefix. 

In some cases, if you're loading from Amazon S3 the file specified by `jsonpaths_file` has the same prefix as the path specified by `copy_from_s3_objectpath` for the data files. If so, COPY reads the JSONPaths file as a data file and returns errors. For example, suppose that your data files use the object path `s3://amzn-s3-demo-bucket/my_data.json` and your JSONPaths file is `s3://amzn-s3-demo-bucket/my_data.jsonpaths`. In this case, COPY attempts to load `my_data.jsonpaths` as a data file.

If the key name is any string other than `"jsonpaths"`, the COPY command doesn't return an error, but it ignores *jsonpaths\_file* and uses the `'auto'` argument instead. 

If any of the following occurs, the COPY command fails:
+ The JSON is malformed.
+ There is more than one JSON object.
+ Any characters except white space exist outside the object.
+ An array element is an empty string or isn't a string.

MAXERROR doesn't apply to the JSONPaths file. 

The JSONPaths file must not be encrypted, even if the [ENCRYPTED](copy-parameters-data-source-s3.md#copy-encrypted) option is specified.

For more information, see [COPY from JSON format](copy-usage_notes-copy-from-json.md). 

## JSONPath expressions


The JSONPaths file uses JSONPath expressions to map data fields to target columns. Each JSONPath expression corresponds to one column in the Amazon Redshift target table. The order of the JSONPath array elements must match the order of the columns in the target table or the column list, if a column list is used. 

The double quotation mark characters are required as shown, both for the field names and the values. The quotation mark characters must be simple quotation marks (0x22), not slanted or "smart" quotation marks.

If an object element referenced by a JSONPath expression isn't found in the JSON data, COPY attempts to load a NULL value. If the referenced object is malformed, COPY returns a load error. 

If an array element referenced by a JSONPath expression isn't found in the JSON or Avro data, COPY fails with the following error: `Invalid JSONPath format: Not an array or index out of range.` Remove any array elements from the JSONPaths that don't exist in the source data and verify that the arrays in the source data are well formed.  

The JSONPath expressions can use either bracket notation or dot notation, but you can't mix notations. The following example shows JSONPath expressions using bracket notation. 

```
{
    "jsonpaths": [
        "$['venuename']",
        "$['venuecity']",
        "$['venuestate']",
        "$['venueseats']"
    ]
}
```

The following example shows JSONPath expressions using dot notation. 

```
{
    "jsonpaths": [
        "$.venuename",
        "$.venuecity",
        "$.venuestate",
        "$.venueseats"
    ]
}
```

In the context of Amazon Redshift COPY syntax, a JSONPath expression must specify the explicit path to a single name element in a JSON or Avro hierarchical data structure. Amazon Redshift doesn't support any JSONPath elements, such as wildcard characters or filter expressions, that might resolve to an ambiguous path or multiple name elements.
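For example, suppose the source JSON nests location data under a `location` object (a hypothetical structure). Each dot-notation expression still resolves to exactly one element:

```
{
    "jsonpaths": [
        "$.venuename",
        "$.location.city",
        "$.location.state"
    ]
}
```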

For more information, see [COPY from JSON format](copy-usage_notes-copy-from-json.md). 

## Using JSONPaths with Avro Data


The following example shows an Avro schema with multiple levels.

```
{
    "name": "person",
    "type": "record",
    "fields": [
        {"name": "id", "type": "int"},
        {"name": "guid", "type": "string"},
        {"name": "isActive", "type": "boolean"},
        {"name": "age", "type": "int"},
        {"name": "name", "type": "string"},
        {"name": "address", "type": "string"},
        {"name": "latitude", "type": "double"},
        {"name": "longitude", "type": "double"},
        {
            "name": "tags",
            "type": {
                        "type" : "array",
                        "name" : "inner_tags",
                        "items" : "string"
                    }
        },
        {
            "name": "friends",
            "type": {
                        "type" : "array",
                        "name" : "inner_friends",
                        "items" : {
                                    "name" : "friends_record",
                                    "type" : "record",
                                    "fields" : [
                                                 {"name" : "id", "type" : "int"},
                                                 {"name" : "name", "type" : "string"}
                                               ]
                                  }
                    }
        },
        {"name": "randomArrayItem", "type": "string"}
    ]
}
```

The following example shows a JSONPaths file that uses AvroPath expressions to reference the previous schema. 

```
{
    "jsonpaths": [
        "$.id",
        "$.guid",
        "$.address",
        "$.friends[0].id"
    ]
}
```

The JSONPaths example includes the following elements:

jsonpaths  
The name of the JSON object that contains the AvroPath expressions.

[ … ]  
Brackets enclose the JSON array that contains the path elements.

$  
The dollar sign refers to the root element in the Avro schema, which is the `"fields"` array.

"$.id",  
The target of the AvroPath expression. In this instance, the target is the element in the `"fields"` array with the name `"id"`. The expressions are separated by commas.

"$.friends[0].id"  
Brackets indicate an array index. JSONPath expressions use zero-based indexing, so this expression references the first element in the `"friends"` array with the name `"id"`.

The Avro schema syntax requires using *inner fields* to define the structure of record and array data types. The inner fields are ignored by the AvroPath expressions. For example, the field `"friends"` defines an array named `"inner_friends"`, which in turn defines a record named `"friends_record"`. The AvroPath expression to reference the field `"id"` can ignore the extra fields to reference the target field directly. The following AvroPath expressions reference the two fields that belong to the `"friends"` array.

```
"$.friends[0].id"
"$.friends[0].name"
```
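Putting the pieces together, a COPY command that applies such a JSONPaths file to the Avro data might be sketched as follows. The table name, bucket paths, and credentials are placeholders.

```
COPY person
FROM 's3://amzn-s3-demo-bucket/data/person.avro'
CREDENTIALS 'credentials-args'
FORMAT AS AVRO 's3://amzn-s3-demo-bucket/jsonpaths.txt';
```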

## Columnar data format parameters


In addition to the standard data formats, COPY supports the following columnar data formats for COPY from Amazon S3. COPY from columnar format is supported with certain restrictions. For more information, see [COPY from columnar data formats](copy-usage_notes-copy-from-columnar.md). 

ORC  <a name="copy-orc"></a>
Loads the data from a file that uses Optimized Row Columnar (ORC) file format. 

PARQUET  <a name="copy-parquet"></a>
Loads the data from a file that uses Parquet file format. 
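As a sketch, columnar formats are specified with the FORMAT AS keyword and take no *jsonpaths* argument. The table name, bucket path, and credentials in the following example are placeholders.

```
COPY listing
FROM 's3://amzn-s3-demo-bucket/data/listing/'
CREDENTIALS 'credentials-args'
FORMAT AS PARQUET;
```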

# File compression parameters


You can load from compressed data files by specifying the following parameters.

BZIP2   <a name="copy-bzip2"></a>
A value that specifies that the input file or files are in compressed bzip2 format (.bz2 files). The COPY operation reads each compressed file and uncompresses the data as it loads.

GZIP   <a name="copy-gzip"></a>
A value that specifies that the input file or files are in compressed gzip format (.gz files). The COPY operation reads each compressed file and uncompresses the data as it loads.

LZOP   <a name="copy-lzop"></a>
A value that specifies that the input file or files are in compressed lzop format (.lzo files). The COPY operation reads each compressed file and uncompresses the data as it loads.  
COPY doesn't support files that are compressed using the lzop *--filter* option.

ZSTD   <a name="copy-zstd"></a>
A value that specifies that the input file or files are in compressed Zstandard format (.zst files). The COPY operation reads each compressed file and uncompresses the data as it loads.  
ZSTD is supported only with COPY from Amazon S3.
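For example, loading gzip-compressed, pipe-delimited files might be sketched as follows; the compression parameter is simply appended to the command. The table name, bucket path, and credentials are placeholders.

```
COPY venue
FROM 's3://amzn-s3-demo-bucket/data/venue.txt.gz'
CREDENTIALS 'credentials-args'
DELIMITER '|'
GZIP;
```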

# Data conversion parameters


As it loads the table, COPY attempts to implicitly convert the strings in the source data to the data type of the target column. If you need to specify a conversion that is different from the default behavior, or if the default conversion results in errors, you can manage data conversions by specifying the following parameters. For more information on the syntax of these parameters, see [COPY syntax](https://docs.amazonaws.cn/redshift/latest/dg/r_COPY.html#r_COPY-syntax).
+ [ACCEPTANYDATE](#copy-acceptanydate) 
+ [ACCEPTINVCHARS](#copy-acceptinvchars) 
+ [BLANKSASNULL](#copy-blanksasnull) 
+ [DATEFORMAT](#copy-dateformat) 
+ [EMPTYASNULL](#copy-emptyasnull) 
+ [ENCODING](#copy-encoding) 
+ [ESCAPE](#copy-escape) 
+ [EXPLICIT_IDS](#copy-explicit-ids) 
+ [FILLRECORD](#copy-fillrecord) 
+ [IGNOREBLANKLINES](#copy-ignoreblanklines) 
+ [IGNOREHEADER](#copy-ignoreheader) 
+ [NULL AS](#copy-null-as) 
+ [REMOVEQUOTES](#copy-removequotes) 
+ [ROUNDEC](#copy-roundec) 
+ [TIMEFORMAT](#copy-timeformat) 
+ [TRIMBLANKS](#copy-trimblanks) 
+ [TRUNCATECOLUMNS](#copy-truncatecolumns) <a name="copy-data-conversion-parameters"></a>

ACCEPTANYDATE   <a name="copy-acceptanydate"></a>
Allows any date format, including invalid formats such as `00/00/00 00:00:00`, to be loaded without generating an error. This parameter applies only to TIMESTAMP and DATE columns. Always use ACCEPTANYDATE with the DATEFORMAT parameter. If the date format for the data doesn't match the DATEFORMAT specification, Amazon Redshift inserts a NULL value into that field.
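Because ACCEPTANYDATE should always be paired with DATEFORMAT, a minimal sketch looks like the following. The table name, bucket path, and credentials are placeholders.

```
COPY sales
FROM 's3://amzn-s3-demo-bucket/data/sales.txt'
CREDENTIALS 'credentials-args'
DATEFORMAT 'MM-DD-YYYY'
ACCEPTANYDATE;
```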

ACCEPTINVCHARS [AS] ['*replacement\_char*']   <a name="copy-acceptinvchars"></a>
Enables loading of data into VARCHAR columns even if the data contains invalid UTF-8 characters. When ACCEPTINVCHARS is specified, COPY replaces each invalid UTF-8 character with a string of equal length consisting of the character specified by *replacement\_char*. For example, if the replacement character is '`^`', an invalid three-byte character will be replaced with '`^^^`'.  
 The replacement character can be any ASCII character except NULL. The default is a question mark ( ? ). For information about invalid UTF-8 characters, see [Multibyte character load errors](multi-byte-character-load-errors.md).  
COPY returns the number of rows that contained invalid UTF-8 characters, and it adds an entry to the [STL\_REPLACEMENTS](r_STL_REPLACEMENTS.md) system table for each affected row, up to a maximum of 100 rows for each node slice. Additional invalid UTF-8 characters are also replaced, but those replacement events aren't recorded.  
If ACCEPTINVCHARS isn't specified, COPY returns an error whenever it encounters an invalid UTF-8 character.   
ACCEPTINVCHARS is valid only for VARCHAR columns.

BLANKSASNULL   <a name="copy-blanksasnull"></a>
Loads blank fields, which consist of only white space characters, as NULL. This option applies only to CHAR and VARCHAR columns. Blank fields for other data types, such as INT, are always loaded with NULL. For example, a string that contains three space characters in succession (and no other characters) is loaded as a NULL. The default behavior, without this option, is to load the space characters as is. 

DATEFORMAT [AS] \{'*dateformat\_string*' \| 'auto'\}  <a name="copy-dateformat"></a>
If no DATEFORMAT is specified, the default format is `'YYYY-MM-DD'`. For example, an alternative valid format is `'MM-DD-YYYY'`.   
If the COPY command doesn't recognize the format of your date or time values, or if your date or time values use different formats, use the `'auto'` argument with the DATEFORMAT or TIMEFORMAT parameter. The `'auto'` argument recognizes several formats that aren't supported when using a DATEFORMAT and TIMEFORMAT string. The `'auto'` keyword is case-sensitive. For more information, see [Using automatic recognition with DATEFORMAT and TIMEFORMAT](automatic-recognition.md).   
The date format can include time information (hour, minutes, seconds), but this information is ignored. The AS keyword is optional. For more information, see [DATEFORMAT and TIMEFORMAT strings](r_DATEFORMAT_and_TIMEFORMAT_strings.md).

EMPTYASNULL   <a name="copy-emptyasnull"></a>
Indicates that Amazon Redshift should load empty CHAR and VARCHAR fields as NULL. Empty fields for other data types, such as INT, are always loaded with NULL. Empty fields occur when data contains two delimiters in succession with no characters between the delimiters. EMPTYASNULL and NULL AS '' (empty string) produce the same behavior.

ENCODING [AS] *file\_encoding*  <a name="copy-encoding"></a>
Specifies the encoding type of the load data. The COPY command converts the data from the specified encoding into UTF-8 during loading.   
Valid values for *file\_encoding* are as follows:  
+ `UTF8`
+ `UTF16`
+ `UTF16LE`
+ `UTF16BE`
+ `ISO88591`
The default is `UTF8`.  
Source file names must use UTF-8 encoding.  
The following files must use UTF-8 encoding, even if a different encoding is specified for the load data:  
+ Manifest files
+ JSONPaths files
The argument strings provided with the following parameters must use UTF-8:  
+ FIXEDWIDTH '*fixedwidth\_spec*'
+ ACCEPTINVCHARS '*replacement\_char*'
+ DATEFORMAT '*dateformat\_string*'
+ TIMEFORMAT '*timeformat\_string*'
+ NULL AS '*null\_string*'
Fixed-width data files must use UTF-8 encoding. The field widths are based on the number of characters, not the number of bytes.   
All load data must use the specified encoding. If COPY encounters a different encoding, it skips the file and returns an error.   
If you specify `UTF16`, then your data must have a byte order mark (BOM). If you know whether your UTF-16 data is little-endian (LE) or big-endian (BE), you can use `UTF16LE` or `UTF16BE`, regardless of the presence of a BOM.   
To use ISO-8859-1 encoding specify `ISO88591`. For more information, see [ISO/IEC 8859-1](https://en.wikipedia.org/wiki/ISO/IEC_8859-1) in *Wikipedia*.
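As a sketch, loading UTF-16 little-endian, tab-delimited data might look like the following. The table name, bucket path, and credentials are placeholders.

```
COPY custdata
FROM 's3://amzn-s3-demo-bucket/data/custdata.txt'
CREDENTIALS 'credentials-args'
ENCODING AS UTF16LE
DELIMITER '\t';
```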

ESCAPE   <a name="copy-escape"></a>
When this parameter is specified, the backslash character (`\`) in input data is treated as an escape character. The character that immediately follows the backslash character is loaded into the table as part of the current column value, even if it is a character that normally serves a special purpose. For example, you can use this parameter to escape the delimiter character, a quotation mark, an embedded newline character, or the escape character itself when any of these characters is a legitimate part of a column value.  
If you specify the ESCAPE parameter in combination with the REMOVEQUOTES parameter, you can escape and retain quotation marks (`'` or `"`) that might otherwise be removed. The default null string, `\N`, works as is, but it can also be escaped in the input data as `\\N`. As long as you don't specify an alternative null string with the NULL AS parameter, `\N` and `\\N` produce the same results.  
The control character `0x00` (NUL) can't be escaped and should be removed from the input data or converted. This character is treated as an end of record (EOR) marker, causing the remainder of the record to be truncated.
You can't use the ESCAPE parameter for FIXEDWIDTH loads, and you can't specify the escape character itself; the escape character is always the backslash character. Also, you must ensure that the input data contains the escape character in the appropriate places.  
Here are some examples of input data and the resulting loaded data when the ESCAPE parameter is specified. The result for row 4 assumes that the REMOVEQUOTES parameter is also specified. The input data consists of two pipe-delimited fields:   

```
1|The quick brown fox\[newline]
jumped over the lazy dog.
2| A\\B\\C
3| A \| B \| C
4| 'A Midsummer Night\'s Dream'
```
The data loaded into column 2 looks like this:   

```
The quick brown fox
jumped over the lazy dog.
A\B\C
A|B|C
A Midsummer Night's Dream
```
Applying the escape character to the input data for a load is the responsibility of the user. One exception to this requirement is when you reload data that was previously unloaded with the ESCAPE parameter. In this case, the data will already contain the necessary escape characters.
The ESCAPE parameter doesn't interpret octal, hex, Unicode, or other escape sequence notation. For example, if your source data contains the octal line feed value (`\012`) and you try to load this data with the ESCAPE parameter, Amazon Redshift loads the value `012` into the table and doesn't interpret this value as a line feed that is being escaped.  
In order to escape newline characters in data that originates from Microsoft Windows platforms, you might need to use two escape characters: one for the carriage return and one for the line feed. Alternatively, you can remove the carriage returns before loading the file (for example, by using the dos2unix utility).
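A command that would produce the loaded results shown above, assuming pipe-delimited input, might be sketched as follows. The table name, bucket path, and credentials are placeholders.

```
COPY escape_demo
FROM 's3://amzn-s3-demo-bucket/data/escape_input.txt'
CREDENTIALS 'credentials-args'
DELIMITER '|'
ESCAPE
REMOVEQUOTES;
```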

EXPLICIT\_IDS   <a name="copy-explicit-ids"></a>
Use EXPLICIT\_IDS with tables that have IDENTITY columns if you want to override the autogenerated values with explicit values from the source data files for the tables. If the command includes a column list, that list must include the IDENTITY columns to use this parameter. The data format for EXPLICIT\_IDS values must match the IDENTITY format specified by the CREATE TABLE definition.  
When you run a COPY command against a table with the EXPLICIT\_IDS option, Amazon Redshift does not check the uniqueness of IDENTITY columns in the table.  
If a column is defined with GENERATED BY DEFAULT AS IDENTITY, then it can be copied. Values are generated or updated with values that you supply. The EXPLICIT\_IDS option isn't required. COPY doesn't update the identity high watermark.  
For an example of a COPY command using EXPLICIT\_IDS, see [Load VENUE with explicit values for an IDENTITY column](r_COPY_command_examples.md#r_COPY_command_examples-load-venue-with-explicit-values-for-an-identity-column).

FILLRECORD   <a name="copy-fillrecord"></a>
Allows data files to be loaded when contiguous columns are missing at the end of some of the records. The missing columns are loaded as NULLs. For text and CSV formats, if the missing column is a VARCHAR column, zero-length strings are loaded instead of NULLs. To load NULLs to VARCHAR columns from text and CSV, specify the EMPTYASNULL keyword. NULL substitution only works if the column definition allows NULLs.  
For example, if the table definition contains four nullable CHAR columns, and a record contains the values `apple, orange, banana, mango`, the COPY command could load and fill in a record that contains only the values `apple, orange`. The missing CHAR values would be loaded as NULL values.

IGNOREBLANKLINES   <a name="copy-ignoreblanklines"></a>
Ignores blank lines that only contain a line feed in a data file and does not try to load them.

IGNOREHEADER [ AS ] *number\_rows*   <a name="copy-ignoreheader"></a>
Treats the specified *number\_rows* as a file header and doesn't load them. Use IGNOREHEADER to skip file headers in all files in a parallel load.

NULL AS '*null\_string*'   <a name="copy-null-as"></a>
Loads fields that match *null\_string* as NULL, where *null\_string* can be any string. If your data includes a null terminator, also referred to as NUL (UTF-8 0000) or binary zero (0x000), COPY treats it as any other character. For example, a record containing '1' \|\| NUL \|\| '2' is copied as a string of length 3 bytes. If a field contains only NUL, you can use NULL AS to replace the null terminator with NULL by specifying `'\0'` or `'\000'`, for example, `NULL AS '\0'` or `NULL AS '\000'`. If a field contains a string that ends with NUL and NULL AS is specified, the string is inserted with NUL at the end. Do not use '\n' (newline) for the *null\_string* value. Amazon Redshift reserves '\n' for use as a line delimiter. The default *null\_string* is `'\N'`.  
If you attempt to load nulls into a column defined as NOT NULL, the COPY command will fail.
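For example, a command that loads fields containing only a NUL terminator as NULL might be sketched as follows. The table name, bucket path, and credentials are placeholders.

```
COPY product
FROM 's3://amzn-s3-demo-bucket/data/product.txt'
CREDENTIALS 'credentials-args'
NULL AS '\0';
```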

REMOVEQUOTES   <a name="copy-removequotes"></a>
Removes surrounding quotation marks from strings in the incoming data. All characters within the quotation marks, including delimiters, are retained. If a string has a beginning single or double quotation mark but no corresponding ending mark, the COPY command fails to load that row and returns an error. The following table shows some simple examples of strings that contain quotation marks and the resulting loaded values.      
[\[See the AWS documentation website for more details\]](http://docs.amazonaws.cn/en_us/redshift/latest/dg/copy-parameters-data-conversion.html)

ROUNDEC   <a name="copy-roundec"></a>
Rounds up numeric values when the scale of the input value is greater than the scale of the column. By default, COPY truncates values when necessary to fit the scale of the column. For example, if a value of `20.259` is loaded into a DECIMAL(8,2) column, COPY truncates the value to `20.25` by default. If ROUNDEC is specified, COPY rounds the value to `20.26`. The INSERT command always rounds values when necessary to match the column's scale, so a COPY command with the ROUNDEC parameter behaves the same as an INSERT command.
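For instance, with the hypothetical load below, an input value of `20.259` destined for a DECIMAL(8,2) column would be stored as `20.26` rather than truncated to `20.25`:

```
-- the sales table and file path are illustrative
copy sales
from 's3://amzn-s3-demo-bucket/data/sales_pipe.txt'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
roundec;
```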

TIMEFORMAT [AS] \{'*timeformat\_string*' \| 'auto' \| 'epochsecs' \| 'epochmillisecs'\}  <a name="copy-timeformat"></a>
Specifies the time format. If no TIMEFORMAT is specified, the default format is `YYYY-MM-DD HH:MI:SS` for TIMESTAMP columns or `YYYY-MM-DD HH:MI:SSOF` for TIMESTAMPTZ columns, where `OF` is the offset from Coordinated Universal Time (UTC). You can't include a time zone specifier in the *timeformat\_string*. To load TIMESTAMPTZ data that is in a format different from the default format, specify 'auto'; for more information, see [Using automatic recognition with DATEFORMAT and TIMEFORMAT](automatic-recognition.md). For more information about *timeformat\_string*, see [DATEFORMAT and TIMEFORMAT strings](r_DATEFORMAT_and_TIMEFORMAT_strings.md).  
The `'auto'` argument recognizes several formats that aren't supported when using a DATEFORMAT and TIMEFORMAT string. If the COPY command doesn't recognize the format of your date or time values, or if your date and time values use formats different from each other, use the `'auto'` argument with the DATEFORMAT or TIMEFORMAT parameter. For more information, see [Using automatic recognition with DATEFORMAT and TIMEFORMAT](automatic-recognition.md).   
If your source data is represented as epoch time, that is the number of seconds or milliseconds since January 1, 1970, 00:00:00 UTC, specify `'epochsecs'` or `'epochmillisecs'`.   
The `'auto'`, `'epochsecs'`, and `'epochmillisecs'` keywords are case-sensitive.  
The AS keyword is optional.
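For example, if the input file stores times as seconds since the epoch, a load might be sketched as follows (the table and file path are illustrative):

```
-- the event table and file path are illustrative
copy event
from 's3://amzn-s3-demo-bucket/data/event_epoch.txt'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
timeformat 'epochsecs';
```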

TRIMBLANKS   <a name="copy-trimblanks"></a>
Removes the trailing white space characters from a VARCHAR string. This parameter applies only to columns with a VARCHAR data type.

TRUNCATECOLUMNS   <a name="copy-truncatecolumns"></a>
Truncates data in columns to the appropriate number of characters so that it fits the column specification. Applies only to columns with a VARCHAR or CHAR data type, and rows 4 MB or less in size.
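A minimal sketch, assuming a table with VARCHAR columns that may receive over-length strings (the table and file path are illustrative):

```
-- the product table and file path are illustrative
copy product
from 's3://amzn-s3-demo-bucket/data/product_pipe.txt'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
truncatecolumns;
```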

# Data load operations


Manage the default behavior of the load operation for troubleshooting or to reduce load times by specifying the following parameters. 
+ [COMPROWS](#copy-comprows) 
+ [COMPUPDATE](#copy-compupdate) 
+ [IGNOREALLERRORS](#copy-ignoreallerrors) 
+ [MAXERROR](#copy-maxerror) 
+ [NOLOAD](#copy-noload) 
+ [STATUPDATE](#copy-statupdate)

## Parameters<a name="copy-data-load-parameters"></a>

COMPROWS *numrows*   <a name="copy-comprows"></a>
Specifies the number of rows to be used as the sample size for compression analysis. The analysis is run on rows from each data slice. For example, if you specify `COMPROWS 1000000` (1,000,000) and the system contains four total slices, no more than 250,000 rows for each slice are read and analyzed.  
If COMPROWS isn't specified, the sample size defaults to 100,000 for each slice. Values of COMPROWS lower than the default of 100,000 rows for each slice are automatically upgraded to the default value. However, automatic compression will not take place if the amount of data being loaded is insufficient to produce a meaningful sample.  
If the COMPROWS number is greater than the number of rows in the input file, the COPY command still proceeds and runs the compression analysis on all of the available rows. The accepted range for this argument is a number between 1000 and 2147483647 (2,147,483,647).
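For example, the following hypothetical load samples up to 1,000,000 rows in total for compression analysis:

```
-- the lineitem table and file path are illustrative
copy lineitem
from 's3://amzn-s3-demo-bucket/data/lineitem_pipe.txt'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
comprows 1000000;
```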

COMPUPDATE [ PRESET \| \{ ON \| TRUE \} \| \{ OFF \| FALSE \} ]  <a name="copy-compupdate"></a>
Controls whether compression encodings are automatically applied during a COPY.   
When COMPUPDATE is PRESET, the COPY command chooses the compression encoding for each column if the target table is empty, even if the columns already have encodings other than RAW. Currently specified column encodings can be replaced. Encoding for each column is based on the column data type. No data is sampled. Amazon Redshift automatically assigns compression encoding as follows:  
+ Columns that are defined as sort keys are assigned RAW compression.
+ Columns that are defined as BOOLEAN, REAL, or DOUBLE PRECISION data types are assigned RAW compression.
+ Columns that are defined as SMALLINT, INTEGER, BIGINT, DECIMAL, DATE, TIMESTAMP, or TIMESTAMPTZ are assigned AZ64 compression.
+ Columns that are defined as CHAR or VARCHAR are assigned LZO compression.
When COMPUPDATE is omitted, the COPY command chooses the compression encoding for each column only if the target table is empty and you have not specified an encoding (other than RAW) for any of the columns. The encoding for each column is determined by Amazon Redshift. No data is sampled.   
When COMPUPDATE is ON (or TRUE), or COMPUPDATE is specified without an option, the COPY command applies automatic compression if the table is empty, even if the table columns already have encodings other than RAW. Currently specified column encodings can be replaced. Encoding for each column is based on an analysis of sample data. For more information, see [Loading tables with automatic compression](c_Loading_tables_auto_compress.md).  
When COMPUPDATE is OFF (or FALSE), automatic compression is disabled. Column encodings aren't changed.  
For information about the system table to analyze compression, see [STL\_ANALYZE\_COMPRESSION](r_STL_ANALYZE_COMPRESSION.md). 
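As a sketch, the following hypothetical COPY asks Amazon Redshift to assign encodings by data type, without sampling, when the target table is empty:

```
-- the customer table and file path are illustrative
copy customer
from 's3://amzn-s3-demo-bucket/data/customer_pipe.txt'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
compupdate preset;
```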

IGNOREALLERRORS   <a name="copy-ignoreallerrors"></a>
You can specify this option to ignore all errors that occur during the load operation.   
You can't specify the IGNOREALLERRORS option if you specify the MAXERROR option. You can't specify the IGNOREALLERRORS option for columnar formats including ORC and Parquet.

MAXERROR [AS] *error\_count*   <a name="copy-maxerror"></a>
If the load returns the *error\_count* number of errors or greater, the load fails. If the load returns fewer errors, it continues and returns an INFO message that states the number of rows that could not be loaded. Use this parameter to allow loads to continue when certain rows fail to load into the table because of formatting errors or other inconsistencies in the data.   
Set this value to `0` or `1` if you want the load to fail as soon as the first error occurs. The AS keyword is optional. The MAXERROR default value is `0` and the limit is `100000`.  
 The actual number of errors reported might be greater than the specified MAXERROR because of the parallel nature of Amazon Redshift. If any node in the Amazon Redshift cluster detects that MAXERROR has been exceeded, each node reports all of the errors it has encountered.
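For example, the following sketch lets a load continue through as many as nine bad rows; the tenth error fails the load (the table and file path are illustrative):

```
-- tolerate up to 9 bad rows; the 10th error fails the load
copy listing
from 's3://amzn-s3-demo-bucket/data/listings_pipe.txt'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
maxerror 10;
```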

NOLOAD   <a name="copy-noload"></a>
Checks the validity of the data file without actually loading the data. Use the NOLOAD parameter to make sure that your data file loads without any errors before running the actual data load. Running COPY with the NOLOAD parameter is much faster than loading the data because it only parses the files.
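A dry-run validation of the same kind of load might be sketched as follows (the table and file path are illustrative):

```
-- parse and validate the file without loading any rows
copy listing
from 's3://amzn-s3-demo-bucket/data/listings_pipe.txt'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
noload;
```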

STATUPDATE [ \{ ON \| TRUE \} \| \{ OFF \| FALSE \} ]  <a name="copy-statupdate"></a>
Governs automatic computation and refresh of optimizer statistics at the end of a successful COPY command. By default, if the STATUPDATE parameter isn't used, statistics are updated automatically if the table is initially empty.  
Whenever ingesting data into a nonempty table significantly changes the size of the table, we recommend updating statistics either by running an [ANALYZE](r_ANALYZE.md) command or by using the STATUPDATE ON argument.  
With STATUPDATE ON (or TRUE), statistics are updated automatically regardless of whether the table is initially empty. If STATUPDATE is used, the current user must be either the table owner or a superuser. If STATUPDATE is not specified, only INSERT permission is required.  
With STATUPDATE OFF (or FALSE), statistics are never updated.  
For additional information, see [Analyzing tables](t_Analyzing_tables.md).
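For example, to force a statistics refresh after appending to a nonempty table, a load might look like the following sketch (the table and file path are illustrative; the current user must be the table owner or a superuser):

```
copy listing
from 's3://amzn-s3-demo-bucket/data/listings_pipe.txt'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
statupdate on;
```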

# Alphabetical parameter list


The following list provides links to each COPY command parameter description, sorted alphabetically.
+ [ACCEPTANYDATE](copy-parameters-data-conversion.md#copy-acceptanydate)
+ [ACCEPTINVCHARS](copy-parameters-data-conversion.md#copy-acceptinvchars)
+ [ACCESS\_KEY\_ID, SECRET\_ACCESS\_KEY](copy-parameters-authorization.md#copy-access-key-id-access)
+ [AVRO](copy-parameters-data-format.md#copy-avro)
+ [BLANKSASNULL](copy-parameters-data-conversion.md#copy-blanksasnull)
+ [BZIP2](copy-parameters-file-compression.md#copy-bzip2) 
+ [COMPROWS](copy-parameters-data-load.md#copy-comprows)
+ [COMPUPDATE](copy-parameters-data-load.md#copy-compupdate)
+ [CREDENTIALS](copy-parameters-authorization.md#copy-credentials-cred)
+ [CSV](copy-parameters-data-format.md#copy-csv)
+ [DATEFORMAT](copy-parameters-data-conversion.md#copy-dateformat)
+ [DELIMITER](copy-parameters-data-format.md#copy-delimiter)
+ [EMPTYASNULL](copy-parameters-data-conversion.md#copy-emptyasnull)
+ [ENCODING](copy-parameters-data-conversion.md#copy-encoding)
+ [ENCRYPTED](copy-parameters-data-source-s3.md#copy-encrypted)
+ [ESCAPE](copy-parameters-data-conversion.md#copy-escape)
+ [EXPLICIT_IDS](copy-parameters-data-conversion.md#copy-explicit-ids)
+ [FILLRECORD](copy-parameters-data-conversion.md#copy-fillrecord)
+ [FIXEDWIDTH](copy-parameters-data-format.md#copy-fixedwidth)
+ [FORMAT](copy-parameters-data-format.md#copy-format)
+ [FROM](copy-parameters-data-source-s3.md#copy-parameters-from)
+ [GZIP](copy-parameters-file-compression.md#copy-gzip)
+ [IAM\_ROLE](copy-parameters-authorization.md#copy-iam-role-iam)
+ [IGNOREALLERRORS](copy-parameters-data-load.md#copy-ignoreallerrors)
+ [IGNOREBLANKLINES](copy-parameters-data-conversion.md#copy-ignoreblanklines)
+ [IGNOREHEADER](copy-parameters-data-conversion.md#copy-ignoreheader)
+ [JSON format for COPY](copy-parameters-data-format.md#copy-json)
+ [LZOP](copy-parameters-file-compression.md#copy-lzop)
+ [MANIFEST](copy-parameters-data-source-s3.md#copy-manifest)
+ [MASTER_SYMMETRIC_KEY](copy-parameters-data-source-s3.md#copy-master-symmetric-key)
+ [MAXERROR](copy-parameters-data-load.md#copy-maxerror)
+ [NOLOAD](copy-parameters-data-load.md#copy-noload)
+ [NULL AS](copy-parameters-data-conversion.md#copy-null-as)
+ [READRATIO](copy-parameters-data-source-dynamodb.md#copy-readratio)
+ [REGION](copy-parameters-data-source-s3.md#copy-region)
+ [REMOVEQUOTES](copy-parameters-data-conversion.md#copy-removequotes)
+ [ROUNDEC](copy-parameters-data-conversion.md#copy-roundec)
+ [SESSION\_TOKEN](copy-parameters-authorization.md#copy-token)
+ [SHAPEFILE](copy-parameters-data-format.md#copy-shapefile)
+ [SSH](copy-parameters-data-source-ssh.md#copy-ssh)
+ [STATUPDATE](copy-parameters-data-load.md#copy-statupdate)
+ [TIMEFORMAT](copy-parameters-data-conversion.md#copy-timeformat)
+ [TRIMBLANKS](copy-parameters-data-conversion.md#copy-trimblanks)
+ [TRUNCATECOLUMNS](copy-parameters-data-conversion.md#copy-truncatecolumns)
+ [ZSTD](copy-parameters-file-compression.md#copy-zstd)

# Usage notes

**Topics**
+ [

# Permissions to access other Amazon Resources
](copy-usage_notes-access-permissions.md)
+ [

# Using COPY with Amazon S3 access point aliases
](copy-usage_notes-s3-access-point-alias.md)
+ [

# Loading multibyte data from Amazon S3
](copy-usage_notes-multi-byte.md)
+ [

# Loading a column of the GEOMETRY or GEOGRAPHY data type
](copy-usage_notes-spatial-data.md)
+ [

# Loading the HLLSKETCH data type
](copy-usage_notes-hll.md)
+ [

# Loading a column of the VARBYTE data type
](copy-usage-varbyte.md)
+ [

# Errors when reading multiple files
](copy-usage_notes-multiple-files.md)
+ [

# COPY from JSON format
](copy-usage_notes-copy-from-json.md)
+ [

# COPY from columnar data formats
](copy-usage_notes-copy-from-columnar.md)
+ [

# DATEFORMAT and TIMEFORMAT strings
](r_DATEFORMAT_and_TIMEFORMAT_strings.md)
+ [

# Using automatic recognition with DATEFORMAT and TIMEFORMAT
](automatic-recognition.md)

# Permissions to access other Amazon Resources

 To move data between your cluster and another Amazon resource, such as Amazon S3, Amazon DynamoDB, Amazon EMR, or Amazon EC2, your cluster must have permission to access the resource and perform the necessary actions. For example, to load data from Amazon S3, COPY must have LIST access to the bucket and GET access for the bucket objects. For information about minimum permissions, see [IAM permissions for COPY, UNLOAD, and CREATE LIBRARY](#copy-usage_notes-iam-permissions).

To get authorization to access the resource, your cluster must be authenticated. You can choose either of the following authentication methods: 
+ [Role-based access control](#copy-usage_notes-access-role-based) – For role-based access control, you specify an Amazon Identity and Access Management (IAM) role that your cluster uses for authentication and authorization. To safeguard your Amazon credentials and sensitive data, we strongly recommend using role-based authentication.
+ [Key-based access control](#copy-usage_notes-access-key-based) – For key-based access control, you provide the Amazon access credentials (access key ID and secret access key) for a user as plain text.

## Role-based access control


With <a name="copy-usage_notes-access-role-based.phrase"></a>role-based access control, your cluster temporarily assumes an IAM role on your behalf. Then, based on the authorizations granted to the role, your cluster can access the required Amazon resources.

Creating an IAM *role* is similar to granting permissions to a user, in that it is an Amazon identity with permissions policies that determine what the identity can and can't do in Amazon. However, instead of being uniquely associated with one user, a role can be assumed by any entity that needs it. Also, a role doesn’t have any credentials (a password or access keys) associated with it. Instead, if a role is associated with a cluster, access keys are created dynamically and provided to the cluster.

We recommend using role-based access control because it provides more secure, fine-grained control of access to Amazon resources and sensitive user data, in addition to safeguarding your Amazon credentials.

Role-based authentication delivers the following benefits:
+ You can use Amazon standard IAM tools to define an IAM role and associate the role with multiple clusters. When you modify the access policy for a role, the changes are applied automatically to all clusters that use the role.
+ You can define fine-grained IAM policies that grant permissions for specific clusters and database users to access specific Amazon resources and actions.
+ Your cluster obtains temporary session credentials at run time and refreshes the credentials as needed until the operation completes. If you use key-based temporary credentials, the operation fails if the temporary credentials expire before it completes.
+ Your access key ID and secret access key ID aren't stored or transmitted in your SQL code.

To use role-based access control, you must first create an IAM role using the Amazon Redshift service role type, and then attach the role to your cluster. The role must have, at a minimum, the permissions listed in [IAM permissions for COPY, UNLOAD, and CREATE LIBRARY](#copy-usage_notes-iam-permissions). For steps to create an IAM role and attach it to your cluster, see [Authorizing Amazon Redshift to Access Other Amazon Services On Your Behalf](https://docs.amazonaws.cn/redshift/latest/mgmt/authorizing-redshift-service.html) in the *Amazon Redshift Management Guide*.

You can add a role to a cluster or view the roles associated with a cluster by using the Amazon Redshift Management Console, CLI, or API. For more information, see [Associating an IAM Role With a Cluster](https://docs.amazonaws.cn/redshift/latest/mgmt/copy-unload-iam-role.html) in the *Amazon Redshift Management Guide*.

When you create an IAM role, IAM returns an Amazon Resource Name (ARN) for the role. To specify an IAM role, provide the role ARN with either the [IAM\_ROLE](copy-parameters-authorization.md#copy-iam-role) parameter or the [CREDENTIALS](copy-parameters-authorization.md#copy-credentials) parameter. 

For example, suppose the following role is attached to the cluster.

```
"IamRoleArn": "arn:aws:iam::0123456789012:role/MyRedshiftRole"
```

The following COPY command example uses the IAM\_ROLE parameter with the ARN in the previous example for authentication and access to Amazon S3.

```
copy customer from 's3://amzn-s3-demo-bucket/mydata'  
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole';
```

The following COPY command example uses the CREDENTIALS parameter to specify the IAM role.

```
copy customer from 's3://amzn-s3-demo-bucket/mydata' 
credentials 
'aws_iam_role=arn:aws:iam::0123456789012:role/MyRedshiftRole';
```

In addition, a superuser can grant the ASSUMEROLE privilege to database users and groups to provide access to a role for COPY operations. For information, see [GRANT](r_GRANT.md).

## Key-based access control


With <a name="copy-usage_notes-access-key-based.phrase"></a>key-based access control, you provide the access key ID and secret access key for an IAM user that is authorized to access the Amazon resources that contain the data. You can use either the [ACCESS\_KEY\_ID and SECRET\_ACCESS\_KEY](copy-parameters-authorization.md#copy-access-key-id) parameters together or the [CREDENTIALS](copy-parameters-authorization.md#copy-credentials) parameter.

**Note**  
We strongly recommend using an IAM role for authentication instead of supplying a plain-text access key ID and secret access key. If you choose key-based access control, never use your Amazon account (root) credentials. Always create an IAM user and provide that user's access key ID and secret access key. For steps to create an IAM user, see [Creating an IAM User in Your Amazon Account](https://docs.amazonaws.cn/IAM/latest/UserGuide/id_users_create.html).

To authenticate using ACCESS\_KEY\_ID and SECRET\_ACCESS\_KEY, replace *<access-key-id>* and *<secret-access-key>* with an authorized user's access key ID and full secret access key as shown following. 

```
ACCESS_KEY_ID '<access-key-id>'
SECRET_ACCESS_KEY '<secret-access-key>';
```

To authenticate using the CREDENTIALS parameter, replace *<access-key-id>* and *<secret-access-key>* with an authorized user's access key ID and full secret access key as shown following.

```
CREDENTIALS
'aws_access_key_id=<access-key-id>;aws_secret_access_key=<secret-access-key>';
```

The IAM user must have, at a minimum, the permissions listed in [IAM permissions for COPY, UNLOAD, and CREATE LIBRARY](#copy-usage_notes-iam-permissions).

### Temporary security credentials

 If you are using key-based access control, you can further limit the access users have to your data by using temporary security credentials. Role-based authentication automatically uses temporary credentials. 

**Note**  
We strongly recommend using [role-based access control](#copy-usage_notes-access-role-based.phrase) instead of creating temporary credentials and providing access key ID and secret access key as plain text. Role-based access control automatically uses temporary credentials. 

Temporary security credentials provide enhanced security because they have short lifespans and can't be reused after they expire. The access key ID and secret access key generated with the token can't be used without the token, and a user who has these temporary security credentials can access your resources only until the credentials expire.

To grant users temporary access to your resources, you call Amazon Security Token Service (Amazon STS) API operations. The Amazon STS API operations return temporary security credentials consisting of a security token, an access key ID, and a secret access key. You issue the temporary security credentials to the users who need temporary access to your resources. These users can be existing IAM users, or they can be non-Amazon users. For more information about creating temporary security credentials, see [Using Temporary Security Credentials](https://docs.amazonaws.cn/STS/latest/UsingSTS/Welcome.html) in the IAM User Guide.

You can use either the [ACCESS\_KEY\_ID and SECRET\_ACCESS\_KEY](copy-parameters-authorization.md#copy-access-key-id) parameters together with the [SESSION\_TOKEN](copy-parameters-authorization.md#copy-token) parameter or the [CREDENTIALS](copy-parameters-authorization.md#copy-credentials) parameter. You must also supply the access key ID and secret access key that were provided with the token.

To authenticate using ACCESS\_KEY\_ID, SECRET\_ACCESS\_KEY, and SESSION\_TOKEN, replace *<temporary-access-key-id>*, *<temporary-secret-access-key>*, and *<temporary-token>* as shown following. 

```
ACCESS_KEY_ID '<temporary-access-key-id>'
SECRET_ACCESS_KEY '<temporary-secret-access-key>'
SESSION_TOKEN '<temporary-token>';
```

To authenticate using CREDENTIALS, include `session_token=<temporary-token>` in the credentials string as shown following. 

```
CREDENTIALS
'aws_access_key_id=<temporary-access-key-id>;aws_secret_access_key=<temporary-secret-access-key>;session_token=<temporary-token>';
```

The following example shows a COPY command with temporary security credentials.

```
copy table-name
from 's3://objectpath'
access_key_id '<temporary-access-key-id>'
secret_access_key '<temporary-secret-access-key>'
session_token '<temporary-token>';
```

The following example loads the LISTING table with temporary credentials and file encryption.

```
copy listing
from 's3://amzn-s3-demo-bucket/data/listings_pipe.txt'
access_key_id '<temporary-access-key-id>'
secret_access_key '<temporary-secret-access-key>'
session_token '<temporary-token>'
master_symmetric_key '<root-key>'
encrypted;
```

The following example loads the LISTING table using the CREDENTIALS parameter with temporary credentials and file encryption.

```
copy listing
from 's3://amzn-s3-demo-bucket/data/listings_pipe.txt'
credentials 
'aws_access_key_id=<temporary-access-key-id>;aws_secret_access_key=<temporary-secret-access-key>;session_token=<temporary-token>;master_symmetric_key=<root-key>'
encrypted;
```

**Important**  
The temporary security credentials must be valid for the entire duration of the COPY or UNLOAD operation. If the temporary security credentials expire during the operation, the command fails and the transaction is rolled back. For example, if temporary security credentials expire after 15 minutes and the COPY operation requires one hour, the COPY operation fails before it completes. If you use role-based access, the temporary security credentials are automatically refreshed until the operation completes.

## IAM permissions for COPY, UNLOAD, and CREATE LIBRARY

The IAM role or user referenced by the CREDENTIALS parameter must have, at a minimum, the following permissions:
+ For COPY from Amazon S3, permission to LIST the Amazon S3 bucket and GET the Amazon S3 objects that are being loaded, and the manifest file, if one is used.
+ For COPY from Amazon S3, Amazon EMR, and remote hosts (SSH) with JSON-formatted data, permission to LIST and GET the JSONPaths file on Amazon S3, if one is used. 
+ For COPY from DynamoDB, permission to SCAN and DESCRIBE the DynamoDB table that is being loaded. 
+ For COPY from an Amazon EMR cluster, permission for the `ListInstances` action on the Amazon EMR cluster. 
+ For UNLOAD to Amazon S3, GET, LIST, and PUT permissions for the Amazon S3 bucket to which the data files are being unloaded.
+ For CREATE LIBRARY from Amazon S3, permission to LIST the Amazon S3 bucket and GET the Amazon S3 objects being imported.

**Note**  
If you receive the error message `S3ServiceException: Access Denied`, when running a COPY, UNLOAD, or CREATE LIBRARY command, your cluster doesn’t have proper access permissions for Amazon S3.

You can manage IAM permissions by attaching an IAM policy to an IAM role that is attached to your cluster, to a user, or to the group to which your user belongs. For example, the `AmazonS3ReadOnlyAccess` managed policy grants LIST and GET permissions to Amazon S3 resources. For more information about IAM policies, see [Managing IAM Policies](https://docs.amazonaws.cn/IAM/latest/UserGuide/access_policies_manage.html) in the *IAM User Guide*. 

# Using COPY with Amazon S3 access point aliases


COPY supports Amazon S3 access point aliases. For more information, see [Using a bucket–style alias for your access point](https://docs.amazonaws.cn/AmazonS3/latest/userguide/access-points-alias.html) in the *Amazon Simple Storage Service User Guide*.

# Loading multibyte data from Amazon S3


If your data includes non-ASCII multibyte characters (such as Chinese or Cyrillic characters), you must load the data to VARCHAR columns. The VARCHAR data type supports four-byte UTF-8 characters, but the CHAR data type only accepts single-byte ASCII characters. You can't load five-byte or longer characters into Amazon Redshift tables. For more information, see [Multibyte characters](c_Supported_data_types.md#c_Supported_data_types-multi-byte-characters). 

# Loading a column of the GEOMETRY or GEOGRAPHY data type


You can COPY to `GEOMETRY` or `GEOGRAPHY` columns from data in a character-delimited text file, such as a CSV file. The data must be in the hexadecimal form of the well-known binary format (either WKB or EWKB) or the well-known text format (either WKT or EWKT) and fit within the maximum size of a single input row to the COPY command. For more information, see [COPY](r_COPY.md). 

For information about how to load from a shapefile, see [Loading a shapefile into Amazon Redshift](spatial-copy-shapefile.md).

For more information about the `GEOMETRY` or `GEOGRAPHY` data type, see [Querying spatial data in Amazon Redshift](geospatial-overview.md).

# Loading the HLLSKETCH data type


You can copy HLL sketches only in sparse or dense format supported by Amazon Redshift. To use the COPY command on HyperLogLog sketches, use the Base64 format for dense HyperLogLog sketches and the JSON format for sparse HyperLogLog sketches. For more information, see [HyperLogLog functions](hyperloglog-functions.md). 

The following example imports data from a CSV file into a table using CREATE TABLE and COPY. First, the example creates the table `t1` using CREATE TABLE.

```
CREATE TABLE t1 (sketch hllsketch, a bigint);
```

Then it uses COPY to import data from a CSV file into the table `t1`. 

```
COPY t1 FROM 's3://amzn-s3-demo-bucket/unload/' IAM_ROLE 'arn:aws:iam::0123456789012:role/MyRedshiftRole' NULL AS 'null' CSV;
```

# Loading a column of the VARBYTE data type


You can load data from a file in CSV, Parquet, and ORC format. For CSV, the data is loaded from a file in hexadecimal representation of the VARBYTE data. You can't load VARBYTE data with the `FIXEDWIDTH` option. The `ADDQUOTES` or `REMOVEQUOTES` option of COPY is not supported. A VARBYTE column can't be used as a partition column. 
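As a sketch, a CSV file whose VARBYTE column holds hexadecimal text (for example, `6162` for the two bytes of `ab`) could be loaded as follows (the table and file path are illustrative):

```
-- t_varbyte and the file path are illustrative; the first CSV field
-- holds hexadecimal text that is loaded into the VARBYTE column
copy t_varbyte
from 's3://amzn-s3-demo-bucket/data/varbyte_hex.csv'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
csv;
```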

# Errors when reading multiple files


The COPY command is atomic and transactional. In other words, even when the COPY command reads data from multiple files, the entire process is treated as a single transaction. If COPY encounters an error reading a file, it automatically retries until the process times out (see [statement\_timeout](r_statement_timeout.md)) or until data can't be downloaded from Amazon S3 for a prolonged period of time (between 15 and 30 minutes), ensuring that each file is loaded only once. If the COPY command fails, the entire transaction is canceled and all changes are rolled back. For more information about handling load errors, see [Troubleshooting data loads](t_Troubleshooting_load_errors.md). 

After a COPY command is successfully initiated, it doesn't fail if the session terminates, for example when the client disconnects. However, if the COPY command is within a BEGIN … END transaction block that doesn't complete because the session terminates, the entire transaction, including the COPY, is rolled back. For more information about transactions, see [BEGIN](r_BEGIN.md).

# COPY from JSON format

The JSON data structure is made up of a set of objects or arrays. A JSON *object* begins and ends with braces, and contains an unordered collection of name-value pairs. Each name and value are separated by a colon, and the pairs are separated by commas. The name is a string in double quotation marks. The quotation mark characters must be simple quotation marks (0x22), not slanted or "smart" quotation marks. 

A JSON *array* begins and ends with brackets, and contains an ordered collection of values separated by commas. A value can be a string in double quotation marks, a number, a Boolean true or false, null, a JSON object, or an array. 

JSON objects and arrays can be nested, enabling a hierarchical data structure. The following example shows a JSON data structure with two valid objects. 

```
{
    "id": 1006410,
    "title": "Amazon Redshift Database Developer Guide"
}
{
    "id": 100540,
    "name": "Amazon Simple Storage Service User Guide"
}
```

The following shows the same data as two JSON arrays.

```
[
    1006410,
    "Amazon Redshift Database Developer Guide"
]
[
    100540,
    "Amazon Simple Storage Service User Guide"
]
```

## COPY options for JSON


You can specify the following options when using COPY with JSON format data: 
+ `'auto' `– COPY automatically loads fields from the JSON file. 
+ `'auto ignorecase'` – COPY automatically loads fields from the JSON file while ignoring the case of field names.
+ `s3://jsonpaths_file` – COPY uses a JSONPaths file to parse the JSON source data. A *JSONPaths file* is a text file that contains a single JSON object with the name `"jsonpaths"` paired with an array of JSONPath expressions. If the name is any string other than `"jsonpaths"`, COPY uses the `'auto'` argument instead of using the JSONPaths file. 

For examples that show how to load data using `'auto'`, `'auto ignorecase'`, or a JSONPaths file, and using either JSON objects or arrays, see [Copy from JSON examples](r_COPY_command_examples.md#r_COPY_command_examples-copy-from-json). 
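As a brief sketch, a COPY using automatic field matching might look like the following (the table and file path are illustrative):

```
-- the category table and file path are illustrative
copy category
from 's3://amzn-s3-demo-bucket/data/category_auto.json'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
json 'auto';
```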

## JSONPath option


In the Amazon Redshift COPY syntax, a JSONPath expression specifies the explicit path to a single name element in a JSON hierarchical data structure, using either bracket notation or dot notation. Amazon Redshift doesn't support any JSONPath elements, such as wildcard characters or filter expressions, that might resolve to an ambiguous path or multiple name elements. As a result, Amazon Redshift can't parse complex, multi-level data structures.

The following is an example of a JSONPaths file with JSONPath expressions using bracket notation. The dollar sign ($) represents the root-level structure. 

```
{
    "jsonpaths": [
       "$['id']",
       "$['store']['book']['title']",
       "$['location'][0]"
    ]
}
```

 In the previous example, `$['location'][0]` references the first element in an array. JSON uses zero-based array indexing. Array indexes must be non-negative integers (greater than or equal to zero).

The following example shows the previous JSONPaths file using dot notation. 

```
{
    "jsonpaths": [
       "$.id",
       "$.store.book.title",
       "$.location[0]"
    ]
}
```

You can't mix bracket notation and dot notation in the `jsonpaths` array. Brackets can be used in both bracket notation and dot notation to reference an array element. 

When using dot notation, the JSONPath expressions can't contain the following characters: 
+ Single straight quotation mark ( ' ) 
+ Period, or dot ( . ) 
+ Brackets ( [ ] ) unless used to reference an array element 

If the value in the name-value pair referenced by a JSONPath expression is an object or an array, the entire object or array is loaded as a string, including the braces or brackets. For example, suppose that your JSON data contains the following object. 

```
{
    "id": 0,
    "guid": "84512477-fa49-456b-b407-581d0d851c3c",
    "isActive": true,
    "tags": [
        "nisi",
        "culpa",
        "ad",
        "amet",
        "voluptate",
        "reprehenderit",
        "veniam"
    ],
    "friends": [
        {
            "id": 0,
            "name": "Martha Rivera"
        },
        {
            "id": 1,
            "name": "Renaldo"
        }
    ]
}
```

The JSONPath expression `$['tags']` then returns the following value. 

```
"["nisi","culpa","ad","amet","voluptate","reprehenderit","veniam"]" 
```

The JSONPath expression `$['friends'][1]` then returns the following value. 

```
"{"id": 1,"name": "Renaldo"}" 
```

Each JSONPath expression in the `jsonpaths` array corresponds to one column in the Amazon Redshift target table. The order of the `jsonpaths` array elements must match the order of the columns in the target table or the column list, if a column list is used. 
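To see how a `jsonpaths` array maps one JSON record to one table row, the following Python sketch resolves Redshift-style bracket-notation expressions against a parsed object. This is an illustration only: `resolve_jsonpath` is a hypothetical helper, not part of any AWS SDK, and it mimics how COPY loads nested objects and arrays as JSON strings.

```python
import json
import re

def resolve_jsonpath(expr, record):
    """Resolve a simple bracket-notation JSONPath (no wildcards or
    filters, matching Redshift's restrictions) against a parsed record."""
    # Split "$['friends'][1]" into [('friends', ''), ('', '1')].
    parts = re.findall(r"\['([^']+)'\]|\[(\d+)\]", expr)
    value = record
    for name, index in parts:
        value = value[name] if name else value[int(index)]
    if isinstance(value, (dict, list)):
        # Like COPY, load a nested object or array as its JSON text.
        return json.dumps(value, separators=(",", ":"))
    return value

record = {"id": 0,
          "tags": ["nisi", "culpa"],
          "friends": [{"id": 0, "name": "Martha Rivera"},
                      {"id": 1, "name": "Renaldo"}]}

# One expression per target column, in column order.
row = [resolve_jsonpath(p, record)
       for p in ("$['id']", "$['tags']", "$['friends'][1]")]
```

Note how the second and third expressions produce strings containing the JSON text of the array and object, matching the behavior described above.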

For examples that show how to load data using either the `'auto'` argument or a JSONPaths file, and using either JSON objects or arrays, see [Copy from JSON examples](r_COPY_command_examples.md#r_COPY_command_examples-copy-from-json). 

For information on how to copy multiple JSON files, see [Using a manifest to specify data files](loading-data-files-using-manifest.md).

## Escape characters in JSON


COPY loads `\n` as a newline character and loads `\t` as a tab character. To load a backslash, escape it with a backslash ( `\\` ).

For example, suppose you have the following JSON in a file named `escape.json` in the bucket `s3://amzn-s3-demo-bucket/json/`.

```
{
  "backslash": "This is a backslash: \\",
  "newline": "This sentence\n is on two lines.",
  "tab": "This sentence \t contains a tab."
}
```

Run the following commands to create the ESCAPES table and load the JSON.

```
create table escapes (backslash varchar(25), newline varchar(35), tab varchar(35));

copy escapes from 's3://amzn-s3-demo-bucket/json/escape.json' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
format as json 'auto';
```

Query the ESCAPES table to view the results.

```
select * from escapes;

       backslash        |      newline      |               tab
------------------------+-------------------+----------------------------------
 This is a backslash: \ | This sentence     | This sentence    contains a tab.
                        :  is on two lines.
(1 row)
```
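The decoding shown above can be reproduced with any JSON parser. This minimal Python check confirms that each two-character escape sequence in the source file becomes a single character in the loaded value.

```python
import json

# The same escape sequences as escape.json: \\ decodes to one
# backslash, \n to a newline, and \t to a tab.
raw = r'''{
  "backslash": "This is a backslash: \\",
  "newline": "This sentence\n is on two lines.",
  "tab": "This sentence \t contains a tab."
}'''
rec = json.loads(raw)
```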

## Loss of numeric precision


You might lose precision when loading numbers from data files in JSON format to a column that is defined as a numeric data type. Some floating point values aren't represented exactly in computer systems. As a result, data you copy from a JSON file might not be rounded as you expect. To avoid a loss of precision, we recommend using one of the following alternatives:
+ Represent the number as a string by enclosing the value in double quotation characters.
+ Use [ROUNDEC](copy-parameters-data-conversion.md#copy-roundec) to round the number instead of truncating.
+ Instead of using JSON or Avro files, use CSV, character-delimited, or fixed-width text files.
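The precision issue is easy to reproduce outside Redshift. In this hedged Python sketch, decoding a JSON number to a binary float silently changes its decimal digits, while quoting the value as a string (or parsing numbers with `Decimal`) preserves them.

```python
import json
from decimal import Decimal

# A bare JSON number becomes a binary float; 0.1 has no exact
# binary representation, so extra trailing digits appear.
float_val = json.loads('{"price": 0.1}')["price"]
drifted = f"{float_val:.20f}"  # more digits than the source had

# Workaround: quote the number so it stays exact text.
as_string = json.loads('{"price": "0.1"}')["price"]

# Or parse numbers with Decimal to keep the exact decimal digits.
as_decimal = json.loads('{"price": 0.1}', parse_float=Decimal)["price"]
```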

# COPY from columnar data formats


COPY can load data from Amazon S3 in the following columnar formats:
+ ORC
+ Parquet

For examples of using COPY from columnar data formats, see [COPY examples](r_COPY_command_examples.md).

COPY supports columnar formatted data with the following considerations:
+ The Amazon S3 bucket must be in the same Amazon Region as the Amazon Redshift database. 
+ To access your Amazon S3 data through a VPC endpoint, set up access using IAM policies and IAM roles as described in [Using Amazon Redshift Spectrum with Enhanced VPC Routing](https://docs.amazonaws.cn/redshift/latest/mgmt/spectrum-enhanced-vpc.html) in the *Amazon Redshift Management Guide*. 
+ COPY doesn't automatically apply compression encodings. 
+ Only the following COPY parameters are supported: 
  + [ACCEPTINVCHARS](copy-parameters-data-conversion.md#copy-acceptinvchars) when copying from an ORC or Parquet file.
  + [FILLRECORD](copy-parameters-data-conversion.md#copy-fillrecord)
  + [FROM](copy-parameters-data-source-s3.md#copy-parameters-from)
  + [IAM_ROLE](copy-parameters-authorization.md#copy-iam-role)
  + [CREDENTIALS](copy-parameters-authorization.md#copy-credentials)
  + [STATUPDATE](copy-parameters-data-load.md#copy-statupdate)
  + [MANIFEST](copy-parameters-data-source-s3.md#copy-manifest)
  + [EXPLICIT_IDS](copy-parameters-data-conversion.md#copy-explicit-ids)
+ If COPY encounters an error while loading, the command fails. ACCEPTANYDATE and MAXERROR aren't supported for columnar data types.
+ Error messages are sent to the SQL client. Some errors are logged in STL_LOAD_ERRORS and STL_ERROR.
+ COPY inserts values into the target table's columns in the same order as the columns occur in the columnar data files. The number of columns in the target table and the number of columns in the data file must match.
+ If the file you specify for the COPY operation includes one of the following extensions, we decompress the data without the need for adding any parameters: 
  + `.gz`
  + `.snappy`
  + `.bz2`
+ COPY from the Parquet and ORC file formats uses Redshift Spectrum, which accesses the bucket through Amazon S3 presigned URLs. To use COPY for these formats, be sure there are no IAM policies blocking the use of Amazon S3 presigned URLs. The presigned URLs generated by Amazon Redshift are valid for 1 hour so that Amazon Redshift has enough time to load all the files from the Amazon S3 bucket. A unique presigned URL is generated for each file scanned by COPY from columnar data formats. For bucket policies that include an `s3:signatureAge` action, make sure to set the value to at least 3,600,000 milliseconds. For more information, see [Using Amazon Redshift Spectrum with enhanced VPC routing](https://docs.amazonaws.cn/redshift/latest/mgmt/spectrum-enhanced-vpc.html).
+ The REGION parameter is not supported with COPY from columnar data formats. Even if your Amazon S3 bucket and your database are in the same Amazon Web Services Region, you can encounter an error such as `REGION argument is not supported for PARQUET based COPY`.
+ COPY from columnar formats supports concurrency scaling. To enable concurrency scaling, see [Configuring concurrency scaling queues](https://docs.amazonaws.cn/redshift/latest/dg/concurrency-scaling.html#concurrency-scaling-queues).

# DATEFORMAT and TIMEFORMAT strings

The COPY command uses the DATEFORMAT and TIMEFORMAT options to parse date and time values in your source data. DATEFORMAT and TIMEFORMAT are formatted strings that must match the format of your source data's date and time values. For example, a COPY command loading source data with the date value `Jan-01-1999` must include the following DATEFORMAT string:

```
COPY ...
            DATEFORMAT AS 'MON-DD-YYYY'
```

For more information on managing COPY data conversions, see [Data conversion parameters](https://docs.amazonaws.cn/redshift/latest/dg/copy-parameters-data-conversion.html). 
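One way to sanity-check a format string before running COPY is to translate it into Python's `strptime` directives and try it on a sample value from your source data. This is a hedged sketch: the token map below covers only an illustrative subset of Redshift's dateparts and timeparts, and `to_strptime` is a hypothetical helper.

```python
from datetime import datetime

# Illustrative subset of Redshift format tokens -> strptime directives.
TOKEN_MAP = {"YYYY": "%Y", "MON": "%b", "MM": "%m",
             "DD": "%d", "HH": "%H", "MI": "%M", "SS": "%S"}

def to_strptime(fmt):
    # Replace longer tokens first so 'YYYY' isn't clobbered by shorter ones.
    for token in sorted(TOKEN_MAP, key=len, reverse=True):
        fmt = fmt.replace(token, TOKEN_MAP[token])
    return fmt

# 'Jan-01-1999' parses under DATEFORMAT 'MON-DD-YYYY'.
parsed = datetime.strptime("Jan-01-1999", to_strptime("MON-DD-YYYY"))
```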

DATEFORMAT and TIMEFORMAT strings can contain datetime separators (such as '`-`', '`/`', or '`:`'), as well the datepart and timepart formats in the following table.

**Note**  
If you can't match the format of your date or time values with the following dateparts and timeparts, or if you have date and time values that use formats different from each other, use the `'auto'` argument with the DATEFORMAT or TIMEFORMAT parameter. The `'auto'` argument recognizes several formats that aren't supported when using a DATEFORMAT or TIMEFORMAT string. For more information, see [Using automatic recognition with DATEFORMAT and TIMEFORMAT](automatic-recognition.md).

[\[See the AWS documentation website for more details\]](http://docs.amazonaws.cn/en_us/redshift/latest/dg/r_DATEFORMAT_and_TIMEFORMAT_strings.html)

The default date format is YYYY-MM-DD. The default timestamp without time zone (TIMESTAMP) format is YYYY-MM-DD HH:MI:SS. The default timestamp with time zone (TIMESTAMPTZ) format is YYYY-MM-DD HH:MI:SSOF, where OF is the offset from UTC (for example, -8:00). You can't include a time zone specifier (TZ, tz, or OF) in the TIMEFORMAT string. The seconds (SS) field also supports fractional seconds up to a microsecond level of detail. To load TIMESTAMPTZ data that is in a format different from the default format, specify 'auto'.

Following are some sample dates or times you can encounter in your source data, and the corresponding DATEFORMAT or TIMEFORMAT strings for them.

[\[See the AWS documentation website for more details\]](http://docs.amazonaws.cn/en_us/redshift/latest/dg/r_DATEFORMAT_and_TIMEFORMAT_strings.html)

## Example


For an example of using TIMEFORMAT, see [Load a timestamp or datestamp](r_COPY_command_examples.md#r_COPY_command_examples-load-a-time-datestamp).

# Using automatic recognition with DATEFORMAT and TIMEFORMAT


If you specify `'auto'` as the argument for the DATEFORMAT or TIMEFORMAT parameter, Amazon Redshift will automatically recognize and convert the date format or time format in your source data. The following shows an example.

```
copy favoritemovies from 'dynamodb://ProductCatalog' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
dateformat 'auto';
```

When used with the `'auto'` argument for DATEFORMAT and TIMEFORMAT, COPY recognizes and converts the date and time formats listed in the table in [DATEFORMAT and TIMEFORMAT strings](r_DATEFORMAT_and_TIMEFORMAT_strings.md). In addition, the `'auto'` argument recognizes the following formats that aren't supported when using a DATEFORMAT or TIMEFORMAT string.

[\[See the AWS documentation website for more details\]](http://docs.amazonaws.cn/en_us/redshift/latest/dg/automatic-recognition.html)

Automatic recognition doesn't support epochsecs and epochmillisecs.

To test whether a date or timestamp value will be automatically converted, use a CAST function to attempt to convert the string to a date or timestamp value. For example, the following commands test the timestamp value `'J2345678 04:05:06.789'`:

```
create table formattest (test char(21));
insert into formattest values('J2345678 04:05:06.789');
select test, cast(test as timestamp) as timestamp, cast(test as date) as date from formattest;

         test          |      timestamp      |    date
-----------------------+---------------------+------------
 J2345678 04:05:06.789 | 1710-02-23 04:05:06 | 1710-02-23
```

If the source data for a DATE column includes time information, the time component is truncated. If the source data for a TIMESTAMP column omits time information, 00:00:00 is used for the time component.

# COPY examples

**Note**  
These examples contain line breaks for readability. Do not include line breaks or spaces in your *credentials-args* string.

**Topics**
+ [

## Load FAVORITEMOVIES from a DynamoDB table
](#r_COPY_command_examples-load-favoritemovies-from-an-amazon-dynamodb-table)
+ [

## Load LISTING from an Amazon S3 bucket
](#r_COPY_command_examples-load-listing-from-an-amazon-s3-bucket)
+ [

## Load LISTING from an Amazon EMR cluster
](#copy-command-examples-emr)
+ [Example: COPY from Amazon S3 using a manifest](#copy-command-examples-manifest)
+ [

## Load LISTING from a pipe-delimited file (default delimiter)
](#r_COPY_command_examples-load-listing-from-a-pipe-delimited-file-default-delimiter)
+ [

## Load LISTING using columnar data in Parquet format
](#r_COPY_command_examples-load-listing-from-parquet)
+ [

## Load LISTING using columnar data in ORC format
](#r_COPY_command_examples-load-listing-from-orc)
+ [

## Load EVENT with options
](#r_COPY_command_examples-load-event-with-options)
+ [

## Load VENUE from a fixed-width data file
](#r_COPY_command_examples-load-venue-from-a-fixed-width-data-file)
+ [

## Load CATEGORY from a CSV file
](#load-from-csv)
+ [

## Load VENUE with explicit values for an IDENTITY column
](#r_COPY_command_examples-load-venue-with-explicit-values-for-an-identity-column)
+ [

## Load TIME from a pipe-delimited GZIP file
](#r_COPY_command_examples-load-time-from-a-pipe-delimited-gzip-file)
+ [

## Load a timestamp or datestamp
](#r_COPY_command_examples-load-a-time-datestamp)
+ [

## Load data from a file with default values
](#r_COPY_command_examples-load-data-from-a-file-with-default-values)
+ [

## COPY data with the ESCAPE option
](#r_COPY_command_examples-copy-data-with-the-escape-option)
+ [

## Copy from JSON examples
](#r_COPY_command_examples-copy-from-json)
+ [

## Copy from Avro examples
](#r_COPY_command_examples-copy-from-avro)
+ [

## Preparing files for COPY with the ESCAPE option
](#r_COPY_preparing_data)
+ [

## Loading a shapefile into Amazon Redshift
](#copy-example-spatial-copy-shapefile)
+ [

## COPY command with the NOLOAD option
](#r_COPY_command_examples-load-noload-option)
+ [

## COPY command with a multibyte delimiter and the ENCODING option
](#r_COPY_command_examples-load-encoding-multibyte-delimiter-option)

## Load FAVORITEMOVIES from a DynamoDB table

The Amazon SDKs include a simple example of creating a DynamoDB table called *Movies*. (For this example, see [Getting Started with DynamoDB](https://docs.amazonaws.cn/amazondynamodb/latest/developerguide/GettingStarted.html).) The following example loads the Amazon Redshift MOVIES table with data from the DynamoDB table. The Amazon Redshift table must already exist in the database.

```
copy favoritemovies from 'dynamodb://Movies'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' 
readratio 50;
```

## Load LISTING from an Amazon S3 bucket

The following example loads LISTING from an Amazon S3 bucket. The COPY command loads all of the files in the `/data/listing/` folder.

```
copy listing
from 's3://amzn-s3-demo-bucket/data/listing/' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole';
```

## Load LISTING from an Amazon EMR cluster


The following example loads the SALES table with tab-delimited data from lzop-compressed files in an Amazon EMR cluster. COPY loads every file in the `myoutput/` folder that begins with `part-`.

```
copy sales
from 'emr://j-SAMPLE2B500FC/myoutput/part-*' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
delimiter '\t' lzop;
```

The following example loads the SALES table with JSON formatted data in an Amazon EMR cluster. COPY loads every file in the `myoutput/json/` folder.

```
copy sales
from 'emr://j-SAMPLE2B500FC/myoutput/json/' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
JSON 's3://amzn-s3-demo-bucket/jsonpaths.txt';
```

## Using a manifest to specify data files


You can use a manifest to ensure that your COPY command loads all of the required files, and only the required files, from Amazon S3. You can also use a manifest when you need to load multiple files from different buckets or files that don't share the same prefix. 

For example, suppose that you need to load the following three files: `custdata1.txt`, `custdata2.txt`, and `custdata3.txt`. You could use the following command to load all of the files in `amzn-s3-demo-bucket` that begin with `custdata` by specifying a prefix: 

```
copy category
from 's3://amzn-s3-demo-bucket/custdata' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole';
```

If only two of the files exist because of an error, COPY loads only those two files and finishes successfully, resulting in an incomplete data load. If the bucket also contains an unwanted file that happens to use the same prefix, such as a file named `custdata.backup` for example, COPY loads that file as well, resulting in unwanted data being loaded.

To ensure that all of the required files are loaded and to prevent unwanted files from being loaded, you can use a manifest file. The manifest is a JSON-formatted text file that lists the files to be processed by the COPY command. For example, the following manifest loads the three files in the previous example.

```
{  
   "entries":[  
      {  
         "url":"s3://amzn-s3-demo-bucket/custdata.1",
         "mandatory":true
      },
      {  
         "url":"s3://amzn-s3-demo-bucket/custdata.2",
         "mandatory":true
      },
      {  
         "url":"s3://amzn-s3-demo-bucket/custdata.3",
         "mandatory":true
      }
   ]
}
```

The optional `mandatory` flag indicates whether COPY should terminate if the file doesn't exist. The default is `false`. Regardless of any mandatory settings, COPY terminates if no files are found. In this example, COPY returns an error if any of the files isn't found. Unwanted files that might have been picked up if you specified only a key prefix, such as `custdata.backup`, are ignored, because they aren't on the manifest. 

When loading from data files in ORC or Parquet format, a `meta` field is required, as shown in the following example.

```
{  
   "entries":[  
      {  
         "url":"s3://amzn-s3-demo-bucket1/orc/2013-10-04-custdata",
         "mandatory":true,
         "meta":{  
            "content_length":99
         }
      },
      {  
         "url":"s3://amzn-s3-demo-bucket2/orc/2013-10-05-custdata",
         "mandatory":true,
         "meta":{  
            "content_length":99
         }
      }
   ]
}
```
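Manifests like these are often generated rather than written by hand. The following hedged Python sketch builds one from locally staged files, adding the required `content_length` for ORC or Parquet loads. The `build_manifest` helper and the bucket name are illustrative, not an AWS API.

```python
import json
import os

def build_manifest(paths, bucket="amzn-s3-demo-bucket", with_meta=False):
    """Build a COPY manifest for files staged locally before upload."""
    entries = []
    for path in paths:
        entry = {"url": f"s3://{bucket}/{os.path.basename(path)}",
                 "mandatory": True}
        if with_meta:
            # ORC/Parquet manifests require the file size in bytes.
            entry["meta"] = {"content_length": os.path.getsize(path)}
        entries.append(entry)
    return json.dumps({"entries": entries}, indent=2)
```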

The following example uses a manifest named `cust.manifest`. 

```
copy customer
from 's3://amzn-s3-demo-bucket/cust.manifest' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
format as orc
manifest;
```

You can use a manifest to load files from different buckets or files that don't share the same prefix. The following example shows the JSON to load data with files whose names begin with a date stamp.

```
{
  "entries": [
    {"url":"s3://amzn-s3-demo-bucket/2013-10-04-custdata.txt","mandatory":true},
    {"url":"s3://amzn-s3-demo-bucket/2013-10-05-custdata.txt","mandatory":true},
    {"url":"s3://amzn-s3-demo-bucket/2013-10-06-custdata.txt","mandatory":true},
    {"url":"s3://amzn-s3-demo-bucket/2013-10-07-custdata.txt","mandatory":true}
  ]
}
```

The manifest can list files that are in different buckets, as long as the buckets are in the same Amazon Region as the cluster. 

```
{
  "entries": [
    {"url":"s3://amzn-s3-demo-bucket1/custdata1.txt","mandatory":false},
    {"url":"s3://amzn-s3-demo-bucket2/custdata1.txt","mandatory":false},
    {"url":"s3://amzn-s3-demo-bucket2/custdata2.txt","mandatory":false}
  ]
}
```

## Load LISTING from a pipe-delimited file (default delimiter)

The following example is a very simple case in which no options are specified and the input file contains the default delimiter, a pipe character ('|').

```
copy listing 
from 's3://amzn-s3-demo-bucket/data/listings_pipe.txt' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole';
```

## Load LISTING using columnar data in Parquet format

The following example loads data from a folder on Amazon S3 named parquet. 

```
copy listing 
from 's3://amzn-s3-demo-bucket/data/listings/parquet/' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
format as parquet;
```

## Load LISTING using columnar data in ORC format

The following example loads data from a folder on Amazon S3 named `orc`. 

```
copy listing 
from 's3://amzn-s3-demo-bucket/data/listings/orc/' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
format as orc;
```

## Load EVENT with options

The following example loads pipe-delimited data into the EVENT table and applies the following rules: 
+ If pairs of quotation marks are used to surround any character strings, they are removed.
+ Both empty strings and strings that contain blanks are loaded as NULL values.
+ The load fails if more than 5 errors are returned.
+ Timestamp values must comply with the specified format; for example, a valid timestamp is `2008-09-26 05:43:12`.

```
copy event
from 's3://amzn-s3-demo-bucket/data/allevents_pipe.txt' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' 
removequotes
emptyasnull
blanksasnull
maxerror 5
delimiter '|'
timeformat 'YYYY-MM-DD HH:MI:SS';
```

## Load VENUE from a fixed-width data file

```
copy venue
from 's3://amzn-s3-demo-bucket/data/venue_fw.txt' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
fixedwidth 'venueid:3,venuename:25,venuecity:12,venuestate:2,venueseats:6';
```

The preceding example assumes a data file formatted in the same way as the sample data shown. In the sample following, spaces act as placeholders so that all of the columns are the same width as noted in the specification: 

```
1  Toyota Park              Bridgeview  IL0
2  Columbus Crew Stadium    Columbus    OH0
3  RFK Stadium              Washington  DC0
4  CommunityAmerica BallparkKansas City KS0
5  Gillette Stadium         Foxborough  MA68756
```
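The fixedwidth specification is simply a list of column:width pairs applied left to right. As an illustration (not Redshift code), this Python sketch slices one sample row the same way; `parse_fixedwidth` is a hypothetical helper.

```python
def parse_fixedwidth(spec, line):
    """Slice one line by a 'col:width,...' spec, trimming pad spaces."""
    fields, pos = {}, 0
    for part in spec.split(","):
        name, width = part.split(":")
        fields[name] = line[pos:pos + int(width)].strip()
        pos += int(width)
    return fields

spec = "venueid:3,venuename:25,venuecity:12,venuestate:2,venueseats:6"
# Build the fifth sample row with explicit padding to the spec widths.
line = ("5".ljust(3) + "Gillette Stadium".ljust(25)
        + "Foxborough".ljust(12) + "MA" + "68756")
row = parse_fixedwidth(spec, line)
```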

## Load CATEGORY from a CSV file


Suppose you want to load the CATEGORY table with the values shown in the following table.

[\[See the AWS documentation website for more details\]](http://docs.amazonaws.cn/en_us/redshift/latest/dg/r_COPY_command_examples.html)

The following example shows the contents of a text file with the field values separated by commas.

```
12,Shows,Musicals,Musical theatre
13,Shows,Plays,All "non-musical" theatre  
14,Shows,Opera,All opera, light, and "rock" opera
15,Concerts,Classical,All symphony, concerto, and choir concerts
```

If you load the file using the DELIMITER parameter to specify comma-delimited input, the COPY command fails because some input fields contain commas. You can avoid that problem by using the CSV parameter and enclosing the fields that contain commas in quotation mark characters. If the quotation mark character appears within a quoted string, you need to escape it by doubling the quotation mark character. The default quotation mark character is a double quotation mark, so you need to escape each double quotation mark with an additional double quotation mark. Your new input file looks something like this. 

```
12,Shows,Musicals,Musical theatre
13,Shows,Plays,"All ""non-musical"" theatre"
14,Shows,Opera,"All opera, light, and ""rock"" opera"
15,Concerts,Classical,"All symphony, concerto, and choir concerts"
```
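Python's `csv` module follows the same doubling convention, which makes it a convenient way to produce correctly quoted input for the CSV parameter. A small sketch, assuming the module's default double-quote settings:

```python
import csv
import io

# Write one CATEGORY row; fields containing commas are quoted, and
# embedded double quotation marks are escaped by doubling them.
buf = io.StringIO()
csv.writer(buf).writerow(
    [14, "Shows", "Opera", 'All opera, light, and "rock" opera'])
encoded = buf.getvalue().strip()

# Reading it back recovers the original field text.
decoded = next(csv.reader(io.StringIO(encoded)))
```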

Assuming the file name is `category_csv.txt`, you can load the file by using the following COPY command:

```
copy category
from 's3://amzn-s3-demo-bucket/data/category_csv.txt' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' 
csv;
```

Alternatively, to avoid the need to escape the double quotation marks in your input, you can specify a different quotation mark character by using the QUOTE AS parameter. For example, the following version of `category_csv.txt` uses '`%`' as the quotation mark character.

```
12,Shows,Musicals,Musical theatre
13,Shows,Plays,%All "non-musical" theatre%
14,Shows,Opera,%All opera, light, and "rock" opera%
15,Concerts,Classical,%All symphony, concerto, and choir concerts%
```

The following COPY command uses QUOTE AS to load `category_csv.txt`:

```
copy category
from 's3://amzn-s3-demo-bucket/data/category_csv.txt' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' 
csv quote as '%';
```

## Load VENUE with explicit values for an IDENTITY column

The following example assumes that, when the VENUE table was created, at least one column (such as the `venueid` column) was specified as an IDENTITY column. This command overrides the default IDENTITY behavior of autogenerating values for an IDENTITY column and instead loads the explicit values from the venue.txt file. Amazon Redshift doesn't check whether duplicate IDENTITY values are loaded into the table when the EXPLICIT_IDS option is used.

```
copy venue
from 's3://amzn-s3-demo-bucket/data/venue.txt' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
explicit_ids;
```

## Load TIME from a pipe-delimited GZIP file

The following example loads the TIME table from a pipe-delimited GZIP file:

```
copy time
from 's3://amzn-s3-demo-bucket/data/timerows.gz' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
gzip
delimiter '|';
```

## Load a timestamp or datestamp

The following example loads data with a formatted timestamp.

**Note**  
The TIMEFORMAT of `HH:MI:SS` can also support fractional seconds beyond the `SS` to a microsecond level of detail. The file `time.txt` used in this example contains one row, `2009-01-12 14:15:57.119568`.

```
copy timestamp1 
from 's3://amzn-s3-demo-bucket/data/time.txt' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
timeformat 'YYYY-MM-DD HH:MI:SS';
```

The result of this copy is as follows: 

```
select * from timestamp1;
c1
----------------------------
2009-01-12 14:15:57.119568
(1 row)
```

## Load data from a file with default values

The following example uses a variation of the VENUE table in the TICKIT database. Consider a VENUE_NEW table defined with the following statement:

```
create table venue_new(
venueid smallint not null,
venuename varchar(100) not null,
venuecity varchar(30),
venuestate char(2),
venueseats integer not null default '1000');
```

Consider a venue_noseats.txt data file that contains no values for the VENUESEATS column, as shown in the following example:

```
1|Toyota Park|Bridgeview|IL|
2|Columbus Crew Stadium|Columbus|OH|
3|RFK Stadium|Washington|DC|
4|CommunityAmerica Ballpark|Kansas City|KS|
5|Gillette Stadium|Foxborough|MA|
6|New York Giants Stadium|East Rutherford|NJ|
7|BMO Field|Toronto|ON|
8|The Home Depot Center|Carson|CA|
9|Dick's Sporting Goods Park|Commerce City|CO|
10|Pizza Hut Park|Frisco|TX|
```

The following COPY statement will successfully load the table from the file and apply the DEFAULT value ('1000') to the omitted column: 

```
copy venue_new(venueid, venuename, venuecity, venuestate) 
from 's3://amzn-s3-demo-bucket/data/venue_noseats.txt' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
delimiter '|';
```

Now view the loaded table: 

```
select * from venue_new order by venueid;
venueid |         venuename          |    venuecity    | venuestate | venueseats
---------+----------------------------+-----------------+------------+------------
1 | Toyota Park                | Bridgeview      | IL         |       1000
2 | Columbus Crew Stadium      | Columbus        | OH         |       1000
3 | RFK Stadium                | Washington      | DC         |       1000
4 | CommunityAmerica Ballpark  | Kansas City     | KS         |       1000
5 | Gillette Stadium           | Foxborough      | MA         |       1000
6 | New York Giants Stadium    | East Rutherford | NJ         |       1000
7 | BMO Field                  | Toronto         | ON         |       1000
8 | The Home Depot Center      | Carson          | CA         |       1000
9 | Dick's Sporting Goods Park | Commerce City   | CO         |       1000
10 | Pizza Hut Park             | Frisco          | TX         |       1000
(10 rows)
```

For the following example, in addition to assuming that no VENUESEATS data is included in the file, also assume that no VENUENAME data is included: 

```
1||Bridgeview|IL|
2||Columbus|OH|
3||Washington|DC|
4||Kansas City|KS|
5||Foxborough|MA|
6||East Rutherford|NJ|
7||Toronto|ON|
8||Carson|CA|
9||Commerce City|CO|
10||Frisco|TX|
```

 Using the same table definition, the following COPY statement fails because no DEFAULT value was specified for VENUENAME, and VENUENAME is a NOT NULL column: 

```
copy venue(venueid, venuecity, venuestate) 
from 's3://amzn-s3-demo-bucket/data/venue_pipe.txt' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
delimiter '|';
```

Now consider a variation of the VENUE table that uses an IDENTITY column: 

```
create table venue_identity(
venueid int identity(1,1),
venuename varchar(100) not null,
venuecity varchar(30),
venuestate char(2),
venueseats integer not null default '1000');
```

As with the previous example, assume that the VENUESEATS column has no corresponding values in the source file. The following COPY statement successfully loads the table, including the predefined IDENTITY data values instead of autogenerating those values: 

```
copy venue(venueid, venuename, venuecity, venuestate) 
from 's3://amzn-s3-demo-bucket/data/venue_pipe.txt' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
delimiter '|' explicit_ids;
```

This statement fails because it doesn't include the IDENTITY column (VENUEID is missing from the column list) yet includes an EXPLICIT_IDS parameter:

```
copy venue(venuename, venuecity, venuestate) 
from 's3://amzn-s3-demo-bucket/data/venue_pipe.txt' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
delimiter '|' explicit_ids;
```

This statement fails because it doesn't include an EXPLICIT_IDS parameter:

```
copy venue(venueid, venuename, venuecity, venuestate)
from 's3://amzn-s3-demo-bucket/data/venue_pipe.txt' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
delimiter '|';
```

## COPY data with the ESCAPE option

The following example shows how to load characters that match the delimiter character (in this case, the pipe character). In the input file, make sure that all of the pipe characters (|) that you want to load are escaped with the backslash character (\). Then load the file with the ESCAPE parameter.

```
$ more redshiftinfo.txt
1|public\|event\|dwuser
2|public\|sales\|dwuser

create table redshiftinfo(infoid int,tableinfo varchar(50));

copy redshiftinfo from 's3://amzn-s3-demo-bucket/data/redshiftinfo.txt' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' 
delimiter '|' escape;

select * from redshiftinfo order by 1;
infoid |       tableinfo
-------+--------------------
1      | public|event|dwuser
2      | public|sales|dwuser
(2 rows)
```

Without the ESCAPE parameter, this COPY command fails with an `Extra column(s) found` error.
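Preparing such a file usually means escaping the fields programmatically. A minimal Python sketch (the `escape_field` helper is hypothetical) prefixes each embedded delimiter, and any literal backslash, with a backslash:

```python
def escape_field(value, delimiter="|"):
    """Escape literal backslashes first, then embedded delimiters."""
    return value.replace("\\", "\\\\").replace(delimiter, "\\" + delimiter)

# Reproduces the rows shown in redshiftinfo.txt.
lines = ["%d|%s" % (i, escape_field(s)) for i, s in
         [(1, "public|event|dwuser"), (2, "public|sales|dwuser")]]
```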

**Important**  
If you load your data using a COPY with the ESCAPE parameter, you must also specify the ESCAPE parameter with your UNLOAD command to generate the reciprocal output file. Similarly, if you UNLOAD using the ESCAPE parameter, you need to use ESCAPE when you COPY the same data.

## Copy from JSON examples


In the following examples, you load the CATEGORY table with the following data. 

[\[See the AWS documentation website for more details\]](http://docs.amazonaws.cn/en_us/redshift/latest/dg/r_COPY_command_examples.html)

**Topics**
+ [

### Load from JSON data using the 'auto' option
](#copy-from-json-examples-using-auto)
+ [

### Load from JSON data using the 'auto ignorecase' option
](#copy-from-json-examples-using-auto-ignorecase)
+ [

### Load from JSON data using a JSONPaths file
](#copy-from-json-examples-using-jsonpaths)
+ [

### Load from JSON arrays using a JSONPaths file
](#copy-from-json-examples-using-jsonpaths-arrays)

### Load from JSON data using the 'auto' option


To load from JSON data using the `'auto'` option, the JSON data must consist of a set of objects. The key names must match the column names, but the order doesn't matter. The following shows the contents of a file named `category_object_auto.json`.

```
{
    "catdesc": "Major League Baseball",
    "catid": 1,
    "catgroup": "Sports",
    "catname": "MLB"
}
{
    "catgroup": "Sports",
    "catid": 2,
    "catname": "NHL",
    "catdesc": "National Hockey League"
}
{
    "catid": 3,
    "catname": "NFL",
    "catgroup": "Sports",
    "catdesc": "National Football League"
}
{
    "bogus": "Bogus Sports LLC",
    "catid": 4,
    "catgroup": "Sports",
    "catname": "NBA",
    "catdesc": "National Basketball Association"
}
{
    "catid": 5,
    "catgroup": "Shows",
    "catname": "Musicals",
    "catdesc": "All symphony, concerto, and choir concerts"
}
```

To load from the JSON data file in the previous example, run the following COPY command.

```
copy category
from 's3://amzn-s3-demo-bucket/category_object_auto.json'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' 
json 'auto';
```
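Note that the source file is a stream of standalone JSON objects rather than a single JSON array. If you generate such a file yourself, a sketch along these lines (hypothetical records, standard `json` module) produces that layout.

```python
import json

records = [
    {"catid": 1, "catgroup": "Sports", "catname": "MLB",
     "catdesc": "Major League Baseball"},
    {"catid": 2, "catgroup": "Sports", "catname": "NHL",
     "catdesc": "National Hockey League"},
]
# One object after another -- not wrapped in [...] -- which is the
# shape that COPY ... json 'auto' expects.
body = "\n".join(json.dumps(r, indent=4) for r in records)
```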

### Load from JSON data using the 'auto ignorecase' option


To load from JSON data using the `'auto ignorecase'` option, the JSON data must consist of a set of objects. The case of the key names doesn't have to match the column names and the order doesn't matter. The following shows the contents of a file named `category_object_auto-ignorecase.json`.

```
{
    "CatDesc": "Major League Baseball",
    "CatID": 1,
    "CatGroup": "Sports",
    "CatName": "MLB"
}
{
    "CatGroup": "Sports",
    "CatID": 2,
    "CatName": "NHL",
    "CatDesc": "National Hockey League"
}
{
    "CatID": 3,
    "CatName": "NFL",
    "CatGroup": "Sports",
    "CatDesc": "National Football League"
}
{
    "bogus": "Bogus Sports LLC",
    "CatID": 4,
    "CatGroup": "Sports",
    "CatName": "NBA",
    "CatDesc": "National Basketball Association"
}
{
    "CatID": 5,
    "CatGroup": "Shows",
    "CatName": "Musicals",
    "CatDesc": "All symphony, concerto, and choir concerts"
}
```

To load from the JSON data file in the previous example, run the following COPY command.

```
copy category
from 's3://amzn-s3-demo-bucket/category_object_auto-ignorecase.json'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' 
json 'auto ignorecase';
```

### Load from JSON data using a JSONPaths file


If the JSON data objects don't correspond directly to column names, you can use a JSONPaths file to map the JSON elements to columns. The order doesn't matter in the JSON source data, but the order of the JSONPaths file expressions must match the column order. Suppose that you have the following data file, named `category_object_paths.json`.

```
{
    "one": 1,
    "two": "Sports",
    "three": "MLB",
    "four": "Major League Baseball"
}
{
    "three": "NHL",
    "four": "National Hockey League",
    "one": 2,
    "two": "Sports"
}
{
    "two": "Sports",
    "three": "NFL",
    "one": 3,
    "four": "National Football League"
}
{
    "one": 4,
    "two": "Sports",
    "three": "NBA",
    "four": "National Basketball Association"
}
{
    "one": 6,
    "two": "Shows",
    "three": "Musicals",
    "four": "All symphony, concerto, and choir concerts"
}
```

The following JSONPaths file, named `category_jsonpath.json`, maps the source data to the table columns.

```
{
    "jsonpaths": [
        "$['one']",
        "$['two']",
        "$['three']",
        "$['four']"
    ]
}
```

To load from the JSON data file in the previous example, run the following COPY command.

```
copy category
from 's3://amzn-s3-demo-bucket/category_object_paths.json'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' 
json 's3://amzn-s3-demo-bucket/category_jsonpath.json';
```
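To preview how the mapping behaves, you can resolve the JSONPaths expressions against a source object by hand. The following Python sketch handles only the simple `$['key']` form used here; the `resolve` helper is illustrative, not part of COPY.

```python
jsonpaths = ["$['one']", "$['two']", "$['three']", "$['four']"]
record = {"three": "MLB", "four": "Major League Baseball",
          "one": 1, "two": "Sports"}

def resolve(path, obj):
    # Strip the $[' prefix and '] suffix; handles only this simple form.
    return obj[path[3:-2]]

# Expression order, not key order in the source object, determines
# the column order of the loaded row.
row = [resolve(p, record) for p in jsonpaths]
```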

### Load from JSON arrays using a JSONPaths file


To load from JSON data that consists of a set of arrays, you must use a JSONPaths file to map the array elements to columns. Suppose that you have the following data file, named `category_array_data.json`.

```
[1,"Sports","MLB","Major League Baseball"]
[2,"Sports","NHL","National Hockey League"]
[3,"Sports","NFL","National Football League"]
[4,"Sports","NBA","National Basketball Association"]
[5,"Concerts","Classical","All symphony, concerto, and choir concerts"]
```

The following JSONPaths file, named `category_array_jsonpath.json`, maps the source data to the table columns.

```
{
    "jsonpaths": [
        "$[0]",
        "$[1]",
        "$[2]",
        "$[3]"
    ]
}
```

To load from the JSON data file in the previous example, run the following COPY command.

```
copy category
from 's3://amzn-s3-demo-bucket/category_array_data.json'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' 
json 's3://amzn-s3-demo-bucket/category_array_jsonpath.json';
```

## Copy from Avro examples


In the following examples, you load the CATEGORY table with the following data. 

[\[See the AWS documentation website for more details\]](http://docs.amazonaws.cn/en_us/redshift/latest/dg/r_COPY_command_examples.html)

**Topics**
+ [

### Load from Avro data using the 'auto' option
](#copy-from-avro-examples-using-auto)
+ [

### Load from Avro data using the 'auto ignorecase' option
](#copy-from-avro-examples-using-auto-ignorecase)
+ [

### Load from Avro data using a JSONPaths file
](#copy-from-avro-examples-using-avropaths)

### Load from Avro data using the 'auto' option


To load from Avro data using the `'auto'` argument, field names in the Avro schema must match the column names. When using the `'auto'` argument, order doesn't matter. The following shows the schema for a file named `category_auto.avro`.

```
{
    "name": "category",
    "type": "record",
    "fields": [
        {"name": "catid", "type": "int"},
        {"name": "catdesc", "type": "string"},
        {"name": "catname", "type": "string"},
        {"name": "catgroup", "type": "string"}
    ]
}
```

The data in an Avro file is in binary format, so it isn't human-readable. The following shows a JSON representation of the data in the `category_auto.avro` file. 

```
{
   "catid": 1,
   "catdesc": "Major League Baseball",
   "catname": "MLB",
   "catgroup": "Sports"
}
{
   "catid": 2,
   "catdesc": "National Hockey League",
   "catname": "NHL",
   "catgroup": "Sports"
}
{
   "catid": 3,
   "catdesc": "National Basketball Association",
   "catname": "NBA",
   "catgroup": "Sports"
}
{
   "catid": 4,
   "catdesc": "All symphony, concerto, and choir concerts",
   "catname": "Classical",
   "catgroup": "Concerts"
}
```

To load from the Avro data file in the previous example, run the following COPY command.

```
copy category
from 's3://amzn-s3-demo-bucket/category_auto.avro'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
format as avro 'auto';
```

### Load from Avro data using the 'auto ignorecase' option


To load from Avro data using the `'auto ignorecase'` argument, the case of the field names in the Avro schema does not have to match the case of column names. When using the `'auto ignorecase'` argument, order doesn't matter. The following shows the schema for a file named `category_auto-ignorecase.avro`.

```
{
    "name": "category",
    "type": "record",
    "fields": [
        {"name": "CatID", "type": "int"},
        {"name": "CatDesc", "type": "string"},
        {"name": "CatName", "type": "string"},
        {"name": "CatGroup", "type": "string"}
    ]
}
```

The data in an Avro file is in binary format, so it isn't human-readable. The following shows a JSON representation of the data in the `category_auto-ignorecase.avro` file. 

```
{
   "CatID": 1,
   "CatDesc": "Major League Baseball",
   "CatName": "MLB",
   "CatGroup": "Sports"
}
{
   "CatID": 2,
   "CatDesc": "National Hockey League",
   "CatName": "NHL",
   "CatGroup": "Sports"
}
{
   "CatID": 3,
   "CatDesc": "National Basketball Association",
   "CatName": "NBA",
   "CatGroup": "Sports"
}
{
   "CatID": 4,
   "CatDesc": "All symphony, concerto, and choir concerts",
   "CatName": "Classical",
   "CatGroup": "Concerts"
}
```

To load from the Avro data file in the previous example, run the following COPY command.

```
copy category
from 's3://amzn-s3-demo-bucket/category_auto-ignorecase.avro'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
format as avro 'auto ignorecase';
```

### Load from Avro data using a JSONPaths file


If the field names in the Avro schema don't correspond directly to column names, you can use a JSONPaths file to map the schema elements to columns. The order of the JSONPaths file expressions must match the column order. 

Suppose that you have a data file named `category_paths.avro` that contains the same data as in the previous example, but with the following schema.

```
{
    "name": "category",
    "type": "record",
    "fields": [
        {"name": "id", "type": "int"},
        {"name": "desc", "type": "string"},
        {"name": "name", "type": "string"},
        {"name": "group", "type": "string"},
        {"name": "region", "type": "string"} 
     ]
}
```

The following JSONPaths file, named `category_path.avropath`, maps the source data to the table columns.

```
{
    "jsonpaths": [
        "$['id']",
        "$['group']",
        "$['name']",
        "$['desc']"
    ]
}
```

To load from the Avro data file in the previous example, run the following COPY command.

```
copy category
from 's3://amzn-s3-demo-bucket/category_paths.avro'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' 
format avro 's3://amzn-s3-demo-bucket/category_path.avropath';
```

## Preparing files for COPY with the ESCAPE option

The following example describes how you might prepare data to "escape" newline characters before importing the data into an Amazon Redshift table using the COPY command with the ESCAPE parameter. Without preparing the data to delimit the newline characters, Amazon Redshift returns load errors when you run the COPY command, because the newline character is normally used as a record separator. 

For example, consider a file or a column in an external table that you want to copy into an Amazon Redshift table. If the file or column contains XML-formatted content or similar data, you need to make sure that all of the newline characters ( \n ) that are part of the content are escaped with the backslash character ( \ ). 

A file or table containing embedded newline characters provides a relatively easy pattern to match. Each embedded newline character typically follows a `>` character, with potentially some white space characters (`' '` or tab) in between, as you can see in the following example of a text file named `nlTest1.txt`. 

```
$ cat nlTest1.txt
<xml start>
<newline characters provide>
<line breaks at the end of each>
<line in content>
</xml>|1000
<xml>
</xml>|2000
```

With the following example, you can run a text-processing utility to pre-process the source file and insert escape characters where needed. (The `|` character is intended to be used as a delimiter to separate column data when copied into an Amazon Redshift table.) 

```
$ sed -e ':a;N;$!ba;s/>[[:space:]]*\n/>\\\n/g' nlTest1.txt > nlTest2.txt
```

Similarly, you can use Perl to perform a similar operation: 

```
cat nlTest1.txt | perl -p -e 's/>\s*\n/>\\\n/g' > nlTest2.txt
```
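The same substitution can also be expressed in Python with `re.sub`; in this sketch the contents of `nlTest1.txt` are inlined for illustration.

```python
import re

text = (
    "<xml start>\n"
    "<newline characters provide>\n"
    "<line breaks at the end of each>\n"
    "<line in content>\n"
    "</xml>|1000\n"
    "<xml>\n"
    "</xml>|2000\n"
)
# Replace '>' plus optional whitespace plus newline with '>' followed by a
# single backslash and the newline, mirroring the sed and perl one-liners.
# A callable replacement avoids re.sub's backslash template processing.
escaped = re.sub(r">\s*\n", lambda m: ">\\\n", text)
```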

To accommodate loading the data from the `nlTest2.txt` file into Amazon Redshift, we created a two-column table in Amazon Redshift. The first column, c1, is a character column that holds XML-formatted content from the `nlTest2.txt` file. The second column, c2, holds integer values loaded from the same file. 

After running the `sed` command, you can correctly load data from the `nlTest2.txt` file into an Amazon Redshift table using the ESCAPE parameter. 

**Note**  
When you include the ESCAPE parameter with the COPY command, it escapes a number of special characters, including the backslash character and the newline character. 

```
copy t2 from 's3://amzn-s3-demo-bucket/data/nlTest2.txt' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'  
escape
delimiter as '|';

select * from t2 order by 2;

c1           |  c2
-------------+------
<xml start>
<newline characters provide>
<line breaks at the end of each>
<line in content>
</xml>
| 1000
<xml>
</xml>       | 2000
(2 rows)
```

You can prepare data files exported from external databases in a similar way. For example, with an Oracle database, you can use the REPLACE function on each affected column in a table that you want to copy into Amazon Redshift. 

```
SELECT c1, REPLACE(c2, '\n', '\\n') as c2 from my_table_with_xml
```

In addition, many database export and extract, transform, load (ETL) tools that routinely process large amounts of data provide options to specify escape and delimiter characters. 

## Loading a shapefile into Amazon Redshift

The following examples demonstrate how to load an Esri shapefile using COPY. For more information about loading shapefiles, see [Loading a shapefile into Amazon Redshift](spatial-copy-shapefile.md). 

### Loading a shapefile

The following steps show how to ingest OpenStreetMap data from Amazon S3 using the COPY command. This example assumes that the Norway shapefile archive from [the download site of Geofabrik](https://download.geofabrik.de/europe.html) has been uploaded to a private Amazon S3 bucket in your Amazon Region. The `.shp`, `.shx`, and `.dbf` files must share the same Amazon S3 prefix and file name.

#### Ingesting data without simplification

The following commands create tables and ingest data that can fit in the maximum geometry size without any simplification. Open the `gis_osm_natural_free_1.shp` in your preferred GIS software and inspect the columns in this layer. By default, either IDENTITY or GEOMETRY columns are first. When a GEOMETRY column is first, you can create the table as shown following.

```
CREATE TABLE norway_natural (
   wkb_geometry GEOMETRY,
   osm_id BIGINT,
   code INT,
   fclass VARCHAR,
   name VARCHAR);
```

Or, when an IDENTITY column is first, you can create the table as shown following.

```
CREATE TABLE norway_natural_with_id (
   fid INT IDENTITY(1,1),
   wkb_geometry GEOMETRY,
   osm_id BIGINT,
   code INT,
   fclass VARCHAR,
   name VARCHAR);
```

Now you can ingest the data using COPY.

```
COPY norway_natural FROM 's3://bucket_name/shapefiles/norway/gis_osm_natural_free_1.shp'
FORMAT SHAPEFILE
CREDENTIALS 'aws_iam_role=arn:aws:iam::123456789012:role/MyRoleName';
INFO: Load into table 'norway_natural' completed, 83891 record(s) loaded successfully
```

Or you can ingest the data as shown following. 

```
COPY norway_natural_with_id FROM 's3://bucket_name/shapefiles/norway/gis_osm_natural_free_1.shp'
FORMAT SHAPEFILE
CREDENTIALS 'aws_iam_role=arn:aws:iam::123456789012:role/MyRoleName';
INFO: Load into table 'norway_natural_with_id' completed, 83891 record(s) loaded successfully.
```

#### Ingesting data with simplification

The following commands create a table and try to ingest data that can't fit in the maximum geometry size without any simplification. Inspect the `gis_osm_water_a_free_1.shp` shapefile and create the appropriate table as shown following.

```
CREATE TABLE norway_water (
   wkb_geometry GEOMETRY,
   osm_id BIGINT,
   code INT,
   fclass VARCHAR,
   name VARCHAR);
```

When the COPY command runs, it results in an error.

```
COPY norway_water FROM 's3://bucket_name/shapefiles/norway/gis_osm_water_a_free_1.shp'
FORMAT SHAPEFILE
CREDENTIALS 'aws_iam_role=arn:aws:iam::123456789012:role/MyRoleName';
ERROR:  Load into table 'norway_water' failed.  Check 'stl_load_errors' system table for details.
```

Querying `STL_LOAD_ERRORS` shows that the geometry is too large. 

```
SELECT line_number, btrim(colname), btrim(err_reason) FROM stl_load_errors WHERE query = pg_last_copy_id();
 line_number |    btrim     |                                 btrim
-------------+--------------+-----------------------------------------------------------------------
     1184705 | wkb_geometry | Geometry size: 1513736 is larger than maximum supported size: 1048447
```

To overcome this, the `SIMPLIFY AUTO` parameter is added to the COPY command to simplify geometries.

```
COPY norway_water FROM 's3://bucket_name/shapefiles/norway/gis_osm_water_a_free_1.shp'
FORMAT SHAPEFILE
SIMPLIFY AUTO
CREDENTIALS 'aws_iam_role=arn:aws:iam::123456789012:role/MyRoleName';

INFO:  Load into table 'norway_water' completed, 1989196 record(s) loaded successfully.
```

To view the rows and geometries that were simplified, query `SVL_SPATIAL_SIMPLIFY`.

```
SELECT * FROM svl_spatial_simplify WHERE query = pg_last_copy_id();
 query | line_number | maximum_tolerance | initial_size | simplified | final_size |   final_tolerance
-------+-------------+-------------------+--------------+------------+------------+----------------------
    20 |     1184704 |                -1 |      1513736 | t          |    1008808 |   1.276386653895e-05
    20 |     1664115 |                -1 |      1233456 | t          |    1023584 | 6.11707814796635e-06
```

Using SIMPLIFY AUTO *max_tolerance* with a tolerance lower than the automatically calculated ones probably results in an ingestion error. In this case, use MAXERROR to ignore errors.

```
COPY norway_water FROM 's3://bucket_name/shapefiles/norway/gis_osm_water_a_free_1.shp'
FORMAT SHAPEFILE
SIMPLIFY AUTO 1.1E-05
MAXERROR 2
CREDENTIALS 'aws_iam_role=arn:aws:iam::123456789012:role/MyRoleName';

INFO:  Load into table 'norway_water' completed, 1989195 record(s) loaded successfully.
INFO:  Load into table 'norway_water' completed, 1 record(s) could not be loaded.  Check 'stl_load_errors' system table for details.
```

Query `SVL_SPATIAL_SIMPLIFY` again to identify the record that COPY didn't manage to load.

```
SELECT * FROM svl_spatial_simplify WHERE query = pg_last_copy_id();
 query | line_number | maximum_tolerance | initial_size | simplified | final_size | final_tolerance
-------+-------------+-------------------+--------------+------------+------------+-----------------
    29 |     1184704 |           1.1e-05 |      1513736 | f          |          0 |               0
    29 |     1664115 |           1.1e-05 |      1233456 | t          |     794432 |         1.1e-05
```

In this example, the first record couldn't be simplified within the given tolerance, so the `simplified` column shows false. The second record was loaded within the given tolerance. However, its final size is larger than it would be using the automatically calculated tolerance without specifying a maximum tolerance. 

### Loading from a compressed shapefile

Amazon Redshift COPY supports ingesting data from a compressed shapefile. All shapefile components must have the same Amazon S3 prefix and the same compression suffix. As an example, suppose that you want to load the data from the previous example. In this case, the files `gis_osm_water_a_free_1.shp.gz`, `gis_osm_water_a_free_1.dbf.gz`, and `gis_osm_water_a_free_1.shx.gz` must share the same Amazon S3 directory. The COPY command requires the GZIP option, and the FROM clause must specify the correct compressed file, as shown following.
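Before uploading, each component can be compressed locally. The following Python sketch uses empty placeholder files in place of the real shapefile components; the point is that every component receives the same `.gz` suffix under the same prefix.

```python
import gzip
import pathlib
import shutil

# Placeholder files standing in for the real .shp/.shx/.dbf components.
components = ["gis_osm_natural_free_1.shp",
              "gis_osm_natural_free_1.shx",
              "gis_osm_natural_free_1.dbf"]
for name in components:
    pathlib.Path(name).touch()
    # Compress each component, keeping the shared prefix and adding .gz.
    with open(name, "rb") as src, gzip.open(name + ".gz", "wb") as dst:
        shutil.copyfileobj(src, dst)
```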

```
COPY norway_natural FROM 's3://bucket_name/shapefiles/norway/compressed/gis_osm_natural_free_1.shp.gz'
FORMAT SHAPEFILE
GZIP
CREDENTIALS 'aws_iam_role=arn:aws:iam::123456789012:role/MyRoleName';
INFO:  Load into table 'norway_natural' completed, 83891 record(s) loaded successfully.
```

### Loading data into a table with a different column order

If you have a table that doesn't have `GEOMETRY` as the first column, you can use column mapping to map columns to the target table. For example, create a table with `osm_id` specified as a first column.

```
CREATE TABLE norway_natural_order (
   osm_id BIGINT,
   wkb_geometry GEOMETRY,
   code INT,
   fclass VARCHAR,
   name VARCHAR);
```

Then ingest a shapefile using column mapping.

```
COPY norway_natural_order(wkb_geometry, osm_id, code, fclass, name) 
FROM 's3://bucket_name/shapefiles/norway/gis_osm_natural_free_1.shp'
FORMAT SHAPEFILE
CREDENTIALS 'aws_iam_role=arn:aws:iam::123456789012:role/MyRoleName';
INFO:  Load into table 'norway_natural_order' completed, 83891 record(s) loaded successfully.
```

### Loading data into a table with a geography column

If you have a table that has a `GEOGRAPHY` column, you first ingest into a `GEOMETRY` column and then cast the objects to `GEOGRAPHY` objects. For example, after you copy your shapefile into a `GEOMETRY` column, alter the table to add a column of the `GEOGRAPHY` data type.

```
ALTER TABLE norway_natural ADD COLUMN wkb_geography GEOGRAPHY;
```

Then convert geometries to geographies.

```
UPDATE norway_natural SET wkb_geography = wkb_geometry::geography;
```

Optionally, you can drop the `GEOMETRY` column.

```
ALTER TABLE norway_natural DROP COLUMN wkb_geometry;
```

## COPY command with the NOLOAD option

To validate data files before you actually load the data, use the NOLOAD option with the COPY command. Amazon Redshift parses the input file and displays any errors that occur. The following example uses the NOLOAD option and no rows are actually loaded into the table.

```
COPY public.zipcode1
FROM 's3://amzn-s3-demo-bucket/mydata/zipcode.csv' 
DELIMITER ';' 
IGNOREHEADER 1 REGION 'us-east-1'
NOLOAD
CREDENTIALS 'aws_iam_role=arn:aws:iam::123456789012:role/myRedshiftRole';

Warnings:
Load into table 'zipcode1' completed, 0 record(s) loaded successfully.
```

## COPY command with a multibyte delimiter and the ENCODING option

The following example loads the LATIN1 table from an Amazon S3 file that contains multibyte data. The COPY command specifies the delimiter in octal form `\302\246\303\254` to separate the fields in the input file, which is encoded as ISO-8859-1. To specify the same delimiter in UTF-8, specify `DELIMITER '¦ì'`.

```
COPY latin1
FROM 's3://amzn-s3-demo-bucket/multibyte/myfile' 
IAM_ROLE 'arn:aws:iam::123456789012:role/myRedshiftRole'
DELIMITER '\302\246\303\254'
ENCODING ISO88591
```