

Amazon Redshift will no longer support the creation of new Python UDFs starting with Patch 198. Existing Python UDFs will continue to function until June 30, 2026. For more information, see the [blog post](https://amazonaws-china.com/blogs/big-data/amazon-redshift-python-user-defined-functions-will-reach-end-of-support-after-june-30-2026/).

# Using the Amazon Redshift Data API

The Amazon Redshift Data API simplifies access to your Amazon Redshift data warehouse by removing the need to manage database drivers, connections, network configurations, data buffering, credentials, and more. You can run SQL statements using the Data API operations with the Amazon SDK. For more information about the Data API operations, see the [Amazon Redshift Data API Reference](https://docs.amazonaws.cn/redshift-data/latest/APIReference/).

The Data API doesn't require a persistent connection to your database. Instead, it provides a secure HTTP endpoint and integration with Amazon SDKs. You can use the endpoint to run SQL statements without managing connections. Calls to the Data API are asynchronous. The Data API can use either credentials stored in Amazon Secrets Manager or temporary database credentials. You don't need to pass passwords in the API calls with either authorization method. For more information about Amazon Secrets Manager, see [What Is Amazon Secrets Manager?](https://docs.amazonaws.cn/secretsmanager/latest/userguide/intro.html) in the *Amazon Secrets Manager User Guide*. You can also use Amazon IAM Identity Center for authorization.

With the Data API, you can programmatically access Amazon Redshift data with web services–based applications, including Amazon Lambda, Amazon SageMaker AI notebooks, and Amazon Cloud9. For more information on these applications, see [Amazon Lambda](http://www.amazonaws.cn/lambda/), [Amazon SageMaker AI](http://www.amazonaws.cn/sagemaker/), and [Amazon Cloud9](http://www.amazonaws.cn/cloud9/). 

To learn more about the Data API, see [Get started with the Amazon Redshift Data API](https://amazonaws-china.com/blogs/big-data/get-started-with-the-amazon-redshift-data-api/) in the *Amazon Big Data Blog*.

## Working with the Amazon Redshift Data API

Before you use the Amazon Redshift Data API, review the following steps: 

1. Determine if you, as the caller of the Data API, are authorized. For more information about authorization, see [Authorizing access to the Amazon Redshift Data API](data-api-access.md).

1. Determine if you plan to call the Data API with authentication credentials from Secrets Manager, temporary credentials, or use Amazon IAM Identity Center. For more information, see [Choosing database authentication credentials when calling the Amazon Redshift Data API](#data-api-calling-considerations-authentication).

1. Set up a secret if you use Secrets Manager for authentication credentials. For more information, see [Storing database credentials in Amazon Secrets Manager](data-api-secrets.md).

1. Review the considerations and limitations when calling the Data API. For more information, see [Considerations when calling the Amazon Redshift Data API](#data-api-calling-considerations).

1. Call the Data API from the Amazon Command Line Interface (Amazon CLI), from your own code, or using the query editor in the Amazon Redshift console. For examples of calling from the Amazon CLI, see [Calling the Data API](data-api-calling.md).

## Considerations when calling the Amazon Redshift Data API

Consider the following when calling the Data API:
+ The Amazon Redshift Data API can access databases in Amazon Redshift provisioned clusters and Redshift Serverless workgroups. For a list of Amazon Web Services Regions where the Redshift Data API is available, see the endpoints listed for [Redshift Data API](https://docs.amazonaws.cn/general/latest/gr/redshift-service.html) in the *Amazon Web Services General Reference*. The Redshift Data API is also available in China Amazon Web Services Regions.
+ The maximum duration of a query is 24 hours. 
+ The maximum number of active queries (`STARTED` and `SUBMITTED` queries) per Amazon Redshift cluster is 500. 
+ The maximum query result size is 500 MB (after gzip compression). If a call returns more than 500 MB of response data, the call is ended. 
+ The maximum retention time for query results is 24 hours. 
+ The maximum query statement size is 100 KB. 
+ The Data API is available to query single-node and multiple-node clusters of the following node types:
  + dc2.large
  + dc2.8xlarge
  + ra3.large
  + ra3.xlplus
  + ra3.4xlarge
  + ra3.16xlarge
+ The cluster must be in a virtual private cloud (VPC) based on the Amazon VPC service. 
+ By default, users with the same IAM role as the runner of an `ExecuteStatement` or `BatchExecuteStatement` API operation can act on the same statement with `CancelStatement`, `DescribeStatement`, `GetStatementResult`, `GetStatementResultV2`, and `ListStatements` API operations. To act on the same SQL statement from another user, the user must be able to assume the IAM role of the user who ran the SQL statement. For more information about how to assume a role, see [Authorizing access to the Amazon Redshift Data API](data-api-access.md). 
+ The SQL statements in the `Sqls` parameter of `BatchExecuteStatement` API operation are run as a single transaction. They run serially in the order of the array. Subsequent SQL statements don't start until the previous statement in the array completes. If any SQL statement fails, then because they are run as one transaction, all work is rolled back.
+ The maximum retention time for a client token used in `ExecuteStatement` or `BatchExecuteStatement` API operation is 8 hours.
+ If an Amazon Redshift provisioned cluster or Redshift Serverless workgroup is encrypted using a customer managed key, Redshift creates a grant that allows the Redshift Data API to use the key for its operations. For more information, see [Using Amazon KMS with the Amazon Redshift Data API](data-api-kms.md). 
+ Each API in the Redshift Data API has a transactions-per-second quota before throttling requests. For the quota, see [Quotas for Amazon Redshift Data API](amazon-redshift-limits.md#data-api-quotas-account). If the rate of requests exceeds the quota, a `ThrottlingException` with HTTP status code 400 is returned. To respond to throttling, use a retry strategy as described in [Retry behavior](https://docs.amazonaws.cn/sdkref/latest/guide/feature-retry-behavior.html) in the *Amazon SDKs and Tools Reference Guide*. This strategy is implemented automatically for throttling errors in some Amazon SDKs.
**Note**  
By default in Amazon Step Functions, retries are not enabled. If you need to call a Redshift Data API in a Step Functions state machine, then include the `ClientToken` idempotency parameter in your Redshift Data API call. The value of the `ClientToken` needs to persist among retries. In the following example snippet of a request to the `ExecuteStatement` API, the expression `States.ArrayGetItem(States.StringSplit($$.Execution.Id, ':'), 7)` uses an intrinsic function to extract the UUID part of the `$$.Execution.Id`, which is unique for each execution of the state machine. For more information, see [Intrinsic functions](https://docs.amazonaws.cn/step-functions/latest/dg/amazon-states-language-intrinsic-functions.html) in the *Amazon Step Functions Developer Guide*.  

  ```
  {
    "Database": "dev",
    "Sql": "select 1;",
    "ClusterIdentifier": "MyCluster",
    "ClientToken.$": "States.ArrayGetItem(States.StringSplit($$.Execution.Id, ':'), 7)"
  }
  ```
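The backoff-and-retry strategy for `ThrottlingException` errors can be sketched in Python. This is a minimal illustration, not SDK behavior; the exception class and the stubbed `flaky_execute_statement` call are hypothetical stand-ins for a real Data API client:

```python
import random
import time

class ThrottlingException(Exception):
    """Hypothetical stand-in for the service's throttling error (HTTP status code 400)."""

def call_with_backoff(operation, max_attempts=5, base_delay=0.01):
    """Retry `operation` with exponential backoff and full jitter on throttling."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ThrottlingException:
            if attempt == max_attempts - 1:
                raise  # give up after the last attempt
            # Sleep a random amount up to base_delay * 2^attempt.
            time.sleep(random.uniform(0, base_delay * 2 ** attempt))

# Stub that is throttled twice before succeeding.
calls = {"n": 0}
def flaky_execute_statement():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ThrottlingException()
    return {"Id": "statement-1"}

result = call_with_backoff(flaky_execute_statement)
```

Amazon SDKs that implement retry behavior apply an equivalent strategy automatically.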

## Choosing database authentication credentials when calling the Amazon Redshift Data API

When you call the Data API, you use one of the following authentication methods for some API operations. Each method requires a different combination of parameters. 

**Amazon IAM Identity Center**  
The Data API can be accessed with a single sign-on user registered in Amazon IAM Identity Center. For information about the steps to set up IAM Identity Center, see [Using Data API with trusted identity propagation](data-api-trusted-identity-propagation.md).

**Amazon Secrets Manager**  
With this method, provide the `secret-arn` of a secret stored in Amazon Secrets Manager that contains a `username` and `password`. The secret holds the credentials to connect to the `database` you specify. Whether you connect to a cluster or a serverless workgroup, you also supply the database name. If you provide a cluster identifier (`dbClusterIdentifier`), it must match the cluster identifier stored in the secret. For more information, see [Storing database credentials in Amazon Secrets Manager](data-api-secrets.md).   
With this method, you can also supply a `region` value that specifies the Amazon Web Services Region where your data is located. 

**Temporary credentials**  
With this method, choose one of the following options:  
+ When connecting to a serverless workgroup, specify the workgroup name and database name. The database user name is derived from the IAM identity. For example, `arn:iam::123456789012:user:foo` has the database user name `IAM:foo`. Also, permission to call the `redshift-serverless:GetCredentials` operation is required.
+ When connecting to a cluster as an IAM identity, specify the cluster identifier and the database name. The database user name is derived from the IAM identity. For example, `arn:iam::123456789012:user:foo` has the database user name `IAM:foo`. Also, permission to call the `redshift:GetClusterCredentialsWithIAM` operation is required.
+ When connecting to a cluster as a database user, specify the cluster identifier, the database name, and the database user name. Also, permission to call the `redshift:GetClusterCredentials` operation is required. For information about how to join database groups when connecting with this method, see [Joining database groups when connecting to a cluster](data-api-dbgroups.md).
With this method, you can also supply a `region` value that specifies the Amazon Web Services Region where your data is located. 
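The three authentication methods differ mainly in which parameters you pass. The following sketch shows one way to assemble each combination; the dictionary keys mirror the Data API operation parameters, while the helper itself is hypothetical:

```python
def data_api_auth_params(*, database, secret_arn=None, workgroup_name=None,
                         cluster_identifier=None, db_user=None):
    """Return the request parameters for one of the authentication methods."""
    params = {"Database": database}
    if secret_arn:
        # Amazon Secrets Manager: the secret supplies username and password.
        params["SecretArn"] = secret_arn
        if cluster_identifier:
            params["ClusterIdentifier"] = cluster_identifier
    elif workgroup_name:
        # Temporary credentials against a serverless workgroup.
        params["WorkgroupName"] = workgroup_name
    elif cluster_identifier:
        # Temporary credentials against a cluster, optionally as a database user.
        params["ClusterIdentifier"] = cluster_identifier
        if db_user:
            params["DbUser"] = db_user
    return params
```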

## Mapping JDBC data types when calling the Amazon Redshift Data API

The following table maps Java Database Connectivity (JDBC) data types to the data types you specify in Data API calls.


|  JDBC data type  |  Data API data type  | 
| --- | --- | 
|  `INTEGER, SMALLINT, BIGINT`  |  `LONG`  | 
|  `FLOAT, REAL, DOUBLE`  |  `DOUBLE`  | 
|  `DECIMAL`  |  `STRING`  | 
|  `BOOLEAN, BIT`  |  `BOOLEAN`  | 
|  `BLOB, BINARY, LONGVARBINARY`  |  `BLOB`  | 
|  `VARBINARY`  |  `STRING`  | 
|  `CLOB`  |  `STRING`  | 
|  Other types (including types related to date and time)  |  `STRING`  | 

String values are passed to the Amazon Redshift database and implicitly converted into a database data type.

**Note**  
Currently, the Data API doesn't support arrays of universal unique identifiers (UUIDs).
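Result values come back wrapped in typed fields that follow the mapping above (for example, `{"longValue": 1}` for a `LONG`). A small helper can unwrap them; this is a sketch, though the field names match the Data API result format:

```python
def field_value(field):
    """Unwrap one Data API result field into a plain Python value."""
    if field.get("isNull"):
        return None  # SQL NULL is signaled by the isNull flag
    for key in ("longValue", "doubleValue", "booleanValue", "blobValue", "stringValue"):
        if key in field:
            return field[key]
    raise ValueError(f"unrecognized field: {field!r}")
```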

## Running SQL statements with parameters when calling the Amazon Redshift Data API

You can control the SQL text submitted to the database engine by calling Data API operations with parameters for parts of the SQL statement. Named parameters provide a flexible way to pass in values without hardcoding them in the SQL text. They help you reuse SQL text and avoid SQL injection problems.

The following example shows the named parameters of a `parameters` field of an `execute-statement` or `batch-execute-statement` Amazon CLI command.

```
--parameters "[{\"name\": \"id\", \"value\": \"1\"},{\"name\": \"address\", \"value\": \"Seattle\"}]"
```

Consider the following when using named parameters:
+ Named parameters can only be used to replace values in SQL statements.
  + You can replace the values in an INSERT statement, such as `INSERT INTO mytable VALUES(:val1)`.

    The named parameters can be in any order, and a parameter can be used more than one time in the SQL text. With the parameters option shown in the previous example, the values `1` and `Seattle` are inserted into the table columns `id` and `address`. In the SQL text, you specify the named parameters as follows:

    ```
    --sql "insert into mytable values (:id, :address)"
    ```
  + You can replace the values in a conditions clause, such as `WHERE attr >= :val1`, `WHERE attr BETWEEN :val1 AND :val2`, and `HAVING COUNT(attr) > :val`.
  + You can't replace column names in an SQL statement, such as `SELECT column-name`, `ORDER BY column-name`, or `GROUP BY column-name`.

    For example, the following SELECT statement fails with invalid syntax.

    ```
    --sql "SELECT :colname, FROM event" --parameters "[{\"name\": \"colname\", \"value\": \"eventname\"}]"
    ```

    If you describe the statement (the `describe-statement` operation) after the syntax error, the `QueryString` returned does not substitute the column name for the parameter (`"QueryString": "SELECT :colname, FROM event"`) and an error is reported (`ERROR: syntax error at or near "FROM" Position: 12`).
  + You can't replace column names in an aggregate function, such as `COUNT(column-name)`, `AVG(column-name)`, or `SUM(column-name)`.
  + You can't replace column names in a JOIN clause.
+ When the SQL runs, data is implicitly cast to a data type. For more information about data type casting, see [Data types](https://docs.amazonaws.cn/redshift/latest/dg/c_Supported_data_types.html) in the *Amazon Redshift Database Developer Guide*. 
+ You can't set a value to NULL. The Data API interprets it as the literal string `NULL`. For example, the following replaces `id` with the literal string `null`, not the SQL NULL value. 

  ```
  --parameters "[{\"name\": \"id\", \"value\": \"null\"}]"
  ```
+ You can't set a zero-length value; the Data API SQL statement fails. The following example tries to set `id` to a zero-length value, which results in a failure of the SQL statement. 

  ```
  --parameters "[{\"name\": \"id\", \"value\": \"\"}]"
  ```
+ You can't set a table name in the SQL statement with a parameter. The Data API follows the rule of the JDBC `PreparedStatement`. 
+ The output of the `describe-statement` operation returns the query parameters of a SQL statement.
+ Both the `execute-statement` and `batch-execute-statement` operations support SQL statements with parameters. When using `batch-execute-statement`, parameters are shared across all SQL statements in the batch. Each SQL statement can reference a subset of the provided parameters, but every parameter must be used by at least one SQL statement.
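The rules above can be checked client-side before submitting a statement. The following sketch builds the `parameters` JSON shown earlier and flags parameters that no `:name` placeholder references; both helpers are hypothetical, and the Data API itself enforces the rules server-side:

```python
import json
import re

def build_parameters(values):
    """Serialize a dict into the JSON accepted by the --parameters option."""
    return json.dumps([{"name": k, "value": v} for k, v in values.items()])

def unused_parameters(sql, values):
    """Return parameter names that never appear as :name in the SQL text;
    every parameter must be used by at least one statement."""
    used = set(re.findall(r":(\w+)", sql))
    return sorted(set(values) - used)
```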

## Running SQL statements with an idempotency token when calling the Amazon Redshift Data API

When you make a mutating API request, the request typically returns a result before the operation's asynchronous workflows have completed. Operations might also time out or encounter other server issues before they complete, even though the request has already returned a result. This could make it difficult to determine whether the request succeeded or not, and could lead to multiple retries to ensure that the operation completes successfully. However, if the original request and the subsequent retries are successful, the operation is completed multiple times. This means that you might update more resources than you intended.

*Idempotency* ensures that an API request completes no more than one time. With an idempotent request, if the original request completes successfully, any subsequent retries complete successfully without performing any further actions. The Data API `ExecuteStatement` and `BatchExecuteStatement` operations have an optional `ClientToken` idempotency parameter. The `ClientToken` expires after 8 hours.

**Important**  
If you call `ExecuteStatement` and `BatchExecuteStatement` operations from an Amazon SDK, it automatically generates a client token to use on retry. In this case, we don't recommend using the `client-token` parameter with `ExecuteStatement` and `BatchExecuteStatement` operations. View the CloudTrail log to see the `ClientToken`. For a CloudTrail log example, see [Amazon Redshift Data API examples](logging-with-cloudtrail.md#data-api-cloudtrail).

The following `execute-statement` Amazon CLI command illustrates the optional `client-token` parameter for idempotency.

```
aws redshift-data execute-statement 
    --secret-arn arn:aws:secretsmanager:us-west-2:123456789012:secret:myuser-secret-hKgPWn 
    --cluster-identifier mycluster-test 
    --sql "select * from stl_query limit 1" 
    --database dev 
    --client-token b855dced-259b-444c-bc7b-d3e8e33f94g1
```

The following table shows some common responses that you might get for idempotent API requests, and provides retry recommendations.


| Response | Recommendation | Comments | 
| --- | --- | --- | 
|  200 (OK)  |  Do not retry  |  The original request completed successfully. Any subsequent retries return successfully.  | 
|  400-series response codes   |  Do not retry  |  There is a problem with the request, from among the following:  [\[See the AWS documentation website for more details\]](http://docs.amazonaws.cn/en_us/redshift/latest/mgmt/data-api.html) If the request involves a resource that is in the process of changing states, retrying the request could possibly succeed.  | 
|  500-series response codes   |  Retry  |  The error is caused by an Amazon server-side issue and is generally transient. Repeat the request with an appropriate backoff strategy.  | 

For information about Amazon Redshift response codes, see [Common Errors](https://docs.amazonaws.cn/redshift/latest/APIReference/CommonErrors.html) in the *Amazon Redshift API Reference*.

## Running SQL statements with session reuse when calling the Amazon Redshift Data API

When you make an API request to run a SQL statement, the session where the SQL runs is usually terminated when the SQL finishes. To keep the session active for a specified number of seconds, the Data API `ExecuteStatement` and `BatchExecuteStatement` operations have an optional `SessionKeepAliveSeconds` parameter. A `SessionId` response field contains the identity of the session, which you can then use in subsequent `ExecuteStatement` and `BatchExecuteStatement` operations. In subsequent calls, you can specify another `SessionKeepAliveSeconds` value to change the idle timeout. If you don't change `SessionKeepAliveSeconds`, the initial idle timeout setting remains. Consider the following when using session reuse:
+ The maximum value of `SessionKeepAliveSeconds` is 24 hours.
+ The session can last for at most 24 hours. After 24 hours the session is forcibly closed and in-progress queries are terminated.
+ The maximum number of sessions per Amazon Redshift cluster or Redshift Serverless workgroup is 500.
+ You can only run one query at a time in a session. You need to wait until the query is finished to run the next query in the same session. That is, you cannot run queries in parallel in a provided session.
+ The Data API can't queue queries for a given session.

To retrieve the `SessionId` that is used by calls to `ExecuteStatement` and `BatchExecuteStatement` operations, call `DescribeStatement` and `ListStatements` operations.

The following example demonstrates using the `SessionKeepAliveSeconds` and `SessionId` parameters to keep a session alive and reuse it. First, call the `execute-statement` Amazon CLI command with the optional `session-keep-alive-seconds` parameter set to `2`.

```
aws redshift-data execute-statement 
    --session-keep-alive-seconds 2 
    --sql "select 1" 
    --database dev 
    --workgroup-name mywg
```

The response contains the session identifier.

```
{
    "WorkgroupName": "mywg",
    "CreatedAt": 1703022996.436,
    "Database": "dev",
    "DbUser": "awsuser",
    "Id": "07c5ffea-76d6-4786-b62c-4fe3ef529680",
    "SessionId": "5a254dc6-4fc2-4203-87a8-551155432ee4"
}
```

Then, call the `execute-statement` Amazon CLI command with the `SessionId` returned from the first call. Optionally, specify the `session-keep-alive-seconds` parameter set to `10` to change the idle timeout value.

```
aws redshift-data execute-statement 
    --sql "select 1" 
    --session-id 5a254dc6-4fc2-4203-87a8-551155432ee4
    --session-keep-alive-seconds 10
```
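Chaining the two calls above amounts to carrying the `SessionId` from the first response into the next request. A sketch follows; the helper and its key names are illustrative, mirroring the CLI options:

```python
def next_statement_params(previous_response, sql, keep_alive_seconds=None):
    """Build follow-up parameters that reuse the session from a prior response."""
    params = {"Sql": sql, "SessionId": previous_response["SessionId"]}
    if keep_alive_seconds is not None:
        # Optionally change the idle timeout; otherwise the initial setting remains.
        params["SessionKeepAliveSeconds"] = keep_alive_seconds
    return params

first_response = {
    "Id": "07c5ffea-76d6-4786-b62c-4fe3ef529680",
    "SessionId": "5a254dc6-4fc2-4203-87a8-551155432ee4",
}
followup = next_statement_params(first_response, "select 1", keep_alive_seconds=10)
```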

## Fetching the results of SQL statements

You use different Data API operations to fetch SQL results depending on the result format. When you call `ExecuteStatement` and `BatchExecuteStatement` operations, you can specify whether the results are formatted as JSON or CSV. If you don't specify, the default is JSON. To retrieve JSON results, use the `GetStatementResult` operation. To retrieve CSV results, use the `GetStatementResultV2` operation.

Results returned in JSON format are records that include metadata about each column. Each record is in JSON format. For example, the response from `GetStatementResult` looks similar to this:

```
{
   "ColumnMetadata": [ 
      { 
         "isCaseSensitive": false,
         "isCurrency": false,
         "isSigned": true,
         "label": "?column?",
         "name": "?column?",
         "nullable": 1,
         "precision": 10,
         "scale": 0,
         "schemaName": "",
         "tableName": "",
         "typeName": "int4",
         "length": 0
      }
   ],
   "NextToken": "<token>",
   "Records": [
        [
            {
                "longValue": 1
            }
        ]
    ],
   "TotalNumRows": <number>
}
```
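A response like the one above can be flattened into row dictionaries by pairing each record's typed fields with the column names from `ColumnMetadata`. A minimal sketch (the helper is hypothetical):

```python
def records_to_rows(response):
    """Convert GetStatementResult Records into a list of {column: value} dicts."""
    names = [col["name"] for col in response["ColumnMetadata"]]
    rows = []
    for record in response["Records"]:
        row = {}
        for name, field in zip(names, record):
            # Each field is a one-key dict such as {"longValue": 1}.
            row[name] = None if field.get("isNull") else next(iter(field.values()))
        rows.append(row)
    return rows

sample = {
    "ColumnMetadata": [{"name": "?column?"}],
    "Records": [[{"longValue": 1}]],
}
rows = records_to_rows(sample)
```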

Results returned in CSV format are records that include metadata about each column. Results are returned in 1 MB chunks, where each chunk can store any number of rows in CSV format. Each request returns up to 15 MB of results. If results are greater than 15 MB, then a next page token is returned to continue retrieving the results. For example, the response from `GetStatementResultV2` looks similar to this:

```
{
    "ColumnMetadata": [
        {
            "isCaseSensitive": false,
            "isCurrency": false,
            "isSigned": true,
            "label": "?column?",
            "name": "?column?",
            "nullable": 1,
            "precision": 10,
            "scale": 0,
            "schemaName": "",
            "tableName": "",
            "typeName": "int4",
            "length": 0
        },
        {
            "isCaseSensitive": false,
            "isCurrency": false,
            "isSigned": true,
            "label": "?column?",
            "name": "?column?",
            "nullable": 1,
            "precision": 10,
            "scale": 0,
            "schemaName": "",
            "tableName": "",
            "typeName": "int4",
            "length": 0
        },
        {
            "isCaseSensitive": false,
            "isCurrency": false,
            "isSigned": true,
            "label": "?column?",
            "name": "?column?",
            "nullable": 1,
            "precision": 10,
            "scale": 0,
            "schemaName": "",
            "tableName": "",
            "typeName": "int4",
            "length": 0
        }
    ],
    "NextToken": "<token>",
    "Records": [
        [
            {
                "CSVRecords":"1,2,3\r\n4,5,6\r\n7,8,9\r\n .... 1MB" // First 1MB Chunk
            },
            {
                "CSVRecords":"1025,1026,1027\r\n1028,1029,1030\r\n....2MB" // Second 1MB chunk
            }
            ...
        ]
    ],
    "ResultFormat" : "CSV",
    "TotalNumRows": <number>
}
```
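Because each `CSVRecords` chunk can end mid-row, concatenate the chunks before parsing. A sketch using the standard `csv` module (the helper is hypothetical):

```python
import csv
import io

def csv_rows(response):
    """Join the CSVRecords chunks from GetStatementResultV2 and parse them."""
    text = "".join(chunk["CSVRecords"]
                   for page in response["Records"]
                   for chunk in page)
    return list(csv.reader(io.StringIO(text)))

sample = {
    "Records": [[
        {"CSVRecords": "1,2,3\r\n4,5"},  # a chunk may stop in the middle of a row
        {"CSVRecords": ",6\r\n"},
    ]],
    "ResultFormat": "CSV",
}
rows = csv_rows(sample)
```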

# Authorizing access to the Amazon Redshift Data API

To access the Data API, a user must be authorized. You can authorize a user to access the Data API by adding a managed policy, which is a predefined Amazon Identity and Access Management (IAM) policy, to that user. As a best practice, we recommend attaching permissions policies to an IAM role and then assigning it to users and groups as needed. For more information, see [Identity and access management in Amazon Redshift](https://docs.amazonaws.cn/redshift/latest/mgmt/redshift-iam-authentication-access-control.html). To see the permissions allowed and denied by managed policies, see the IAM console ([https://console.amazonaws.cn/iam/](https://console.amazonaws.cn/iam/)). 

# Configuring IAM permissions


Amazon Redshift provides the `AmazonRedshiftDataFullAccess` managed policy. This policy provides full access to Amazon Redshift Data API operations. This policy also allows scoped access to specific Amazon Redshift, Amazon Secrets Manager, and IAM API operations needed to authenticate and access an Amazon Redshift cluster or Redshift Serverless workgroup. 

You can also create your own IAM policy that allows access to specific resources. To create your policy, use the `AmazonRedshiftDataFullAccess` policy as your starting template. After you create your policy, add it to each user that requires access to the Data API.

Consider the following requirements of the IAM policy associated with the user:
+ If you use Amazon Secrets Manager to authenticate, confirm the policy allows use of the `secretsmanager:GetSecretValue` action to retrieve the secret tagged with the key `RedshiftDataFullAccess`.
+ If you use temporary credentials to authenticate to a cluster, confirm the policy allows the use of the `redshift:GetClusterCredentials` action to the database user name `redshift_data_api_user` for any database in the cluster. This user name must have already been created in your database.
+ If you use temporary credentials to authenticate to a serverless workgroup, confirm the policy allows the use of the `redshift-serverless:GetCredentials` action to retrieve the workgroup tagged with the key `RedshiftDataFullAccess`. The database user is mapped 1:1 to the source Amazon Identity and Access Management (IAM) identity. For example, the user `sample_user` is mapped to database user `IAM:sample_user`, and the IAM role `sample_role` is mapped to `IAMR:sample_role`. For more information about IAM identities, see [IAM Identities (users, user groups, and roles)](https://docs.amazonaws.cn/IAM/latest/UserGuide/id.html) in the *IAM User Guide*.
+ The IAM action `redshift-data:GetStatementResult` allows access to both `GetStatementResult` and `GetStatementResultV2` API operations.

The following links provide more information about Amazon Identity and Access Management in the *IAM User Guide*.
+ For information about creating an IAM role, see [Creating IAM roles](https://docs.amazonaws.cn/IAM/latest/UserGuide/id_roles_create.html). 
+ For information about creating an IAM policy, see [Creating IAM policies](https://docs.amazonaws.cn/IAM/latest/UserGuide/access_policies_create.html).
+ For information about adding an IAM policy to a user, see [Adding and removing IAM identity permissions](https://docs.amazonaws.cn/IAM/latest/UserGuide/access_policies_manage-attach-detach.html). 

## Run a query on a cluster that is owned by another account


To run a query on a cluster that is owned by another account, the owning account must provide an IAM role that the Data API can assume in the calling account. For example, suppose Account B owns a cluster that Account A needs to access. Account B can attach the Amazon managed policy `AmazonRedshiftDataFullAccess` to Account B's IAM role. Then Account B trusts Account A using a trust policy such as the following:


```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": [
                    "arn:aws-cn:iam::111122223333:role/someRoleA"
                ]
            },
            "Action": "sts:AssumeRole"
        }
    ]
}
```


Finally, the Account A IAM role needs to be able to assume the Account B IAM role.


```
{
    "Version": "2012-10-17",
    "Statement": {
        "Effect": "Allow",
        "Action": "sts:AssumeRole",
        "Resource": "arn:aws-cn:iam::111122223333:role/someRoleB"
    }
}
```


## Specify an IAM role that restricts resources to Redshift Serverless workgroups and Amazon Redshift clusters in an Amazon Web Services account


You can specify resource ARNs in your identity-based policy to control access to Redshift Serverless workgroups and Amazon Redshift clusters in an Amazon Web Services account. This example shows how you might create a policy that allows access to the Data API for only the workgroup and clusters in the specified Amazon Web Services account.


```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "redshift-data:CancelStatement",
                "redshift-data:DescribeStatement",
                "redshift-data:GetStatementResult",
                "redshift-data:ListStatements"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": "redshift-data:*",
            "Resource": [
                "arn:aws-cn:redshift:us-east-1:111122223333:workgroup/*",
                "arn:aws-cn:redshift:us-east-1:111122223333:cluster:*"
            ]
        }
    ]
}
```


## Configure an IAM policy that restricts access to SQL statement information to only the statement owner


By default, Amazon Redshift Data API treats the IAM role used when calling `ExecuteStatement` and `BatchExecuteStatement` as the owner of the SQL statement. Anyone who is allowed to assume the role is able to access information about the SQL statement, including its results. To restrict SQL statement information access to an IAM role session with a particular owner, add condition `redshift-data:statement-owner-iam-userid: "${aws:userid}"`. The following IAM policy restricts access.


```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "redshift-data:CancelStatement",
                "redshift-data:DescribeStatement",
                "redshift-data:GetStatementResult",
                "redshift-data:ListStatements"
            ],
            "Resource": "*",
            "Condition": {
                "StringEquals": {
                    "redshift-data:statement-owner-iam-userid": "${aws:userid}"
                }
            }
        }
    ]
}
```


You can use the condition `statement-owner-iam-userid` with `CancelStatement`, `DescribeStatement`, `GetStatementResult`, and `ListStatements`. For more information, see [Actions defined by Amazon Redshift Data API](https://docs.amazonaws.cn/service-authorization/latest/reference/list_amazonredshiftdataapi.html#amazonredshiftdataapi-redshift-data_statement-owner-iam-userid).

## Configure an IAM policy that restricts access to SQL results to only the session owner


By default, Amazon Redshift Data API treats the IAM role used when calling `ExecuteStatement` and `BatchExecuteStatement` as the owner of the database session that runs the SQL statement. Anyone who is allowed to assume the role is able to submit queries to the database session. To restrict session access to an IAM role session with a particular owner, add the condition `redshift-data:session-owner-iam-userid: "${aws:userid}"`.

The following IAM policy allows only the session owner to run SQL statements in the session. The condition `session-owner-iam-userid` limits resource access to the specified `userid`.


```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [ 
                "redshift-data:ExecuteStatement",
                "redshift-data:BatchExecuteStatement"
            ],
            "Resource": "*",
            "Condition": {
                "StringEquals": {
                    "redshift-data:session-owner-iam-userid": "${aws:userid}"
                }
            }
        }
    ]
}
```

------

You can use the condition `session-owner-iam-userid` with `ExecuteStatement` and `BatchExecuteStatement`. For more information, see [Actions defined by Amazon Redshift Data API](https://docs.amazonaws.cn/service-authorization/latest/reference/list_amazonredshiftdataapi.html#amazonredshiftdataapi-redshift-data_statement-owner-iam-userid).

# Storing database credentials in Amazon Secrets Manager
Storing credentials in a secret

When you call the Data API, you can pass credentials for the cluster or serverless workgroup by using a secret in Amazon Secrets Manager. To pass credentials in this way, you specify the name of the secret or the Amazon Resource Name (ARN) of the secret. 

To store credentials with Secrets Manager, you need permissions from the `SecretsManagerReadWrite` managed policy. For more information about the minimum permissions, see [Creating and Managing Secrets with Amazon Secrets Manager](https://docs.amazonaws.cn/secretsmanager/latest/userguide/managing-secrets.html) in the *Amazon Secrets Manager User Guide*. 

**To store your credentials in a secret for an Amazon Redshift cluster**

1. Use the Amazon Secrets Manager console to create a secret that contains credentials for your cluster:
   + When you choose **Store a new secret**, choose **Credentials for Redshift cluster**. 
   + Store your values for **User name** (database user), **Password**, and **DB cluster** (cluster identifier) in your secret. 
   + Tag the secret with the key `RedshiftDataFullAccess`. The Amazon managed policy `AmazonRedshiftDataFullAccess` only allows the action `secretsmanager:GetSecretValue` for secrets tagged with the key `RedshiftDataFullAccess`. 

   For instructions, see [Creating a Basic Secret](https://docs.amazonaws.cn/secretsmanager/latest/userguide/manage_create-basic-secret.html) in the *Amazon Secrets Manager User Guide*.

1. Use the Amazon Secrets Manager console to view the details for the secret you created, or run the `aws secretsmanager describe-secret` Amazon CLI command.

   Note the name and ARN of the secret. You can use these in calls to the Data API.

**To store your credentials in a secret for a serverless workgroup**

1. Use Amazon Secrets Manager Amazon CLI commands to store a secret that contains credentials for your serverless workgroup:
   + Create your secret in a file, for example a JSON file named `mycreds.json`. Provide the values for **User name** (database user) and **Password** in the file.

     ```
     {
           "username": "myusername",
           "password": "mypassword"
     }
     ```
   + Store your values in your secret and tag the secret with the key `RedshiftDataFullAccess`.

     ```
     aws secretsmanager create-secret --name MyRedshiftSecret  --tags Key="RedshiftDataFullAccess",Value="serverless" --secret-string file://mycreds.json
     ```

     The following shows the output.

     ```
     {
         "ARN": "arn:aws:secretsmanager:region:accountId:secret:MyRedshiftSecret-mvLHxf",
         "Name": "MyRedshiftSecret",
         "VersionId": "a1603925-e8ea-4739-9ae9-e509eEXAMPLE"
     }
     ```

   For more information, see [Creating a Basic Secret with Amazon CLI](https://docs.amazonaws.cn/secretsmanager/latest/userguide/manage_create-basic-secret.html#proc-create-api) in the *Amazon Secrets Manager User Guide*.

1. Use the Amazon Secrets Manager console to view the details for the secret you created, or run the `aws secretsmanager describe-secret` Amazon CLI command.

   Note the name and ARN of the secret. You can use these in calls to the Data API.
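The secret payload for a serverless workgroup must contain exactly the `username` and `password` keys shown above. The following is a minimal sketch (the helper names are hypothetical, not part of any Amazon API) that builds and validates the JSON payload locally before you pass it to `aws secretsmanager create-secret --secret-string`:

```python
import json

REQUIRED_KEYS = {"username", "password"}

def build_secret_string(username: str, password: str) -> str:
    """Serialize Data API credentials in the shape a serverless
    workgroup secret expects (username/password keys only)."""
    return json.dumps({"username": username, "password": password})

def validate_secret_string(secret_string: str) -> bool:
    """Check that a secret string is valid JSON containing exactly
    the keys the Data API needs for a serverless workgroup."""
    try:
        payload = json.loads(secret_string)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and set(payload) == REQUIRED_KEYS

secret = build_secret_string("myusername", "mypassword")
print(validate_secret_string(secret))  # True
```

Validating the payload locally catches malformed JSON before Secrets Manager stores a secret that the Data API can't use.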

# Creating an Amazon VPC endpoint (Amazon PrivateLink) for the Data API
Creating an Amazon VPC endpoint

Amazon Virtual Private Cloud (Amazon VPC) enables you to launch Amazon resources, such as Amazon Redshift clusters and applications, into a virtual private cloud (VPC). Amazon PrivateLink provides private connectivity between virtual private clouds (VPCs) and Amazon services securely on the Amazon network. Using Amazon PrivateLink, you can create VPC endpoints, which you can use to connect to services across different accounts and VPCs. For more information about Amazon PrivateLink, see [VPC Endpoint Services (Amazon PrivateLink)](https://docs.amazonaws.cn/vpc/latest/userguide/endpoint-service.html) in the *Amazon Virtual Private Cloud User Guide*.

You can call the Data API with Amazon VPC endpoints. Using an Amazon VPC endpoint keeps traffic between applications in your Amazon VPC and the Data API in the Amazon network, without using public IP addresses. Amazon VPC endpoints can help you meet compliance and regulatory requirements related to limiting public internet connectivity. For example, if you use an Amazon VPC endpoint, you can keep traffic between an application running on an Amazon EC2 instance and the Data API in the VPCs that contain them.

After you create the Amazon VPC endpoint, you can start using it without making any code or configuration changes in your application.

**To create an Amazon VPC endpoint for the Data API**

1. Sign in to the Amazon Web Services Management Console and open the Amazon VPC console at [https://console.amazonaws.cn/vpc/](https://console.amazonaws.cn/vpc/).

1. Choose **Endpoints**, and then choose **Create Endpoint**.

1. On the **Create Endpoint** page, for **Service category**, choose **Amazon services**. For **Service Name**, choose **redshift-data** (`com.amazonaws.region.redshift-data`).

1. For **VPC**, choose the VPC to create the endpoint in.

   Choose the VPC that contains the application that makes Data API calls.

1. For **Subnets**, choose the subnet for each Availability Zone (AZ) used by the Amazon service that is running your application.

   To create an Amazon VPC endpoint, specify the private IP address range in which the endpoint is accessible. To do this, choose the subnet for each Availability Zone. Doing so restricts the VPC endpoint to the private IP address range specific to each Availability Zone and also creates an Amazon VPC endpoint in each Availability Zone.

1. For **Enable DNS name**, select **Enable for this endpoint**.

   Private DNS resolves the standard Data API DNS hostname (`https://redshift-data.region.amazonaws.com`) to the private IP addresses associated with the DNS hostname specific to your Amazon VPC endpoint. As a result, you can access the Data API VPC endpoint using the Amazon CLI or Amazon SDKs without making any code or configuration changes to update the Data API endpoint URL.

1. For **Security group**, choose a security group to associate with the Amazon VPC endpoint.

   Choose the security group that allows access to the Amazon service that is running your application. For example, if an Amazon EC2 instance is running your application, choose the security group that allows access to the Amazon EC2 instance. The security group enables you to control the traffic to the Amazon VPC endpoint from resources in your VPC.

1. Choose **Create endpoint**.

After the endpoint is created, choose the link in the Amazon Web Services Management Console to view the endpoint details.

The endpoint **Details** tab shows the DNS hostnames that were generated while creating the Amazon VPC endpoint.

You can use the standard endpoint (`redshift-data.region.amazonaws.com`) or one of the VPC-specific endpoints to call the Data API within the Amazon VPC. The standard Data API endpoint automatically routes to the Amazon VPC endpoint. This routing occurs because the Private DNS hostname was enabled when the Amazon VPC endpoint was created.

When you use an Amazon VPC endpoint in a Data API call, all traffic between your application and the Data API remains in the Amazon VPCs that contain them. You can use an Amazon VPC endpoint for any type of Data API call. For information about calling the Data API, see [Considerations when calling the Amazon Redshift Data API](data-api.md#data-api-calling-considerations).

# Joining database groups when connecting to a cluster
Joining database groups

Database groups are collections of database users. Database privileges can be granted to groups. An administrator can configure an IAM role such that these database groups are taken into account when your SQL runs with the Data API. For more information about database groups, see [Groups](https://docs.amazonaws.cn/redshift/latest/dg/r_Groups.html) in the *Amazon Redshift Database Developer Guide*. 

You can configure a Data API caller's IAM role so that the database user specified in the call joins database groups when the Data API connects to a cluster. This capability is only supported when connecting to provisioned clusters. It's not supported when connecting to Redshift Serverless workgroups. The IAM role of the caller of the Data API must also allow the `redshift:JoinGroup` action.

Configure this by adding tags to IAM roles. The administrator of the caller's IAM role adds a tag with the key `RedshiftDbGroups` and a value that is a colon (:) separated list of database group names, up to a total length of 256 characters. The database groups must be previously defined in the connected database. If any specified group is not found in the database, it's ignored. For example, for database groups `accounting` and `retail`, the tag value is `accounting:retail`. The Data API uses the tag key-value pair `{"Key":"RedshiftDbGroups","Value":"accounting:retail"}` to determine which database groups are associated with the database user provided in the call to the Data API.
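The tag value described above can be assembled with a small helper. This is an illustrative sketch (the function name is hypothetical); it joins group names with colons and enforces the 256-character limit:

```python
def build_db_groups_tag_value(groups):
    """Join database group names into the colon-separated value
    used with the RedshiftDbGroups tag key."""
    value = ":".join(groups)
    # Groups not defined in the database are ignored by the Data API,
    # but the tag value itself is limited to 256 characters in total.
    if len(value) > 256:
        raise ValueError("RedshiftDbGroups tag value exceeds 256 characters")
    return value

print(build_db_groups_tag_value(["accounting", "retail"]))  # accounting:retail
```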

**To join database groups**

1. Sign in to the Amazon Web Services Management Console and open the IAM console at [https://console.amazonaws.cn/iam/](https://console.amazonaws.cn/iam/).

1. In the navigation pane of the console, choose **Roles** and then choose the name of the role that you want to edit.

1. Choose the **Tags** tab, then choose **Manage tags**.

1. Choose **Add tag**, then add the key **RedshiftDbGroups** and a value that is a colon-separated list of database groups, such as `accounting:retail`.

1. Choose **Save changes**.

   Now when an IAM principal (with this IAM role attached) calls the Data API, the specified database user joins the database groups specified in the IAM role.

For more information on how to attach a tag to a principal, including IAM roles and IAM users, see [Tagging IAM resources](https://docs.amazonaws.cn/IAM/latest/UserGuide/id_tags.html) in the *IAM User Guide*. 

# Using Data API with trusted identity propagation
Trusted identity propagation

As an Amazon Redshift account administrator, you can integrate your Amazon Redshift cluster or workgroup with Amazon IAM Identity Center, which helps manage your workforce access to Amazon Redshift with single sign-on. For more information, see [Setting up Amazon IAM Identity Center integration with Amazon Redshift](redshift-iam-access-control-idp-connect-console.md). The Amazon Redshift Data API supports propagating IAM Identity Center user identities to an Amazon Redshift cluster or workgroup, and down the chain to other services, such as Amazon Lake Formation. You can set up and query using the Data API by following the steps in [Access Amazon services programmatically using trusted identity propagation](https://amazonaws-china.com/blogs/security/access-aws-services-programmatically-using-trusted-identity-propagation/).

When you call the Data API using an IAM Identity Center user identity from an identity-enhanced IAM role session, you can only access the resulting statement and statement result using the same IAM Identity Center user. For example, the following Amazon CLI command calls the `execute-statement` operation to run a SQL command with trusted identity propagation.

```
aws redshift-data execute-statement 
--sql "select current_user;" 
--cluster-identifier mycluster
--database dev
```

The following Amazon CLI command calls the `batch-execute-statement` operation to run two SQL commands.

```
aws redshift-data batch-execute-statement 
--sqls  "select current_user;"  "select current_date;"
--cluster-identifier mycluster
--database dev
```

To access statements with `cancel-statement`, `describe-statement`, `get-statement-result`, and `get-statement-result-v2` submitted by identity-enhanced IAM role sessions, the IAM Identity Center user and IAM role must match the credentials used to run `execute-statement` or `batch-execute-statement`. For example, the following Amazon CLI command gets the results of a SQL statement.

```
aws redshift-data get-statement-result 
--id a1b2c3d4-5678-90ab-cdef-EXAMPLE11111
```

To list statements, you must provide a `cluster-identifier` or `workgroup-name` parameter to ensure that the IAM Identity Center user only has access to the Amazon Redshift IAM Identity Center applications they are assigned to. For example, the following Amazon CLI command lists statements for a specific cluster.

```
aws redshift-data list-statements
--cluster-identifier mycluster
```

You can also invoke the Data API operations that access database objects in a cluster or workgroup using trusted identity propagation. This includes the `list-databases`, `list-schemas`, `list-tables`, and `describe-table` operations.

API calls made by the IAM Identity Center user can be tracked in Amazon CloudTrail. The `onBehalfOf` section of the CloudTrail event shows the IAM Identity Center user ID and the identity store ARN. The following snippet of a CloudTrail event shows an `onBehalfOf` section with the IAM Identity Center user ID `a1b2c3d4-5678-90ab-cdef-EXAMPLE11111` and the identity store ARN `arn:aws:identitystore::123456789012:identitystore/d-9067bc44d2`.

```
{
    "eventVersion": "1.10",
    "userIdentity": {
        "type": "AssumedRole",
        ...
        "onBehalfOf": {
            "userId": "a1b2c3d4-5678-90ab-cdef-EXAMPLE11111",
            "identityStoreArn": "arn:aws:identitystore::123456789012:identitystore/d-9067bc44d2"
        }
    },
    "eventTime": "2025-01-13T04:46:27Z",
    "eventSource": "redshift-data.amazonaws.com",
    "eventName": "ExecuteStatement",
    "awsRegion": "us-east-1"
}
```

You can run the following SQL command to check the query submitted by the IAM Identity Center user. In this example, the email registered in Identity Center is `username@example.com`.

```
SELECT
    h.query_id,
    h.database_name,
    h.status,
    h.query_text,
    u.usename,
    h.start_time,
    h.end_time
FROM
    sys_query_history h
LEFT JOIN
    pg_user u
ON
    h.user_id = u.usesysid
WHERE
    u.usename = 'awsidc:username@example.com'
ORDER BY
    h.start_time DESC;
```

# Calling the Data API


You can call the Data API or the Amazon CLI to run SQL statements on your cluster or serverless workgroup. The primary operations to run SQL statements are [ExecuteStatement](https://docs.amazonaws.cn/redshift-data/latest/APIReference/API_ExecuteStatement.html) and [BatchExecuteStatement](https://docs.amazonaws.cn/redshift-data/latest/APIReference/API_BatchExecuteStatement.html) in the *Amazon Redshift Data API Reference*. The Data API supports the programming languages that are supported by the Amazon SDK. For more information on these, see [Tools to Build on Amazon](https://aws.amazon.com/tools/).

To see code examples of calling the Data API, see [Getting Started with Redshift Data API](https://github.com/aws-samples/getting-started-with-amazon-redshift-data-api#getting-started-with-redshift-data-api) in *GitHub*. This repository has examples of using Amazon Lambda to access Amazon Redshift data from Amazon EC2, Amazon Glue Data Catalog, and Amazon SageMaker Runtime. Example programming languages include Python, Go, Java, and JavaScript.

You can call the Data API using the Amazon CLI.

The following examples use the Amazon CLI to call the Data API. To run the examples, edit the parameter values to match your environment. In many of the examples a `cluster-identifier` is provided to run against a cluster. When you run against a serverless workgroup, you provide a `workgroup-name` instead. These examples demonstrate a few of the Data API operations. For more information, see the *Amazon CLI Command Reference*. 

Commands in the following examples have been split and formatted for readability. Not all parameters and responses are shown in all examples. For the API definition of the complete request syntax, request parameters, response syntax, and response elements, see the [Amazon Redshift Data API Reference](https://docs.amazonaws.cn/redshift-data/latest/APIReference/).

# Passing SQL statements to an Amazon Redshift data warehouse
Passing SQL statements to a data warehouse

The examples on this page cover different ways to pass a SQL statement to your data warehouse.

## Run a SQL statement


To run a SQL statement, use the `aws redshift-data execute-statement` Amazon CLI command.

The following Amazon CLI command runs a SQL statement against a cluster and returns an identifier to fetch the results. This example uses the Amazon Secrets Manager authentication method.

```
aws redshift-data execute-statement 
    --secret-arn arn:aws:secretsmanager:us-west-2:123456789012:secret:myuser-secret-hKgPWn 
    --cluster-identifier mycluster-test 
    --sql "select * from stl_query limit 1" 
    --database dev
```

The following is an example of the response.

```
{
    "ClusterIdentifier": "mycluster-test",
    "CreatedAt": 1598323175.823,
    "Database": "dev",
    "Id": "c016234e-5c6c-4bc5-bb16-2c5b8ff61814",
    "SecretArn": "arn:aws:secretsmanager:us-west-2:123456789012:secret:myuser-secret-hKgPWn"
}
```

The following Amazon CLI command runs a SQL statement against a cluster and returns an identifier to fetch the results. This example uses the temporary credentials authentication method.

```
aws redshift-data execute-statement 
    --db-user myuser 
    --cluster-identifier mycluster-test 
    --database dev 
    --sql "select * from stl_query limit 1"
```

The following is an example of the response.

```
{
    "ClusterIdentifier": "mycluster-test",
    "CreatedAt": 1598306924.632,
    "Database": "dev",
    "DbUser": "myuser",
    "Id": "d9b6c0c9-0747-4bf4-b142-e8883122f766"
}
```

The following Amazon CLI command runs a SQL statement against a serverless workgroup and returns an identifier to fetch the results. This example uses the temporary credentials authentication method.

```
aws redshift-data execute-statement 
    --database dev 
    --workgroup-name myworkgroup 
    --sql "select 1;"
```

The following is an example of the response.

```
{
 "CreatedAt": "2022-02-11T06:25:28.748000+00:00",
 "Database": "dev",
 "DbUser": "IAMR:RoleName",
 "Id": "89dd91f5-2d43-43d3-8461-f33aa093c41e",
 "WorkgroupName": "myworkgroup"
}
```

The following Amazon CLI command runs a SQL statement against a cluster and returns an identifier to fetch the results. This example uses the Amazon Secrets Manager authentication method and an idempotency token.

```
aws redshift-data execute-statement 
    --secret-arn arn:aws:secretsmanager:us-west-2:123456789012:secret:myuser-secret-hKgPWn 
    --cluster-identifier mycluster-test 
    --sql "select * from stl_query limit 1" 
    --database dev 
    --client-token b855dced-259b-444c-bc7b-d3e8e33f94g1
```

The following is an example of the response.

```
{
    "ClusterIdentifier": "mycluster-test",
    "CreatedAt": 1598323175.823,
    "Database": "dev",
    "Id": "c016234e-5c6c-4bc5-bb16-2c5b8ff61814",
    "SecretArn": "arn:aws:secretsmanager:us-west-2:123456789012:secret:myuser-secret-hKgPWn"
}
```
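The `client-token` value must be unique per logical request; resending a request with the same token returns the original statement instead of running it again. A minimal sketch for generating one, using a UUID (a common convention, not a Data API requirement):

```python
import uuid

def new_client_token() -> str:
    """Generate a unique idempotency token to pass as --client-token.
    Retries of the same logical request should reuse the same token."""
    return str(uuid.uuid4())

print(new_client_token())
```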

## Run multiple SQL statements


To run multiple SQL statements with one command, use the `aws redshift-data batch-execute-statement` Amazon CLI command.

The following Amazon CLI command runs three SQL statements against a cluster and returns an identifier to fetch the results. This example uses the temporary credentials authentication method.

```
aws redshift-data batch-execute-statement 
    --db-user myuser 
    --cluster-identifier mycluster-test 
    --database dev 
    --sqls "set timezone to BST" "select * from mytable" "select * from another_table"
```

The following is an example of the response.

```
{
    "ClusterIdentifier": "mycluster-test",
    "CreatedAt": 1598306924.632,
    "Database": "dev",
    "DbUser": "myuser",
    "Id": "d9b6c0c9-0747-4bf4-b142-e8883122f766"
}
```

## Run statements with parameters


You can use named parameters with both the `execute-statement` and `batch-execute-statement` operations. When using `batch-execute-statement`, parameters are shared across all SQL statements in the batch. Each SQL statement can reference a subset of the provided parameters, but every parameter must be used by at least one SQL statement.
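The `parameters` option takes a JSON array of name/value pairs, with every value passed as a string. The following sketch (the helper name is hypothetical) builds that array from keyword arguments:

```python
import json

def build_parameters(**values):
    """Serialize named parameters into the JSON array that the
    --parameters option expects; every value is sent as a string."""
    return json.dumps([{"name": k, "value": str(v)} for k, v in values.items()])

print(build_parameters(distance=5))
# [{"name": "distance", "value": "5"}]
```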

The following Amazon CLI command runs a SQL statement against a cluster and returns an identifier to fetch the results. This example uses the Amazon Secrets Manager authentication method. The SQL text has the named parameter `distance`. In this case, the distance used in the predicate is `5`. In a SELECT statement, named parameters can only be used in the predicate. Values for the named parameters in the SQL statement are specified in the `parameters` option.

```
aws redshift-data execute-statement 
    --secret-arn arn:aws:secretsmanager:us-west-2:123456789012:secret:myuser-secret-hKgPWn 
    --cluster-identifier mycluster-test 
    --sql "SELECT ratecode FROM demo_table WHERE trip_distance > :distance"  
    --parameters "[{\"name\": \"distance\", \"value\": \"5\"}]"
    --database dev
```

The following is an example of the response.

```
{
    "ClusterIdentifier": "mycluster-test",
    "CreatedAt": 1598323175.823,
    "Database": "dev",
    "Id": "c016234e-5c6c-4bc5-bb16-2c5b8ff61814",
    "SecretArn": "arn:aws:secretsmanager:us-west-2:123456789012:secret:myuser-secret-hKgPWn"
}
```

The following example uses the `EVENT` table from the sample database. For more information, see [EVENT table](https://docs.amazonaws.cn/redshift/latest/dg/r_eventtable.html) in the *Amazon Redshift Database Developer Guide*. 

If you don't already have the `EVENT` table in your database, you can create one using the Data API as follows:

```
aws redshift-data execute-statement 
--database dev
--cluster-identifier mycluster-test
--db-user awsuser
--sql "create table event( eventid integer not null distkey, 
                           venueid smallint not null, 
                           catid smallint not null, 
                           dateid smallint not null sortkey, 
                           eventname varchar(200), 
                           starttime timestamp)"
```

The following command inserts one row into the `EVENT` table. 

```
aws redshift-data execute-statement 
--database dev
--cluster-identifier mycluster-test
--db-user awsuser 
--sql "insert into event values(:eventid, :venueid::smallint, :catid, :dateid, :eventname, :starttime)" 
--parameters "[{\"name\": \"eventid\", \"value\": \"1\"}, {\"name\": \"venueid\", \"value\": \"1\"}, 
               {\"name\": \"catid\", \"value\": \"1\"}, 
               {\"name\": \"dateid\", \"value\": \"1\"}, 
               {\"name\": \"eventname\", \"value\": \"event 1\"}, 
               {\"name\": \"starttime\", \"value\": \"2022-02-22\"}]"
```

The following command inserts a second row into the `EVENT` table. This example demonstrates the following: 
+ The parameter named `id` is used four times in the SQL text.
+ Implicit type conversion is applied automatically when inserting parameter `starttime`.
+ The `venueid` column is type cast to SMALLINT data type.
+ Character strings that represent the DATE data type are implicitly converted into the TIMESTAMP data type.
+ Comments can be used within SQL text.

```
aws redshift-data execute-statement 
--database dev
--cluster-identifier mycluster-test
--db-user awsuser 
--sql "insert into event values(:id, :id::smallint, :id, :id, :eventname, :starttime) /*this is comment, and it won't apply parameterization for :id, :eventname or :starttime here*/" 
--parameters "[{\"name\": \"eventname\", \"value\": \"event 2\"}, 
               {\"name\": \"starttime\", \"value\": \"2022-02-22\"}, 
               {\"name\": \"id\", \"value\": \"2\"}]"
```

The following shows the two inserted rows:

```
 eventid | venueid | catid | dateid | eventname |      starttime
---------+---------+-------+--------+-----------+---------------------
       1 |       1 |     1 |      1 | event 1   | 2022-02-22 00:00:00
       2 |       2 |     2 |      2 | event 2   | 2022-02-22 00:00:00
```

The following command uses a named parameter in a WHERE clause to retrieve the row where `eventid` is `1`. 

```
aws redshift-data execute-statement 
--database dev
--cluster-identifier mycluster-test
--db-user awsuser 
--sql "select * from event where eventid=:id"
--parameters "[{\"name\": \"id\", \"value\": \"1\"}]"
```

Run the following command to get the SQL results of the previous SQL statement:

```
aws redshift-data get-statement-result --id 7529ad05-b905-4d71-9ec6-8b333836eb5a        
```

The command returns the following results:

```
{
    "Records": [
        [
            {
                "longValue": 1
            },
            {
                "longValue": 1
            },
            {
                "longValue": 1
            },
            {
                "longValue": 1
            },
            {
                "stringValue": "event 1"
            },
            {
                "stringValue": "2022-02-22 00:00:00.0"
            }
        ]
    ],
    "ColumnMetadata": [
        {
            "isCaseSensitive": false,
            "isCurrency": false,
            "isSigned": true,
            "label": "eventid",
            "length": 0,
            "name": "eventid",
            "nullable": 0,
            "precision": 10,
            "scale": 0,
            "schemaName": "public",
            "tableName": "event",
            "typeName": "int4"
        },
        {
            "isCaseSensitive": false,
            "isCurrency": false,
            "isSigned": true,
            "label": "venueid",
            "length": 0,
            "name": "venueid",
            "nullable": 0,
            "precision": 5,
            "scale": 0,
            "schemaName": "public",
            "tableName": "event",
            "typeName": "int2"
        },
        {
            "isCaseSensitive": false,
            "isCurrency": false,
            "isSigned": true,
            "label": "catid",
            "length": 0,
            "name": "catid",
            "nullable": 0,
            "precision": 5,
            "scale": 0,
            "schemaName": "public",
            "tableName": "event",
            "typeName": "int2"
        },
        {
            "isCaseSensitive": false,
            "isCurrency": false,
            "isSigned": true,
            "label": "dateid",
            "length": 0,
            "name": "dateid",
            "nullable": 0,
            "precision": 5,
            "scale": 0,
            "schemaName": "public",
            "tableName": "event",
            "typeName": "int2"
        },
        {
            "isCaseSensitive": true,
            "isCurrency": false,
            "isSigned": false,
            "label": "eventname",
            "length": 0,
            "name": "eventname",
            "nullable": 1,
            "precision": 200,
            "scale": 0,
            "schemaName": "public",
            "tableName": "event",
            "typeName": "varchar"
        },
        {
            "isCaseSensitive": false,
            "isCurrency": false,
            "isSigned": false,
            "label": "starttime",
            "length": 0,
            "name": "starttime",
            "nullable": 1,
            "precision": 29,
            "scale": 6,
            "schemaName": "public",
            "tableName": "event",
            "typeName": "timestamp"
        }
    ],
    "TotalNumRows": 1
}
```
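In a `get-statement-result` response, each record's typed fields pair with the `ColumnMetadata` entries by position. The following sketch (the function name is hypothetical) flattens a response into a list of dicts; it assumes each field carries a single typed key such as `longValue` or `stringValue`, as in the response above:

```python
def result_to_rows(result):
    """Flatten a get-statement-result response into a list of dicts,
    mapping each column name to its typed value."""
    names = [col["name"] for col in result["ColumnMetadata"]]
    rows = []
    for record in result["Records"]:
        # Each field is a dict like {"longValue": 1} or {"stringValue": "x"}.
        values = [next(iter(field.values())) for field in record]
        rows.append(dict(zip(names, values)))
    return rows

sample = {
    "Records": [[{"longValue": 1}, {"stringValue": "event 1"}]],
    "ColumnMetadata": [{"name": "eventid"}, {"name": "eventname"}],
}
print(result_to_rows(sample))  # [{'eventid': 1, 'eventname': 'event 1'}]
```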

The following Amazon CLI command runs two SQL statements with shared named parameters against a cluster using `batch-execute-statement`. The SQL text has named parameters `id` and `name`. The first SQL statement uses the `id` parameter and the second uses the `name` parameter.

```
aws redshift-data batch-execute-statement 
    --db-user myuser 
    --cluster-identifier mycluster-test 
    --database dev 
    --sqls "SELECT * FROM mytable WHERE id = :id" "SELECT * FROM mytable WHERE name = :name" 
    --parameters "[{\"name\": \"id\", \"value\": \"1\"},{\"name\": \"name\", \"value\": \"Alice\"}]"
```

The following is an example of the response.

```
{
    "ClusterIdentifier": "mycluster-test",
    "CreatedAt": 1598306924.632,
    "Database": "dev",
    "DbUser": "myuser",
    "Id": "d9b6c0c9-0747-4bf4-b142-e8883122f766"
}
```

The following Amazon CLI command runs two SQL statements where the same parameter is used across both statements. The parameter named `id` is used in an INSERT statement and a SELECT statement.

```
aws redshift-data batch-execute-statement 
    --db-user myuser 
    --cluster-identifier mycluster-test 
    --database dev 
    --sqls "INSERT INTO mytable VALUES(:id, :name, :address)" "SELECT * FROM mytable WHERE id = :id" 
    --parameters "[{\"name\": \"id\", \"value\": \"1\"},{\"name\": \"name\", \"value\": \"Alice\"},{\"name\": \"address\", \"value\": \"Seattle\"}]"
```

The following is an example of the response.

```
{
    "ClusterIdentifier": "mycluster-test",
    "CreatedAt": 1598306924.632,
    "Database": "dev",
    "DbUser": "myuser",
    "Id": "d9b6c0c9-0747-4bf4-b142-e8883122f766"
}
```

# List metadata about SQL statements


To list metadata about SQL statements, use the `aws redshift-data list-statements` Amazon CLI command. Authorization to run this command is based on the caller's IAM permissions.

The following Amazon CLI command lists SQL statements that ran.

```
aws redshift-data list-statements 
    --status ALL
```

The following is an example of the response.

```
{
    "Statements": [
        {
            "CreatedAt": 1598306924.632,
            "Id": "d9b6c0c9-0747-4bf4-b142-e8883122f766",
            "QueryString": "select * from stl_query limit 1",
            "Status": "FINISHED",
            "UpdatedAt": 1598306926.667
        },
        {
            "CreatedAt": 1598311717.437,
            "Id": "e0ebd578-58b3-46cc-8e52-8163fd7e01aa",
            "QueryString": "select * from stl_query limit 1",
            "Status": "FAILED",
            "UpdatedAt": 1598311719.008
        },
        {
            "CreatedAt": 1598313683.65,
            "Id": "c361d4f7-8c53-4343-8c45-6b2b1166330c",
            "QueryString": "select * from stl_query limit 1",
            "Status": "ABORTED",
            "UpdatedAt": 1598313685.495
        },
        {
            "CreatedAt": 1598306653.333,
            "Id": "a512b7bd-98c7-45d5-985b-a715f3cfde7f",
            "QueryString": "select 1",
            "Status": "FINISHED",
            "UpdatedAt": 1598306653.992
        }
    ]
}
```
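Because `list-statements` returns a `Statements` array with a `Status` on each entry, you can summarize recent runs locally. A small sketch (the helper name is hypothetical):

```python
from collections import Counter

def count_by_status(response):
    """Tally a list-statements response by each statement's Status."""
    return Counter(s["Status"] for s in response["Statements"])

sample = {"Statements": [{"Status": "FINISHED"}, {"Status": "FAILED"},
                         {"Status": "ABORTED"}, {"Status": "FINISHED"}]}
print(count_by_status(sample))  # Counter({'FINISHED': 2, 'FAILED': 1, 'ABORTED': 1})
```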

# Describe metadata about a SQL statement


To get descriptions of metadata for a SQL statement, use the `aws redshift-data describe-statement` Amazon CLI command. Authorization to run this command is based on the caller's IAM permissions. 

The following Amazon CLI command describes a SQL statement. 

```
aws redshift-data describe-statement 
    --id d9b6c0c9-0747-4bf4-b142-e8883122f766
```

The following is an example of the response.

```
{
    "ClusterIdentifier": "mycluster-test",
    "CreatedAt": 1598306924.632,
    "Duration": 1095981511,
    "Id": "d9b6c0c9-0747-4bf4-b142-e8883122f766",
    "QueryString": "select * from stl_query limit 1",
    "RedshiftPid": 20859,
    "RedshiftQueryId": 48879,
    "ResultRows": 1,
    "ResultSize": 4489,
    "Status": "FINISHED",
    "UpdatedAt": 1598306926.667
}
```
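Because Data API calls are asynchronous, a common pattern is to poll `describe-statement` until the status reaches a terminal state. The following is a minimal sketch of such a loop; `fetch_status` stands in for whatever wrapper you write around the describe call (for example, via an SDK client), so the function can be exercised here with a stub.

```python
import time

# Terminal Status values reported by describe-statement.
TERMINAL_STATES = {"FINISHED", "FAILED", "ABORTED"}

def wait_for_statement(fetch_status, poll_seconds=1.0, timeout=300.0):
    """Poll until the statement reaches a terminal state.

    fetch_status is any zero-argument callable that returns the current
    Status string -- for example, a wrapper around describe-statement.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = fetch_status()
        if status in TERMINAL_STATES:
            return status
        time.sleep(poll_seconds)
    raise TimeoutError("statement did not reach a terminal state in time")

# Demonstration with a stubbed status sequence:
states = iter(["SUBMITTED", "STARTED", "FINISHED"])
print(wait_for_statement(lambda: next(states), poll_seconds=0))  # FINISHED
```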

The following is an example of a `describe-statement` response after running a `batch-execute-statement` command with multiple SQL statements.

```
{
    "ClusterIdentifier": "mayo",
    "CreatedAt": 1623979777.126,
    "Duration": 6591877,
    "HasResultSet": true,
    "Id": "b2906c76-fa6e-4cdf-8c5f-4de1ff9b7652",
    "RedshiftPid": 31459,
    "RedshiftQueryId": 0,
    "ResultRows": 2,
    "ResultSize": 22,
    "Status": "FINISHED",
    "SubStatements": [
        {
            "CreatedAt": 1623979777.274,
            "Duration": 3396637,
            "HasResultSet": true,
            "Id": "b2906c76-fa6e-4cdf-8c5f-4de1ff9b7652:1",
            "QueryString": "select 1;",
            "RedshiftQueryId": -1,
            "ResultRows": 1,
            "ResultSize": 11,
            "Status": "FINISHED",
            "UpdatedAt": 1623979777.903
        },
        {
            "CreatedAt": 1623979777.274,
            "Duration": 3195240,
            "HasResultSet": true,
            "Id": "b2906c76-fa6e-4cdf-8c5f-4de1ff9b7652:2",
            "QueryString": "select 2;",
            "RedshiftQueryId": -1,
            "ResultRows": 1,
            "ResultSize": 11,
            "Status": "FINISHED",
            "UpdatedAt": 1623979778.076
        }
    ],
    "UpdatedAt": 1623979778.183
}
```
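As the response above shows, each sub-statement `Id` is the batch `Id` suffixed with a colon and a 1-based sequence number. A small sketch of generating those identifiers (the helper name is illustrative):

```python
def substatement_ids(batch_id: str, count: int) -> list:
    """Sub-statement Ids are the batch Id suffixed with ':1', ':2', ..."""
    return ["{}:{}".format(batch_id, n) for n in range(1, count + 1)]

# The batch above ran two SQL statements:
print(substatement_ids("b2906c76-fa6e-4cdf-8c5f-4de1ff9b7652", 2))
# ['b2906c76-fa6e-4cdf-8c5f-4de1ff9b7652:1', 'b2906c76-fa6e-4cdf-8c5f-4de1ff9b7652:2']
```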

# Fetch the results of a SQL statement


To fetch the result from a SQL statement that ran, use the `redshift-data get-statement-result` or `redshift-data get-statement-result-v2` Amazon CLI command. The results from `get-statement-result` are in JSON format. The results from `get-statement-result-v2` are in CSV format. You can provide an `Id` that you receive in response to `execute-statement` or `batch-execute-statement`. The `Id` value for a SQL statement run by `batch-execute-statement` can be retrieved in the result of `describe-statement` and is suffixed by a colon and sequence number such as `b2906c76-fa6e-4cdf-8c5f-4de1ff9b7652:2`. If you run multiple SQL statements with `batch-execute-statement`, each SQL statement has an `Id` value as shown in `describe-statement`. Authorization to run this command is based on the caller's IAM permissions. 

The following command returns the result of a SQL statement run by `execute-statement` with `ResultFormat` left at its default of `JSON`. To retrieve the results, call the `get-statement-result` operation.

```
aws redshift-data get-statement-result 
    --id d9b6c0c9-0747-4bf4-b142-e8883122f766
```

The following statement returns the result of the second SQL statement run by `batch-execute-statement`.

```
aws redshift-data get-statement-result 
    --id b2906c76-fa6e-4cdf-8c5f-4de1ff9b7652:2
```

The following is an example of the response to a call to `get-statement-result` where the SQL result is returned in JSON format in the `Records` key of the response.

```
{
    "ColumnMetadata": [
        {
            "isCaseSensitive": false,
            "isCurrency": false,
            "isSigned": true,
            "label": "userid",
            "length": 0,
            "name": "userid",
            "nullable": 0,
            "precision": 10,
            "scale": 0,
            "schemaName": "",
            "tableName": "stll_query",
            "typeName": "int4"
        },
        {
            "isCaseSensitive": false,
            "isCurrency": false,
            "isSigned": true,
            "label": "query",
            "length": 0,
            "name": "query",
            "nullable": 0,
            "precision": 10,
            "scale": 0,
            "schemaName": "",
            "tableName": "stll_query",
            "typeName": "int4"
        },
        {
            "isCaseSensitive": true,
            "isCurrency": false,
            "isSigned": false,
            "label": "label",
            "length": 0,
            "name": "label",
            "nullable": 0,
            "precision": 320,
            "scale": 0,
            "schemaName": "",
            "tableName": "stll_query",
            "typeName": "bpchar"
        },
        {
            "isCaseSensitive": false,
            "isCurrency": false,
            "isSigned": true,
            "label": "xid",
            "length": 0,
            "name": "xid",
            "nullable": 0,
            "precision": 19,
            "scale": 0,
            "schemaName": "",
            "tableName": "stll_query",
            "typeName": "int8"
        },
        {
            "isCaseSensitive": false,
            "isCurrency": false,
            "isSigned": true,
            "label": "pid",
            "length": 0,
            "name": "pid",
            "nullable": 0,
            "precision": 10,
            "scale": 0,
            "schemaName": "",
            "tableName": "stll_query",
            "typeName": "int4"
        },
        {
            "isCaseSensitive": true,
            "isCurrency": false,
            "isSigned": false,
            "label": "database",
            "length": 0,
            "name": "database",
            "nullable": 0,
            "precision": 32,
            "scale": 0,
            "schemaName": "",
            "tableName": "stll_query",
            "typeName": "bpchar"
        },
        {
            "isCaseSensitive": true,
            "isCurrency": false,
            "isSigned": false,
            "label": "querytxt",
            "length": 0,
            "name": "querytxt",
            "nullable": 0,
            "precision": 4000,
            "scale": 0,
            "schemaName": "",
            "tableName": "stll_query",
            "typeName": "bpchar"
        },
        {
            "isCaseSensitive": false,
            "isCurrency": false,
            "isSigned": false,
            "label": "starttime",
            "length": 0,
            "name": "starttime",
            "nullable": 0,
            "precision": 29,
            "scale": 6,
            "schemaName": "",
            "tableName": "stll_query",
            "typeName": "timestamp"
        },
        {
            "isCaseSensitive": false,
            "isCurrency": false,
            "isSigned": false,
            "label": "endtime",
            "length": 0,
            "name": "endtime",
            "nullable": 0,
            "precision": 29,
            "scale": 6,
            "schemaName": "",
            "tableName": "stll_query",
            "type": 93,
            "typeName": "timestamp"
        },
        {
            "isCaseSensitive": false,
            "isCurrency": false,
            "isSigned": true,
            "label": "aborted",
            "length": 0,
            "name": "aborted",
            "nullable": 0,
            "precision": 10,
            "scale": 0,
            "schemaName": "",
            "tableName": "stll_query",
            "typeName": "int4"
        },
        {
            "isCaseSensitive": false,
            "isCurrency": false,
            "isSigned": true,
            "label": "insert_pristine",
            "length": 0,
            "name": "insert_pristine",
            "nullable": 0,
            "precision": 10,
            "scale": 0,
            "schemaName": "",
            "tableName": "stll_query",
            "typeName": "int4"
        },
        {
            "isCaseSensitive": false,
            "isCurrency": false,
            "isSigned": true,
            "label": "concurrency_scaling_status",
            "length": 0,
            "name": "concurrency_scaling_status",
            "nullable": 0,
            "precision": 10,
            "scale": 0,
            "schemaName": "",
            "tableName": "stll_query",
            "typeName": "int4"
        }
    ],
    "Records": [
        [
            {
                "longValue": 1
            },
            {
                "longValue": 3
            },
            {
                "stringValue": "health"
            },
            {
                "longValue": 1023
            },
            {
                "longValue": 15279
            },
            {
                "stringValue": "dev"
            },
            {
                "stringValue": "select system_status from stv_gui_status;"
            },
            {
                "stringValue": "2020-08-21 17:33:51.88712"
            },
            {
                "stringValue": "2020-08-21 17:33:52.974306"
            },
            {
                "longValue": 0
            },
            {
                "longValue": 0
            },
            {
                "longValue": 6
            }
        ]
    ],
    "TotalNumRows": 1
}
```
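In the JSON format, each cell in `Records` is a one-key object whose key names the value type (`longValue`, `stringValue`, and so on). The following is a minimal sketch that pairs each cell with its column name from `ColumnMetadata` to produce plain dictionaries; the helper name and the trimmed sample result are illustrative.

```python
def rows_as_dicts(result: dict) -> list:
    """Pair each typed cell (e.g. {'longValue': 1}) with its column name."""
    names = [c["name"] for c in result["ColumnMetadata"]]
    rows = []
    for record in result["Records"]:
        # Unwrap the single value from each one-key cell dict.
        rows.append({name: next(iter(cell.values()))
                     for name, cell in zip(names, record)})
    return rows

# Trimmed sample in the shape of the response above:
result = {
    "ColumnMetadata": [{"name": "userid"}, {"name": "database"}],
    "Records": [[{"longValue": 1}, {"stringValue": "dev"}]],
}
print(rows_as_dicts(result))  # [{'userid': 1, 'database': 'dev'}]
```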

The following example shows a SQL statement run by `execute-statement` to return results as JSON. The table `testingtable` has three integer columns (col1, col2, col3) and there are three rows with values (1, 2, 3), (4, 5, 6), and (7, 8, 9).

```
aws redshift-data execute-statement 
    --database dev 
    --sql "SELECT col1, col2, col3 FROM testingtable" 
    --cluster-id mycluster-test 
    --result-format JSON
```

```
{
    "ClusterIdentifier": "mycluster-test",
    "CreatedAt": "2024-04-02T16:45:25.144000+00:00",
    "Database": "dev",
    "DbUser": "IAMR:Administrator",
    "Id": "d468d942-6df9-4f85-8ae3-bac01a61aec3"
}
```

The following call to `get-statement-result` returns the SQL result in JSON format in the `Records` key of the response.

```
aws redshift-data get-statement-result
    --id d468d942-6df9-4f85-8ae3-bac01a61aec3
```

```
{
    "Records": [
        [
            {
                "longValue": 1
            },
            {
                "longValue": 2
            },
            {
                "longValue": 3
            }
        ],
        [
            {
                "longValue": 4
            },
            {
                "longValue": 5
            },
            {
                "longValue": 6
            }
        ],
        [
            {
                "longValue": 7
            },
            {
                "longValue": 8
            },
            {
                "longValue": 9
            }
        ]
    ],
    "ColumnMetadata": [
        {
            "isCaseSensitive": false,
            "isCurrency": false,
            "isSigned": true,
            "label": "col1",
            "name": "col1",
            "nullable": 1,
            "precision": 10,
            "scale": 0,
            "schemaName": "public",
            "tableName": "testingtable",
            "typeName": "int4",
            "length": 0
        },
        {
            "isCaseSensitive": false,
            "isCurrency": false,
            "isSigned": true,
            "label": "col2",
            "name": "col2",
            "nullable": 1,
            "precision": 10,
            "scale": 0,
            "schemaName": "public",
            "tableName": "testingtable",
            "typeName": "int4",
            "length": 0
        },
        {
            "isCaseSensitive": false,
            "isCurrency": false,
            "isSigned": true,
            "label": "col3",
            "name": "col3",
            "nullable": 1,
            "precision": 10,
            "scale": 0,
            "schemaName": "public",
            "tableName": "testingtable",
            "typeName": "int4",
            "length": 0
        }
    ],
    "TotalNumRows": 3
}
```

The following example shows a SQL statement run by `execute-statement` to return results as a CSV. The table `testingtable` has three integer columns (col1, col2, col3) and there are three rows with values (1, 2, 3), (4, 5, 6), and (7, 8, 9).

```
aws redshift-data execute-statement 
    --database dev 
    --sql "SELECT col1, col2, col3 FROM testingtable" 
    --cluster-id mycluster-test 
    --result-format CSV
```

```
{
    "ClusterIdentifier": "mycluster-test",
    "CreatedAt": "2024-04-02T16:45:25.144000+00:00",
    "Database": "dev",
    "DbUser": "IAMR:Administrator",
    "Id": "d468d942-6df9-4f85-8ae3-bac01a61aec3"
}
```

The following call to `get-statement-result-v2` returns the SQL result in CSV format in the `Records` key of the response. Rows are separated by a carriage return and newline (`\r\n`). The first row returned in `Records` contains the column headers. CSV results are returned in chunks, where each chunk can hold any number of rows up to 1 MB in total.

```
aws redshift-data get-statement-result-v2
    --id d468d942-6df9-4f85-8ae3-bac01a61aec3
```

```
{
    "Records": [
        {
            "CSVRecords": "col1,col2,col3\r\n1,2,3\r\n4,5,6\r\n7,8,9\r\n"
        }
    ],
    "ColumnMetadata": [
        {
            "isCaseSensitive": false,
            "isCurrency": false,
            "isSigned": true,
            "label": "col1",
            "name": "col1",
            "nullable": 1,
            "precision": 10,
            "scale": 0,
            "schemaName": "public",
            "tableName": "testingtable",
            "typeName": "int4",
            "length": 0
        },
        {
            "isCaseSensitive": false,
            "isCurrency": false,
            "isSigned": true,
            "label": "col2",
            "name": "col2",
            "nullable": 1,
            "precision": 10,
            "scale": 0,
            "schemaName": "public",
            "tableName": "testingtable",
            "typeName": "int4",
            "length": 0
        },
        {
            "isCaseSensitive": false,
            "isCurrency": false,
            "isSigned": true,
            "label": "col3",
            "name": "col3",
            "nullable": 1,
            "precision": 10,
            "scale": 0,
            "schemaName": "public",
            "tableName": "testingtable",
            "typeName": "int4",
            "length": 0
        }
    ],
    "TotalNumRows": 3,
    "ResultFormat": "csv"
}
```
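To consume the CSV format, you can concatenate the `CSVRecords` chunks and feed them to a standard CSV parser, treating the first row as the header. A minimal sketch (the helper name is illustrative):

```python
import csv
import io

def parse_csv_records(result: dict) -> list:
    """Join the CSVRecords chunks and parse them; the first row
    holds the column headers."""
    text = "".join(r["CSVRecords"] for r in result["Records"])
    return list(csv.DictReader(io.StringIO(text)))

result = {"Records": [{"CSVRecords": "col1,col2,col3\r\n1,2,3\r\n4,5,6\r\n7,8,9\r\n"}]}
for row in parse_csv_records(result):
    print(row)
```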

# Describe a table


To get metadata that describes a table, use the `aws redshift-data describe-table` Amazon CLI command.

The following Amazon CLI command runs a SQL statement against a cluster and returns metadata that describes a table. This example uses the Amazon Secrets Manager authentication method.

```
aws redshift-data describe-table  
    --cluster-identifier mycluster-test 
    --database dev 
    --schema information_schema 
    --table sql_features 
    --secret-arn arn:aws:secretsmanager:us-west-2:123456789012:secret:myuser-secret-hKgPWn
```

The following is an example of the response.

```
{
    "ColumnList": [
        {
            "isCaseSensitive": false,
            "isCurrency": false,
            "isSigned": false,
            "length": 2147483647,
            "name": "feature_id",
            "nullable": 1,
            "precision": 2147483647,
            "scale": 0,
            "schemaName": "information_schema",
            "tableName": "sql_features",
            "typeName": "character_data"
        },
        {
            "isCaseSensitive": false,
            "isCurrency": false,
            "isSigned": false,
            "length": 2147483647,
            "name": "feature_name",
            "nullable": 1,
            "precision": 2147483647,
            "scale": 0,
            "schemaName": "information_schema",
            "tableName": "sql_features",
            "typeName": "character_data"
        }     
    ]
}
```

The following Amazon CLI command runs a SQL statement against a cluster that describes a table. This example uses the temporary credentials authentication method.

```
aws redshift-data describe-table 
    --db-user myuser 
    --cluster-identifier mycluster-test 
    --database dev 
    --schema information_schema 
    --table sql_features
```

The following is an example of the response.

```
{
    "ColumnList": [
        {
            "isCaseSensitive": false,
            "isCurrency": false,
            "isSigned": false,
            "length": 2147483647,
            "name": "feature_id",
            "nullable": 1,
            "precision": 2147483647,
            "scale": 0,
            "schemaName": "information_schema",
            "tableName": "sql_features",
            "typeName": "character_data"
        },
        {
            "isCaseSensitive": false,
            "isCurrency": false,
            "isSigned": false,
            "length": 2147483647,
            "name": "feature_name",
            "nullable": 1,
            "precision": 2147483647,
            "scale": 0,
            "schemaName": "information_schema",
            "tableName": "sql_features",
            "typeName": "character_data"
        },
        {
            "isCaseSensitive": false,
            "isCurrency": false,
            "isSigned": false,
            "length": 2147483647,
            "name": "sub_feature_id",
            "nullable": 1,
            "precision": 2147483647,
            "scale": 0,
            "schemaName": "information_schema",
            "tableName": "sql_features",
            "typeName": "character_data"
        },
        {
            "isCaseSensitive": false,
            "isCurrency": false,
            "isSigned": false,
            "length": 2147483647,
            "name": "sub_feature_name",
            "nullable": 1,
            "precision": 2147483647,
            "scale": 0,
            "schemaName": "information_schema",
            "tableName": "sql_features",
            "typeName": "character_data"
        },
        {
            "isCaseSensitive": false,
            "isCurrency": false,
            "isSigned": false,
            "length": 2147483647,
            "name": "is_supported",
            "nullable": 1,
            "precision": 2147483647,
            "scale": 0,
            "schemaName": "information_schema",
            "tableName": "sql_features",
            "typeName": "character_data"
        },
        {
            "isCaseSensitive": false,
            "isCurrency": false,
            "isSigned": false,
            "length": 2147483647,
            "name": "is_verified_by",
            "nullable": 1,
            "precision": 2147483647,
            "scale": 0,
            "schemaName": "information_schema",
            "tableName": "sql_features",
            "typeName": "character_data"
        },
        {
            "isCaseSensitive": false,
            "isCurrency": false,
            "isSigned": false,
            "length": 2147483647,
            "name": "comments",
            "nullable": 1,
            "precision": 2147483647,
            "scale": 0,
            "schemaName": "information_schema",
            "tableName": "sql_features",
            "typeName": "character_data"
        }
    ]
}
```

# List databases in a cluster


To list the databases in a cluster, use the `aws redshift-data list-databases` Amazon CLI command.

The following Amazon CLI command runs a SQL statement against a cluster to list databases. This example uses the Amazon Secrets Manager authentication method.

```
aws redshift-data list-databases 
    --secret-arn arn:aws:secretsmanager:us-west-2:123456789012:secret:myuser-secret-hKgPWn 
    --cluster-identifier mycluster-test 
    --database dev
```

The following is an example of the response.

```
{
    "Databases": [
        "dev"
    ]
}
```

The following Amazon CLI command runs a SQL statement against a cluster to list databases. This example uses the temporary credentials authentication method.

```
aws redshift-data list-databases  
    --db-user myuser 
    --cluster-identifier mycluster-test 
    --database dev
```

The following is an example of the response.

```
{
    "Databases": [
        "dev"
    ]
}
```

# List schemas in a database


To list the schemas in a database, use the `aws redshift-data list-schemas` Amazon CLI command.

The following Amazon CLI command runs a SQL statement against a cluster to list schemas in a database. This example uses the Amazon Secrets Manager authentication method.

```
aws redshift-data list-schemas 
    --secret-arn arn:aws:secretsmanager:us-west-2:123456789012:secret:myuser-secret-hKgPWn 
    --cluster-identifier mycluster-test 
    --database dev
```

The following is an example of the response.

```
{
    "Schemas": [
        "information_schema",
        "pg_catalog",
        "pg_internal",
        "public"
    ]
}
```

The following Amazon CLI command runs a SQL statement against a cluster to list schemas in a database. This example uses the temporary credentials authentication method.

```
aws redshift-data list-schemas 
    --db-user myuser 
    --cluster-identifier mycluster-test 
    --database dev
```

The following is an example of the response.

```
{
    "Schemas": [
        "information_schema",
        "pg_catalog",
        "pg_internal",
        "public"
    ]
}
```

# List tables in a database


To list the tables in a database, use the `aws redshift-data list-tables` Amazon CLI command.

The following Amazon CLI command runs a SQL statement against a cluster to list tables in a database. This example uses the Amazon Secrets Manager authentication method.

```
aws redshift-data list-tables 
    --secret-arn arn:aws:secretsmanager:us-west-2:123456789012:secret:myuser-secret-hKgPWn 
    --cluster-identifier mycluster-test 
    --database dev 
    --schema information_schema
```

The following is an example of the response.

```
{
    "Tables": [
        {
            "name": "sql_features",
            "schema": "information_schema",
            "type": "SYSTEM TABLE"
        },
        {
            "name": "sql_implementation_info",
            "schema": "information_schema",
            "type": "SYSTEM TABLE"
        }
    ]
}
```

The following Amazon CLI command runs a SQL statement against a cluster to list tables in a database. This example uses the temporary credentials authentication method.

```
aws redshift-data list-tables 
    --db-user myuser 
    --cluster-identifier mycluster-test 
    --database dev 
    --schema information_schema
```

The following is an example of the response.

```
{
    "Tables": [
        {
            "name": "sql_features",
            "schema": "information_schema",
            "type": "SYSTEM TABLE"
        },
        {
            "name": "sql_implementation_info",
            "schema": "information_schema",
            "type": "SYSTEM TABLE"
        }
    ]
}
```

# Troubleshooting issues for Amazon Redshift Data API

Use the following sections, titled with common error messages, to help troubleshoot problems that you have with the Data API. 

**Topics**
+ [Packet for query is too large](#data-api-troubleshooting-packet-too-large)
+ [Database response exceeded size limit](#data-api-troubleshooting-response-size-too-large)

## Packet for query is too large


If you see an error indicating that the packet for a query is too large, it generally means that a row returned in the result set is too large. The Data API limit is 64 KB per row in the result set returned by the database.

To solve this issue, make sure that each row in a result set is 64 KB or less.

## Database response exceeded size limit


If you see an error indicating that the database response has exceeded the size limit, it generally means that the result set returned by the database was too large. The Data API limit is 500 MB for the result set returned by the database.

To solve this issue, make sure that calls to the Data API return 500 MB of data or less. If you need to return more than 500 MB, run the statement multiple times, using a `LIMIT` clause in your query to page through the results.
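The paging approach above can be sketched as a generator that emits one `LIMIT`/`OFFSET` query per page. This is an illustrative helper, not part of the Data API; it assumes the base query has a stable `ORDER BY` so pages do not overlap.

```python
def paged_queries(base_sql: str, total_rows: int, page_size: int):
    """Split one large SELECT into LIMIT/OFFSET pages so that each
    Data API call returns a result set under the 500 MB limit.
    base_sql should include a stable ORDER BY."""
    for offset in range(0, total_rows, page_size):
        yield "{} LIMIT {} OFFSET {}".format(base_sql, page_size, offset)

for sql in paged_queries("SELECT * FROM sales ORDER BY saletime", 250, 100):
    print(sql)
```

Each generated statement can then be submitted with `execute-statement` and fetched separately.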

# Scheduling Amazon Redshift Data API operations with Amazon EventBridge

You can create rules that match selected events and route them to targets to take action. You can also use rules to take action on a predetermined schedule. For more information, see the [Amazon EventBridge User Guide](https://docs.amazonaws.cn/eventbridge/latest/userguide/). 

To schedule Data API operations with EventBridge, the associated IAM role must trust the principal for CloudWatch Events (events.amazonaws.com). This role should have the equivalent of the managed policy `AmazonEventBridgeFullAccess` attached. It should also have `AmazonRedshiftDataFullAccess` policy permissions that are managed by the Data API. You can create an IAM role with these permissions on the IAM console. When creating a role on the IAM console, choose the Amazon service trusted entity for CloudWatch Events. Specify the IAM role in the `RoleArn` JSON value in the EventBridge target. For more information about creating an IAM role, see [Creating a Role for an Amazon Service (Console)](https://docs.amazonaws.cn/IAM/latest/UserGuide/id_roles_create_for-service.html#roles-creatingrole-service-console) in the *IAM User Guide*.

The `name` of the rule that you create in Amazon EventBridge must match the `StatementName` in the `RedshiftDataParameters`.

The following examples show variations of creating EventBridge rules with a single or multiple SQL statements and with an Amazon Redshift cluster or an Amazon Redshift Serverless workgroup as the data warehouse.

## Calling with a single SQL statement and cluster


The following example uses the Amazon CLI to create an EventBridge rule that is used to run a SQL statement against an Amazon Redshift cluster.

```
aws events put-rule 
--name test-redshift-cluster-data 
--schedule-expression "rate(1 minute)"
```

Then an EventBridge target is created to run on the schedule specified in the rule. 

```
aws events put-targets 
--cli-input-json file://data.json
```

The input data.json file is as follows. The `Sql` JSON key indicates there is a single SQL statement. The `Arn` JSON value contains a cluster identifier. The `RoleArn` JSON value contains the IAM role used to run the SQL as described previously. 

```
{
    "Rule": "test-redshift-cluster-data",
    "EventBusName": "default",
    "Targets": [
        {
            "Id": "2",
            "Arn": "arn:aws:redshift:us-east-1:123456789012:cluster:mycluster",
            "RoleArn": "arn:aws:iam::123456789012:role/Administrator",
            "RedshiftDataParameters": {
                "Database": "dev",
                "DbUser": "root",
                "Sql": "select 1;",
                "StatementName": "test-redshift-cluster-data",
                "WithEvent": true
            }
        }
    ]
}
```

## Calling with a single SQL statement and workgroup


The following example uses the Amazon CLI to create an EventBridge rule that is used to run a SQL statement against an Amazon Redshift Serverless workgroup.

```
aws events put-rule 
--name test-redshift-serverless-workgroup-data 
--schedule-expression "rate(1 minute)"
```

Then an EventBridge target is created to run on the schedule specified in the rule. 

```
aws events put-targets 
--cli-input-json file://data.json
```

The input data.json file is as follows. The `Sql` JSON key indicates there is a single SQL statement. The `Arn` JSON value contains a workgroup name. The `RoleArn` JSON value contains the IAM role used to run the SQL as described previously. 

```
{
    "Rule": "test-redshift-serverless-workgroup-data", 
    "EventBusName": "default", 
    "Targets": [ 
        {
            "Id": "2",
            "Arn": "arn:aws:redshift-serverless:us-east-1:123456789012:workgroup/a1b2c3d4-5678-90ab-cdef-EXAMPLE11111",
            "RoleArn": "arn:aws:iam::123456789012:role/Administrator", 
            "RedshiftDataParameters": {
                "Database": "dev",
                "Sql": "select 1;",
                "StatementName": "test-redshift-serverless-workgroup-data", 
                "WithEvent": true 
            } 
        } 
    ] 
}
```

## Calling with multiple SQL statements and cluster


The following example uses the Amazon CLI to create an EventBridge rule that is used to run multiple SQL statements against an Amazon Redshift cluster.

```
aws events put-rule 
--name test-redshift-cluster-data 
--schedule-expression "rate(1 minute)"
```

Then an EventBridge target is created to run on the schedule specified in the rule. 

```
aws events put-targets 
--cli-input-json file://data.json
```

The input data.json file is as follows. The `Sqls` JSON key indicates there are multiple SQL statements. The `Arn` JSON value contains a cluster identifier. The `RoleArn` JSON value contains the IAM role used to run the SQL as described previously. 

```
{
    "Rule": "test-redshift-cluster-data", 
    "EventBusName": "default", 
    "Targets": [ 
        {
            "Id": "2",
            "Arn": "arn:aws:redshift:us-east-1:123456789012:cluster:mycluster",
            "RoleArn": "arn:aws:iam::123456789012:role/Administrator", 
            "RedshiftDataParameters": {
                "Database": "dev",
                "Sqls": ["select 1;", "select 2;", "select 3;"],
                "StatementName": "test-redshift-cluster-data", 
                "WithEvent": true 
            } 
        } 
    ] 
}
```

## Calling with multiple SQL statements and workgroup


The following example uses the Amazon CLI to create an EventBridge rule that is used to run multiple SQL statements against an Amazon Redshift Serverless workgroup.

```
aws events put-rule 
--name test-redshift-serverless-workgroup-data 
--schedule-expression "rate(1 minute)"
```

Then an EventBridge target is created to run on the schedule specified in the rule. 

```
aws events put-targets 
--cli-input-json file://data.json
```

The input data.json file is as follows. The `Sqls` JSON key indicates there are multiple SQL statements. The `Arn` JSON value contains a workgroup name. The `RoleArn` JSON value contains the IAM role used to run the SQL as described previously. 

```
{
    "Rule": "test-redshift-serverless-workgroup-data", 
    "EventBusName": "default", 
    "Targets": [ 
        {
            "Id": "2",
            "Arn": "arn:aws:redshift-serverless:us-east-1:123456789012:workgroup/a1b2c3d4-5678-90ab-cdef-EXAMPLE11111",
            "RoleArn": "arn:aws:iam::123456789012:role/Administrator", 
            "RedshiftDataParameters": {
                "Database": "dev",
                "Sqls": ["select 1;", "select 2;", "select 3;"],
                "StatementName": "test-redshift-serverless-workgroup-data", 
                "WithEvent": true 
            } 
        } 
    ] 
}
```

# Monitoring the Data API


Monitoring is an important part of maintaining the reliability, availability, and performance of the Data API and your other Amazon solutions. Amazon provides the following monitoring tools to watch the Data API, report when something is wrong, and take automatic actions when appropriate: 
+ Amazon EventBridge can be used to automate your Amazon services and respond automatically to system events, such as application availability issues or resource changes. Events from Amazon services are delivered to EventBridge in near-real time. You can write simple rules to indicate which events are of interest to you and which automated actions to take when an event matches a rule. For more information, see the [Amazon EventBridge User Guide](https://docs.amazonaws.cn/eventbridge/latest/userguide/). 
+ Amazon CloudTrail captures API calls and related events made by or on behalf of your Amazon account and delivers the log files to an Amazon S3 bucket that you specify. You can identify which users and accounts called Amazon, the source IP address from which the calls were made, and when the calls occurred. To learn more about how Amazon Redshift is integrated with Amazon CloudTrail, see [Logging with CloudTrail](https://docs.amazonaws.cn/redshift/latest/mgmt/logging-with-cloudtrail.html). For more information about CloudTrail, see the [Amazon CloudTrail User Guide](https://docs.amazonaws.cn/awscloudtrail/latest/userguide/). 

**Topics**
+ [Monitoring events for the Amazon Redshift Data API in Amazon EventBridge](data-api-monitoring-events.md)

# Monitoring events for the Amazon Redshift Data API in Amazon EventBridge
Monitoring Data API events in Amazon EventBridge

You can monitor Data API events in EventBridge, which delivers a stream of real-time data from your own applications, software-as-a-service (SaaS) applications, and Amazon services. EventBridge routes that data to targets such as Amazon Lambda and Amazon SNS. These events are the same as those that appear in CloudWatch Events, which delivers a near-real-time stream of system events that describe changes in Amazon resources. Events are sent to the account that contains the Amazon Redshift database. For example, if you assume a role in another account, events are sent to that account. For more information, see [Amazon EventBridge events](https://docs.amazonaws.cn/eventbridge/latest/userguide/eb-events.html) in the *Amazon EventBridge User Guide*. 

Data API events are sent when the `ExecuteStatement` or `BatchExecuteStatement` API operation sets the `WithEvent` option to `true`. The `state` field of the event contains one of the following values: 
+ ABORTED – The query run was stopped by the user. 
+ FAILED – The query run failed. 
+ FINISHED – The query has finished running. 

Events are delivered on a guaranteed basis. For more information, see [Events from Amazon services](https://docs.amazonaws.cn/eventbridge/latest/userguide/eb-service-event.html) in the *Amazon EventBridge User Guide*. 
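As a sketch of how `WithEvent` is set in application code (assuming the Amazon SDK for Python, boto3, and a hypothetical cluster, database, and user), the following helper builds the `ExecuteStatement` parameters without calling the service:

```python
def build_execute_statement_params(cluster_id, database, db_user, sql, statement_name):
    """Build keyword arguments for the redshift-data ExecuteStatement
    operation with WithEvent enabled, so EventBridge receives a
    status-change event when the statement finishes."""
    return {
        "ClusterIdentifier": cluster_id,  # hypothetical cluster name
        "Database": database,
        "DbUser": db_user,                # hypothetical database user
        "Sql": sql,
        "StatementName": statement_name,
        "WithEvent": True,
    }

params = build_execute_statement_params(
    "redshift-cluster-1", "dev", "myuser",
    "SELECT * FROM test.testtable;", "test.testtable",
)
# Pass to the SDK: boto3.client("redshift-data").execute_statement(**params)
```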

## Example for Data API finished event


The following example shows an event for the Data API when the `ExecuteStatement` API operation finished. In this example, a statement named `test.testtable` finished running.

```
{
    "version": "0",
    "id": "18e7079c-dd4b-dd64-caf9-e2a31640dab0",
    "detail-type": "Redshift Data Statement Status Change",
    "source": "aws.redshift-data",
    "account": "123456789012",
    "time": "2020-10-01T21:14:26Z",
    "region": "us-east-1",
    "resources": [
        "arn:aws:redshift:us-east-1:123456789012:cluster:redshift-cluster-1"
    ],
    "detail": {
        "principal": "arn:aws:iam::123456789012:user/myuser",
        "statementName": "test.testtable",
        "statementId": "dd2e1ec9-2ee3-49a0-819f-905fa7d75a4a",
        "redshiftQueryId": -1,
        "state": "FINISHED",
        "rows": 1,
        "expireAt": 1601673265
    }
}
```
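A consumer of these events, such as a Lambda function target, can branch on the `state` field of the `detail` object. The following sketch is illustrative only, using a trimmed version of the sample event above:

```python
def handle_data_api_event(event):
    """Dispatch on the state of a 'Redshift Data Statement Status Change' event."""
    detail = event["detail"]
    state = detail["state"]
    if state == "FINISHED":
        return f"statement {detail['statementId']} finished, {detail['rows']} row(s)"
    if state == "FAILED":
        return f"statement {detail['statementId']} failed"
    if state == "ABORTED":
        return f"statement {detail['statementId']} was stopped by the user"
    return f"unexpected state: {state}"

# Trimmed copy of the sample event shown above.
sample = {
    "detail-type": "Redshift Data Statement Status Change",
    "detail": {
        "statementId": "dd2e1ec9-2ee3-49a0-819f-905fa7d75a4a",
        "state": "FINISHED",
        "rows": 1,
    },
}
print(handle_data_api_event(sample))
```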

# Using Amazon KMS with the Amazon Redshift Data API
Using Amazon KMS with the Data API

When you encrypt your Amazon Redshift cluster or Redshift Serverless workgroup with a customer managed key, the Amazon Redshift Data API uses that same customer managed key to store and encrypt your queries and results.

The Data API encrypts your data by default to protect sensitive information, such as query text and query results. It uses Amazon KMS encryption keys owned by Amazon for this protection.

Default encryption for data at rest reduces operational overhead and complexity when you protect sensitive data. This approach helps you build secure applications that meet strict encryption compliance and regulatory requirements.

## Using grants in Amazon KMS


The Data API requires a grant to use your customer managed key.

When you call `ExecuteStatement` or `BatchExecuteStatement` against a cluster encrypted with a customer managed key, Amazon Redshift creates a grant on your behalf by sending a [CreateGrant](https://docs.amazonaws.cn/kms/latest/APIReference/API_CreateGrant.html) request to Amazon KMS. Amazon KMS uses grants to give the Data API access to a KMS key in your account.

The Data API requires the grant to use your customer managed key for the following operations:
+ Send [Encrypt](https://docs.amazonaws.cn/kms/latest/APIReference/API_Encrypt.html) requests to Amazon KMS to encrypt query metadata with your customer managed key.
+ Send [GenerateDataKey](https://docs.amazonaws.cn/kms/latest/APIReference/API_GenerateDataKey.html) requests to Amazon KMS to generate data keys encrypted by your customer managed key.
+ Send [Decrypt](https://docs.amazonaws.cn/kms/latest/APIReference/API_Decrypt.html) requests to Amazon KMS to decrypt the encrypted data keys so that they can be used to encrypt your data.

You can revoke access to the grant or remove Amazon Redshift access to your customer managed key at any time. If you do, the Data API can no longer access data encrypted by your customer managed key, which affects operations that depend on that data. For example, if you try to retrieve query results or track query status after revoking the grant, the Data API returns an `AccessDeniedException`.
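In application code, that `AccessDeniedException` surfaces as an ordinary Amazon SDK service error (in boto3, the error code inside a `ClientError`). The following is a minimal sketch of detecting it from the error response body; the function name and recovery behavior are illustrative, not an official procedure:

```python
def is_kms_access_denied(error_response):
    """Return True when a Data API error response indicates the customer
    managed key (or its grant) is no longer accessible."""
    return error_response.get("Error", {}).get("Code") == "AccessDeniedException"

# Shape of the error body the SDK surfaces (e.g. via ClientError.response in boto3)
# after the grant on the customer managed key has been revoked.
revoked = {"Error": {"Code": "AccessDeniedException",
                     "Message": "Access to the KMS key was denied"}}
print(is_kms_access_denied(revoked))  # True
```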

## Key policies for your customer managed key


Key policies control access to your customer managed key. Every customer managed key must have exactly one key policy, which contains statements that determine who can use the key and how they can use it. When you create your customer managed key, you can specify a key policy. For more information, see [Customer managed keys](https://docs.amazonaws.cn/kms/latest/developerguide/concepts.html#customer-mgn-key) in the *Amazon Key Management Service Developer Guide*.

To use your customer managed keys with the Data API, you must first allow access to Amazon Redshift. The following API operations must be permitted in the key policy:
+ `kms:CreateGrant` – Adds a grant to a customer managed key. Grants control access to a specified Amazon KMS key, which allows access to grant operations that Amazon Redshift requires. For more information, see [Using grants in Amazon KMS](https://docs.amazonaws.cn/kms/latest/developerguide/grants.html#terms-grant-operations).

The following is an example key policy:

```
"Statement":[
   {
      "Sid":"Allow access to principals authorized to use Amazon Redshift",
      "Effect":"Allow",
      "Principal":{
         "AWS":"*"
      },
      "Action":[
         "kms:DescribeKey",
         "kms:CreateGrant"
      ],
      "Resource":"*",
      "Condition":{
         "StringEquals":{
            "kms:ViaService":"redshift.amazonaws.com",
            "kms:CallerAccount":"111122223333"
         }
      }
   },
   {
      "Sid":"AllowKeyAdministratorsAccess",
      "Effect":"Allow",
      "Principal":{
         "AWS":"arn:aws:iam::111122223333:role/ExampleAdminRole"
      },
      "Action":"kms:*",
      "Resource":"*"
   },
   {
      "Sid":"AllowKeyUseForExampleRole",
      "Effect":"Allow",
      "Principal":{
         "AWS":"arn:aws:iam::111122223333:role/ExampleUserRole"
      },
      "Action":[
         "kms:Encrypt",
         "kms:Decrypt",
         "kms:ReEncrypt*",
         "kms:GenerateDataKey*",
         "kms:DescribeKey"
      ],
      "Resource":"*"
   }
]
```

## Data API encryption context


An encryption context is an optional set of key-value pairs that contains additional contextual information about the data. Amazon KMS uses the encryption context as additional authenticated data to support authenticated encryption. When you include an encryption context in a request to encrypt data, Amazon KMS binds the encryption context to the encrypted data. To decrypt the data, you must include the same encryption context in the request. 

The Data API uses the same three encryption context key-value pairs in all Amazon KMS cryptographic operations for provisioned clusters:
+ `aws:redshift:arn` – The cluster's Amazon Resource Name (ARN)
+ `aws:redshift:createtime` – The timestamp when you requested cluster creation
+ `serviceName` – `RedshiftDataAPI`

```
"EncryptionContextSubset": {
    "aws:redshift:arn": "arn:aws:redshift:us-east-1:123456789012:cluster:redshift-cluster",
    "aws:redshift:createtime": "20250815T0000Z",
    "serviceName": "RedshiftDataAPI"
}
```

The Data API uses two encryption context key-value pairs in all Amazon KMS cryptographic operations for serverless workgroups:
+ `aws:redshift-serverless:arn` – The namespace's Amazon Resource Name (ARN)
+ `serviceName` – `RedshiftDataAPI`

```
"EncryptionContextSubset": {
    "aws:redshift-serverless:arn": "arn:aws:redshift-serverless:us-east-1:123456789012:namespace:12345678-1234-1234-1234-123456789012",
    "serviceName": "RedshiftDataAPI"
}
```
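In Amazon KMS, an `EncryptionContextSubset` grant constraint means every listed key-value pair must appear in the encryption context of the cryptographic request. The check itself is a simple dictionary-subset test; the following sketch illustrates the semantics only (it is not KMS internals):

```python
def matches_encryption_context_subset(constraint, request_context):
    """True when every key-value pair in the grant constraint appears in the
    encryption context of the cryptographic request."""
    return all(request_context.get(k) == v for k, v in constraint.items())

# Serverless-workgroup constraint pairs, as shown above.
constraint = {
    "aws:redshift-serverless:arn": "arn:aws:redshift-serverless:us-east-1:123456789012:namespace:12345678-1234-1234-1234-123456789012",
    "serviceName": "RedshiftDataAPI",
}
request_context = dict(constraint)  # the Data API supplies the same pairs
print(matches_encryption_context_subset(constraint, request_context))  # True
```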

For more information about encryption, see [Introduction to the cryptographic details of Amazon KMS](https://docs.amazonaws.cn/kms/latest/cryptographic-details/intro.html). For more information about the Amazon Redshift and Amazon KMS integration, see [How Amazon Redshift uses Amazon KMS](https://docs.amazonaws.cn/kms/latest/developerguide/services-redshift.html). 