Use the Athena console to connect to a data source - Amazon Athena

Use the Athena console to connect to a data source

You can use the Athena console to create and configure a data source connection.

To create a connection to a data source
  1. Open the Athena console at https://console.amazonaws.cn/athena/.

  2. If the console navigation pane is not visible, choose the expansion menu on the left.

  3. In the navigation pane, choose Data sources and catalogs.

  4. On the Data sources and catalogs page, choose Create data source.

  5. For Choose a data source, choose the data source that you want Athena to query, considering the following guidelines:

    • Choose a connection option that corresponds to your data source. Athena has prebuilt data source connectors that you can configure for sources including MySQL, Amazon DocumentDB, and PostgreSQL.

    • Choose S3 - Amazon Glue Data Catalog if you want to query data in Amazon S3 and you are not using an Apache Hive metastore or one of the other federated query data source options on this page. Athena uses the Amazon Glue Data Catalog to store metadata and schema information for data sources in Amazon S3. This is the default (non-federated) option. For more information, see Use Amazon Glue Data Catalog to connect to your data. For steps using this workflow, see Register and use data catalogs in Athena.

    • Choose S3 - Apache Hive metastore to query data sets in Amazon S3 that use an Apache Hive metastore. For more information about this option, see Connect Athena to an Apache Hive metastore.

    • Choose Custom or shared connector if you want to create your own data source connector for use with Athena. For information about writing a data source connector, see Develop a data source connector using the Athena Query Federation SDK.

  6. Choose Next.

  7. On the Enter data source details page, for Data source name, use the autogenerated name, or enter a unique name that you want to use in your SQL statements when you query the data source from Athena. The name can be up to 127 characters and must be unique within your account. It cannot be changed after you create it. Valid characters are a-z, A-Z, 0-9, _ (underscore), @ (at sign), and - (hyphen). The names awsdatacatalog, hive, jmx, and system are reserved by Athena and cannot be used for data source names.

  8. If the data source that you chose integrates with Amazon Glue connections, do the following:

    1. For Amazon Glue connection details, enter the information required. A connection contains the properties that are required to connect to a particular data source. The properties required vary depending on the connection type. For more information on properties related to your connector, see Available data source connectors. For information about additional connection properties, see Amazon Glue connection properties in the Amazon Glue User Guide.

      Note
      • When you update the Glue connection properties, the Lambda connector must be restarted to pick up the updated properties. To do this, edit the environment properties and save them without changing anything.

      • When you update a Glue connection, the following properties are not updated automatically in the corresponding Lambda function. You must update them manually in your Lambda function.

        • Lambda VPC configuration – security_group_ids, subnet_ids

        • Lambda execution role – spill_bucket, secret_name, spill_kms_key_id

    2. For Lambda execution IAM role, choose one of the following:

      • Create and use a new execution role – (Default) Athena creates an execution role that it will then use to access resources in Amazon Lambda on your behalf. Athena requires this role to create your federated data source.

      • Use an existing execution role – Use this option to choose an existing execution role. For this option, choose the execution role that you want to use from the Execution role drop-down list.

  9. If the data source that you chose does not integrate with Amazon Glue connections, do the following:

    1. For Lambda function, choose Create Lambda function. The function page for the connector that you chose opens in the Amazon Lambda console. The page includes detailed information about the connector.

    2. Under Application settings, read the description for each application setting carefully, and then enter values that correspond to your requirements.

      The application settings that you see vary depending on the connector for your data source. The minimum required settings include:

      • AthenaCatalogName – A name, in lower case, for the Lambda function that indicates the data source that it targets, such as cloudwatchlogs.

      • SpillBucket – An Amazon S3 bucket in your account to store data that exceeds Lambda function response size limits.

        Note

        Spilled data is not reused in subsequent executions and can be safely deleted. Athena does not delete this data for you. To manage these objects, consider adding an object lifecycle policy that deletes old data from your Amazon S3 spill bucket. For more information, see Managing your storage lifecycle in the Amazon S3 User Guide.

    3. Select I acknowledge that this app creates custom IAM roles and resource policies. For more information, choose the Info link.

    4. Choose Deploy. When the deployment is complete, the Lambda function appears in the Resources section in the Lambda console.

      After you deploy the data source connector to your account, you can connect Athena to it.

    5. Return to the Enter data source details page of the Athena console.

    6. In the Connection details section, choose the refresh icon next to the Select or enter a Lambda function search box.

    7. Choose the name of the function that you just created in the Lambda console. The ARN of the Lambda function displays.

  10. (Optional) For Tags, add key-value pairs to associate with this data source. For more information about tags, see Tag Athena resources.

  11. Choose Next.

  12. On the Review and create page, review the data source details. To make changes, choose Edit.

  13. Read the information in Athena will create resources in your account. If you agree, select I acknowledge that Athena will create resources on my behalf.

  14. Choose Create data source. Athena creates the following resources for you:

    • Lambda execution IAM role

    • Amazon Glue connection (only if the data source is compatible with Amazon Glue Connections)

    • Lambda function
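
The naming rules in step 7 can be checked locally before you enter a name in the console. The following Python sketch encodes only the rules stated in that step (length, valid characters, reserved names). The function name `is_valid_data_source_name` is hypothetical, and treating reserved names as case-insensitive is an assumption. Uniqueness within your account cannot be checked this way.

```python
import re

# Rules from step 7: up to 127 characters; a-z, A-Z, 0-9, _, @, and -.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9_@-]{1,127}")

# Names that step 7 lists as reserved by Athena.
_RESERVED_NAMES = {"awsdatacatalog", "hive", "jmx", "system"}


def is_valid_data_source_name(name: str) -> bool:
    """Return True if `name` satisfies the naming rules listed in step 7."""
    if _NAME_PATTERN.fullmatch(name) is None:
        return False
    # Assumption: reserved names are compared case-insensitively.
    return name.lower() not in _RESERVED_NAMES
```

For example, `is_valid_data_source_name("my_mysql-source")` returns `True`, while `is_valid_data_source_name("hive")` and `is_valid_data_source_name("bad name")` return `False`.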

The Data source details section of the page for your data source shows information about your new connector. You can now use the connector in your Athena queries.
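
As a rough illustration of querying through the new data source, the following Python sketch builds a query that addresses a table by data source name, database, and table, and optionally submits it with the boto3 Athena client. The helper names, the sample database and table, and the output location are hypothetical. The submit step assumes that boto3 is installed and that credentials and a Region are already configured.

```python
def quote_identifier(identifier: str) -> str:
    """Wrap an identifier in double quotes, doubling any embedded quotes."""
    return '"' + identifier.replace('"', '""') + '"'


def build_federated_query(catalog: str, database: str, table: str, limit: int = 10) -> str:
    """Build a SELECT that addresses a table as catalog.database.table."""
    qualified = ".".join(quote_identifier(part) for part in (catalog, database, table))
    return f"SELECT * FROM {qualified} LIMIT {int(limit)}"


def start_query(query: str, output_location: str) -> str:
    """Submit the query through boto3 and return the query execution ID."""
    import boto3  # imported lazily so the builder above works without boto3

    athena = boto3.client("athena")
    response = athena.start_query_execution(
        QueryString=query,
        # Hypothetical S3 location where Athena writes the query results.
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]
```

Here, `catalog` is the data source name that you entered in step 7, for example `build_federated_query("cloudwatchlogs", "my_db", "my_table")`.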

For information about using data connectors in queries, see Run federated queries.
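
The note in step 9 suggests an object lifecycle policy for the spill bucket. One possible way to set this up outside the console is sketched below in Python with boto3. The rule ID, the default of one day, and the helper names are assumptions for illustration; choose a retention period that comfortably exceeds your longest-running queries.

```python
def build_spill_lifecycle_configuration(prefix: str = "", days: int = 1) -> dict:
    """Build a lifecycle configuration that expires objects under `prefix`."""
    if days < 1:
        raise ValueError("days must be at least 1")
    return {
        "Rules": [
            {
                "ID": "expire-athena-spill-data",  # hypothetical rule name
                "Filter": {"Prefix": prefix},  # an empty prefix matches every object
                "Status": "Enabled",
                "Expiration": {"Days": days},
            }
        ]
    }


def apply_spill_lifecycle(bucket: str, configuration: dict) -> None:
    """Apply the configuration to the spill bucket through boto3."""
    import boto3  # imported lazily so the builder above works without boto3

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration=configuration,
    )
```

Applying a lifecycle configuration replaces any configuration that already exists on the bucket, so if the bucket holds other data, merge this rule into the existing rules and scope it to the spill prefix.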