Converting to columnar formats
Your Amazon Athena query performance improves if you convert your data into open source columnar formats, such as Apache Parquet or ORC.
Options for easily converting source data such as JSON or CSV into a columnar format include using CREATE TABLE AS queries or running jobs in Amazon Glue.
- You can use CREATE TABLE AS (CTAS) queries to convert data into Parquet or ORC in one step. For an example, see "Example: Writing query results to a different format" on the "Examples of CTAS queries" page.
- For information about running an Amazon Glue job to transform CSV data to Parquet, see the section "Transform the data from CSV to Parquet format" in the Amazon Big Data blog post "Build a data lake foundation with Amazon Glue and Amazon S3". Amazon Glue supports using the same technique to convert CSV data to ORC, or JSON data to either Parquet or ORC.
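
To make the CTAS option more concrete, the following is a minimal sketch of how such a statement might be assembled in Python. The helper name `build_ctas_query`, the table names, and the S3 location are hypothetical and only illustrate the shape of the query; the `format` and `external_location` table properties are the CTAS properties that select the output format and output location. The sketch does not validate or escape identifiers, so it should only be used with trusted input.

```python
# Columnar formats discussed in this section.
COLUMNAR_FORMATS = {"PARQUET", "ORC"}


def build_ctas_query(source_table, target_table, output_location,
                     file_format="PARQUET"):
    """Return a CTAS statement that rewrites source_table in a columnar format.

    All names passed in are illustrative placeholders. No escaping is
    performed, so pass only trusted table names and locations.
    """
    fmt = file_format.upper()
    if fmt not in COLUMNAR_FORMATS:
        raise ValueError(f"expected one of {sorted(COLUMNAR_FORMATS)}, got {file_format!r}")
    return (
        f"CREATE TABLE {target_table}\n"
        f"WITH (\n"
        f"  format = '{fmt}',\n"
        f"  external_location = '{output_location}'\n"
        f")\n"
        f"AS SELECT * FROM {source_table}"
    )


if __name__ == "__main__":
    # Hypothetical table names and bucket, for illustration only.
    print(build_ctas_query(
        source_table="logs_csv",
        target_table="logs_parquet",
        output_location="s3://amzn-s3-demo-bucket/logs_parquet/",
    ))
```

The resulting string could be pasted into the Athena query editor or submitted programmatically. Passing `file_format="ORC"` produces the equivalent statement for ORC output.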