

# Using Delta Lake framework in Amazon Glue Studio
<a name="gs-data-lake-formats-delta"></a>

## Using Delta Lake framework in data sources
<a name="gs-data-lake-formats-delta-source"></a>

### Using Delta Lake framework in Amazon S3 data sources
<a name="gs-data-lake-formats-delta-lake-s3-data-source"></a>

1.  From the **Source** menu, choose **Amazon S3**. 

1.  If you choose **Data Catalog table** as the **Amazon S3 source type**, choose a database and table. 

1.  Amazon Glue Studio displays the format as Delta Lake and the Amazon S3 URL. 

1.  Choose **Additional options** to enter a key-value pair. For example, a key-value pair could be: **key**: timestampAsOf and **value**: 2023-02-24 14:16:18.   
![\[The screenshot shows the Additional options section in the Data source properties tab for an Amazon S3 data source node.\]](http://docs.amazonaws.cn/en_us/glue/latest/dg/images/data_lake_formats_additional_options.png)

1.  If you choose **Amazon S3 location** as the **Amazon S3 source type**, choose the Amazon S3 URL by choosing **Browse Amazon S3**. 

1.  In **Data format**, choose Delta Lake. 
**Note**  
 If Amazon Glue Studio is unable to infer the schema from the Amazon S3 folder or file you selected, choose **Additional options** to select a new folder or file.   
 In **Additional options**, choose from the following options under **Schema inference**:   
 + **Let Amazon Glue Studio automatically choose a sample file** — Amazon Glue Studio chooses a sample file in the Amazon S3 location so that the schema can be inferred. You can view the automatically selected file in the **Auto-sampled file** field. 
 + **Choose a sample file from Amazon S3** — choose the Amazon S3 file to use by choosing **Browse Amazon S3**. 

1.  Choose **Infer schema**. You can then view the output schema on the **Output schema** tab. 
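In a Glue job script, the `timestampAsOf` additional option shown above corresponds to Delta Lake time travel on the Spark reader. The following is a minimal sketch, not a definitive implementation: it assumes a Glue job with Delta Lake support enabled, and the S3 path is a placeholder.

```python
# Sketch of a Delta Lake time-travel read in a Glue Spark job.
# The option dict mirrors the Additional options key-value pair in the visual editor.
read_options = {"timestampAsOf": "2023-02-24 14:16:18"}

# In a running Glue job with a SparkSession that has Delta Lake enabled,
# the read would look roughly like this (the S3 path is a placeholder):
# df = (spark.read.format("delta")
#            .options(**read_options)
#            .load("s3://example-bucket/delta-table/"))
```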

### Using Delta Lake framework in Data Catalog data sources
<a name="gs-data-lake-formats-delta-catalog"></a>

1.  From the **Source** menu, choose **Amazon Glue Data Catalog**. 

1.  In the **Data source properties** tab, choose a database and table. 

1.  Amazon Glue Studio displays the format type as Delta Lake and the Amazon S3 URL. 
**Note**  
 If your Delta Lake source is not yet registered as an Amazon Glue Data Catalog table, you have two options:   
 + Create an Amazon Glue crawler for the Delta Lake data store. For more information, see [How to specify configuration options for a Delta Lake data store](https://docs.amazonaws.cn/glue/latest/dg/crawler-configuration.html#crawler-delta-lake). 
 + Use an Amazon S3 data source to select your Delta Lake data source. See [Using Delta Lake framework in Amazon S3 data sources](#gs-data-lake-formats-delta-lake-s3-data-source). 
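Once the table is registered in the Data Catalog, a Glue job script can also read it directly with `create_data_frame.from_catalog`. The following is a hedged sketch; the database and table names are placeholders, not real resources.

```python
# Sketch of reading a Delta Lake table registered in the Data Catalog
# from a Glue job script. Names below are placeholder assumptions.
catalog_args = {
    "database": "example_db",       # assumption: your Data Catalog database
    "table_name": "example_delta",  # assumption: table registered by a Delta Lake crawler
}

# In a running Glue job (GlueContext required), this would be roughly:
# df = glueContext.create_data_frame.from_catalog(**catalog_args)
```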

## Using Delta Lake formats in data targets
<a name="gs-data-lake-formats-delta-target"></a>

### Using Delta Lake formats in Data Catalog data targets
<a name="gs-data-lake-formats-delta-target-catalog"></a>

1.  From the **Target** menu, choose **Amazon Glue Data Catalog**. 

1.  In the **Data target properties** tab, choose a database and table. 

1.  Amazon Glue Studio displays the format type as Delta Lake and the Amazon S3 URL. 

### Using Delta Lake formats in Amazon S3 data targets
<a name="gs-data-lake-formats-delta-target-s3"></a>

 Enter values or select from the available options to configure Delta Lake format. 
+  **Compression Type** — choose one of the compression type options: **Uncompressed** or **Snappy**. 
+  **Amazon S3 Target Location** — choose the Amazon S3 target location by choosing **Browse S3**. 
+  **Data Catalog update options** — updating the Data Catalog is not supported for this format in the Amazon Glue Studio visual editor. 
  +  Do not update the Data Catalog: (Default) Choose this option if you don't want the job to update the Data Catalog, even if the schema changes or new partitions are added. 
  +  To update the Data Catalog after the Amazon Glue job runs, run or schedule an Amazon Glue crawler. For more information, see [How to specify configuration options for a Delta Lake data store](https://docs.amazonaws.cn/glue/latest/dg/crawler-configuration.html#crawler-delta-lake). 
+  **Partition keys** — Choose which columns to use as partitioning keys in the output. To add more partition keys, choose **Add a partition key**. 
+  Optionally, choose **Additional options** to enter a key-value pair. For example, a key-value pair could be: **key**: timestampAsOf and **value**: 2023-02-24 14:16:18. 
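The target options above map onto a Delta Lake write in a Glue job script. The following is a minimal sketch, assuming a Glue job with Delta Lake support enabled; the S3 path, partition column, and compression value are placeholder assumptions.

```python
# Sketch of a Delta Lake write mirroring the target options above.
# All values are placeholders, not real resources.
write_conf = {
    "compression": "snappy",                    # Compression Type option
    "path": "s3://example-bucket/delta-target/",  # Amazon S3 Target Location
    "partition_key": "year",                    # assumption: a column in your data
}

# In a running Glue job with Delta Lake enabled, the write would look roughly like:
# (df.write.format("delta")
#    .option("compression", write_conf["compression"])
#    .partitionBy(write_conf["partition_key"])
#    .mode("append")
#    .save(write_conf["path"]))
```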