AWS::DataBrew::Dataset - Amazon CloudFormation

AWS::DataBrew::Dataset

Specifies a new DataBrew dataset.

Syntax

To declare this entity in your Amazon CloudFormation template, use the following syntax:

JSON

{
  "Type" : "AWS::DataBrew::Dataset",
  "Properties" : {
      "Format" : String,
      "FormatOptions" : FormatOptions,
      "Input" : Input,
      "Name" : String,
      "PathOptions" : PathOptions,
      "Tags" : [ Tag, ... ]
    }
}

YAML

Type: AWS::DataBrew::Dataset
Properties:
  Format: String
  FormatOptions: FormatOptions
  Input: Input
  Name: String
  PathOptions: PathOptions
  Tags:
    - Tag

Properties

Format

The file format of a dataset that is created from an Amazon S3 file or folder.

Required: No

Type: String

Allowed values: CSV | JSON | PARQUET | EXCEL | ORC

Update requires: No interruption

FormatOptions

A set of options that define how DataBrew interprets the data in the dataset.

Required: No

Type: FormatOptions

Update requires: No interruption
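
As an illustration, a CSV dataset might pair Format with FormatOptions roughly as sketched below. The logical ID, dataset name, bucket, and key are placeholders, and the Csv sub-properties shown (Delimiter, HeaderRow) are assumed from the FormatOptions property type; check that reference page before relying on them.

```yaml
Resources:
  SalesCsvDataset:                      # hypothetical logical ID
    Type: AWS::DataBrew::Dataset
    Properties:
      Name: sales-csv                   # placeholder dataset name
      Format: CSV
      Input:
        S3InputDefinition:
          Bucket: amzn-s3-demo-bucket   # placeholder bucket
          Key: sales/2023.csv           # placeholder object key
      FormatOptions:
        Csv:
          Delimiter: ','                # field separator used in the file
          HeaderRow: true               # treat the first row as column names
```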

Input

Information on how DataBrew can find the dataset, in either the Amazon Glue Data Catalog or Amazon S3.

Required: Yes

Type: Input

Update requires: No interruption
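
The examples at the end of this page read from Amazon S3. For the Data Catalog case, a minimal sketch might look like the following. It assumes the Input property type exposes a DataCatalogInputDefinition with DatabaseName and TableName; the logical ID, dataset name, database, and table are hypothetical.

```yaml
Resources:
  CatalogDataset:                       # hypothetical logical ID
    Type: AWS::DataBrew::Dataset
    Properties:
      Name: catalog-dataset             # placeholder dataset name
      Input:
        DataCatalogInputDefinition:
          DatabaseName: example_db      # placeholder Data Catalog database
          TableName: example_table      # placeholder Data Catalog table
```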

Name

The unique name of the dataset.

Required: Yes

Type: String

Minimum: 1

Maximum: 255

Update requires: Replacement

PathOptions

A set of options that defines how DataBrew interprets the Amazon S3 path of the dataset.

Required: No

Type: PathOptions

Update requires: No interruption
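
One possible use of PathOptions is to limit a folder-based dataset to its most recently modified files. The sketch below assumes the PathOptions property type provides a FilesLimit object with MaxFiles, Order, and OrderedBy; treat those names and values as assumptions to confirm against the PathOptions reference. The logical ID, dataset name, bucket, and key prefix are placeholders.

```yaml
Resources:
  RecentFilesDataset:                   # hypothetical logical ID
    Type: AWS::DataBrew::Dataset
    Properties:
      Name: recent-files                # placeholder dataset name
      Input:
        S3InputDefinition:
          Bucket: amzn-s3-demo-bucket   # placeholder bucket
          Key: logs/                    # placeholder folder prefix
      PathOptions:
        FilesLimit:
          MaxFiles: 10                  # keep at most ten files
          Order: DESCENDING             # newest first
          OrderedBy: LAST_MODIFIED_DATE
```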

Tags

Metadata tags that have been applied to the dataset.

Required: No

Type: Array of Tag

Update requires: Replacement

Return values

Ref

When you pass the logical ID of this resource to the intrinsic Ref function, Ref returns the resource name. For example:

{ "Ref": "myDataset" }

For an Amazon Glue DataBrew dataset named myDataset, Ref returns the name of the dataset.
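
In YAML, the same reference can be written with the short form !Ref. The sketch below exposes the dataset name as a stack output; the dataset name, bucket, key, and output name are placeholders chosen for illustration.

```yaml
Resources:
  myDataset:
    Type: AWS::DataBrew::Dataset
    Properties:
      Name: my-dataset                  # placeholder dataset name
      Input:
        S3InputDefinition:
          Bucket: amzn-s3-demo-bucket   # placeholder bucket
          Key: data.csv                 # placeholder object key
Outputs:
  DatasetName:                          # hypothetical output name
    Description: Name of the DataBrew dataset
    Value: !Ref myDataset               # resolves to the dataset name
```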

Examples

Creating datasets

The following examples create new DataBrew datasets.

YAML

Resources:
  TestDataBrewDataset:
    Type: AWS::DataBrew::Dataset
    Properties:
      Name: dataset-name
      Input:
        S3InputDefinition:
          Bucket: !Join [ '', ['databrew-cfn-integration-tests-', !Ref 'AWS::Region', '-', !Ref 'AWS::AccountId' ] ]
          Key: cocktails.json
      FormatOptions:
        Json:
          MultiLine: True

JSON

{
  "AWSTemplateFormatVersion": "2010-09-09",
  "Description": "This CloudFormation template specifies a DataBrew Dataset",
  "Resources": {
    "TestDataBrewDataset": {
      "Type": "AWS::DataBrew::Dataset",
      "Properties": {
        "Name": "cf-test-dataset1",
        "Input": {
          "S3InputDefinition": {
            "Bucket": "test-location",
            "Key": "test.xlsx"
          }
        },
        "FormatOptions": {
          "Excel": {
            "SheetNames": ["test"]
          }
        },
        "Tags": [
          {
            "Key": "key00AtCreate",
            "Value": "value001AtCreate"
          }
        ]
      }
    }
  }
}