AWS::Glue::Crawler HudiTarget - Amazon CloudFormation

AWS::Glue::Crawler HudiTarget

Specifies an Apache Hudi data source.

Syntax

To declare this entity in your Amazon CloudFormation template, use the following syntax:

JSON

{
  "ConnectionName" : String,
  "Exclusions" : [ String, ... ],
  "MaximumTraversalDepth" : Integer,
  "Paths" : [ String, ... ]
}

YAML

ConnectionName: String
Exclusions:
  - String
MaximumTraversalDepth: Integer
Paths:
  - String

Properties

ConnectionName

The name of the connection to use to connect to the Hudi target. If your Hudi files are stored in buckets that require VPC authorization, you can set their connection properties here.

Required: No

Type: String

Update requires: No interruption

Exclusions

A list of glob patterns that identify the paths to exclude from the crawl. For more information, see Catalog Tables with a Crawler.

Required: No

Type: Array of String

Update requires: No interruption
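
As a minimal sketch of how exclusion patterns might look inside a single HudiTarget entry: the bucket name, folder names, and patterns below are illustrative assumptions, not values from this reference. The sketch also assumes that patterns are evaluated relative to each entry in Paths, and that `**` matches across folder boundaries, as described for crawler exclude patterns in the AWS Glue documentation.

```yaml
# Fragment of one HudiTarget entry (hypothetical bucket and folders).
Paths:
  - s3://amzn-s3-demo-bucket/hudi/
Exclusions:
  - "archive/**"   # assumed intent: skip everything under the archive folder
  - "**.tmp"       # assumed intent: skip any object whose name ends in .tmp
```

Quoting the patterns keeps YAML from interpreting the leading `*` as an alias indicator.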

MaximumTraversalDepth

The maximum depth of Amazon S3 paths that the crawler can traverse to discover the Hudi metadata folder in your Amazon S3 path. Use this value to limit the crawler run time.

Required: No

Type: Integer

Update requires: No interruption
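
The fragment below is a small sketch of bounding the search depth for one HudiTarget entry. The bucket name and the value 3 are illustrative assumptions. This reference does not state how levels are counted relative to the path, so test a chosen value against your actual folder layout.

```yaml
# Fragment of one HudiTarget entry (hypothetical bucket; depth value is an example only).
Paths:
  - s3://amzn-s3-demo-bucket/lake/
# A smaller value should shorten the crawl but may miss tables nested more deeply.
MaximumTraversalDepth: 3
```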

Paths

An array of Amazon S3 location strings for Hudi, each indicating the root folder in which the metadata files for a Hudi table reside. The Hudi folder may be located in a child folder of the root folder.

The crawler will scan all folders underneath a path for a Hudi folder.

Required: No

Type: Array of String

Update requires: No interruption
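
Example

The following sketch shows one way the four properties might be combined in an AWS::Glue::Crawler resource, where HudiTarget entries are listed under Targets.HudiTargets. The logical ID, role ARN, database name, connection name, bucket, and all property values are hypothetical placeholders to replace with your own.

```yaml
Resources:
  # Hypothetical logical ID; rename to suit your template.
  HudiCrawler:
    Type: AWS::Glue::Crawler
    Properties:
      # Assumed IAM role that can read the bucket; replace with your own role.
      Role: arn:aws:iam::123456789012:role/ExampleGlueCrawlerRole
      # Assumed Data Catalog database that receives the discovered tables.
      DatabaseName: example_hudi_db
      Targets:
        HudiTargets:
          - ConnectionName: example-vpc-connection   # only needed for VPC-restricted buckets
            Paths:
              - s3://amzn-s3-demo-bucket/hudi/       # root folder to scan for Hudi tables
            Exclusions:
              - "archive/**"                         # illustrative glob pattern
            MaximumTraversalDepth: 5                 # illustrative value
```

All four HudiTarget properties are optional, so an entry that sets only Paths is the smallest practical configuration.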