You are viewing documentation for version 2 of the AWS SDK for Ruby. Version 3 documentation can be found here.

Class: Aws::Glue::Types::CreateCrawlerRequest

Inherits:

Struct

Object
Struct
Aws::Glue::Types::CreateCrawlerRequest

Overview

Note:

When passing CreateCrawlerRequest as input to an Aws::Client method, you can use a vanilla Hash:

{
  name: "NameString", # required
  role: "Role", # required
  database_name: "DatabaseName",
  description: "DescriptionString",
  targets: { # required
    s3_targets: [
      {
        path: "Path",
        exclusions: ["Path"],
        connection_name: "ConnectionName",
      },
    ],
    jdbc_targets: [
      {
        connection_name: "ConnectionName",
        path: "Path",
        exclusions: ["Path"],
      },
    ],
    mongo_db_targets: [
      {
        connection_name: "ConnectionName",
        path: "Path",
        scan_all: false,
      },
    ],
    dynamo_db_targets: [
      {
        path: "Path",
        scan_all: false,
        scan_rate: 1.0,
      },
    ],
    catalog_targets: [
      {
        database_name: "NameString", # required
        tables: ["NameString"], # required
      },
    ],
  },
  schedule: "CronExpression",
  classifiers: ["NameString"],
  table_prefix: "TablePrefix",
  schema_change_policy: {
    update_behavior: "LOG", # accepts LOG, UPDATE_IN_DATABASE
    delete_behavior: "LOG", # accepts LOG, DELETE_FROM_DATABASE, DEPRECATE_IN_DATABASE
  },
  recrawl_policy: {
    recrawl_behavior: "CRAWL_EVERYTHING", # accepts CRAWL_EVERYTHING, CRAWL_NEW_FOLDERS_ONLY
  },
  configuration: "CrawlerConfiguration",
  crawler_security_configuration: "CrawlerSecurityConfiguration",
  tags: {
    "TagKey" => "TagValue",
  },
}
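
As a rough sketch of the smallest hash that satisfies the fields marked `# required` above, the following builds the parameters and checks them with a hypothetical helper, `missing_required_keys`, which is not part of the SDK. The crawler name, role ARN, and S3 path are placeholder values, and the commented-out call assumes the SDK is loaded and credentials are configured.

```ruby
# Top-level keys marked "# required" in the hash shape above.
REQUIRED_KEYS = %i[name role targets].freeze

# Hypothetical helper (not part of the SDK): returns the required
# top-level keys that are absent or nil in the given params hash.
def missing_required_keys(params)
  REQUIRED_KEYS.select { |key| params[key].nil? }
end

params = {
  name: "example-crawler",                                 # placeholder
  role: "arn:aws:iam::123456789012:role/ExampleGlueRole",  # placeholder
  targets: { s3_targets: [{ path: "s3://example-bucket/data/" }] }
}

missing = missing_required_keys(params)
raise ArgumentError, "missing: #{missing.join(', ')}" unless missing.empty?

# With the SDK loaded, the hash could then be passed along, for example:
#   Aws::Glue::Client.new(region: "us-east-1").create_crawler(params)
```

The check only covers the top-level keys; nested requirements such as `database_name` and `tables` inside `catalog_targets` would need their own validation.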

Instance Attribute Summary

#classifiers ⇒ Array<String>
A list of custom classifiers that the user has registered.
#configuration ⇒ String
Crawler configuration information.
#crawler_security_configuration ⇒ String
The name of the SecurityConfiguration structure to be used by this crawler.
#database_name ⇒ String
The AWS Glue database where results are written, such as: arn:aws:daylight:us-east-1::database/sometable/*.
#description ⇒ String
A description of the new crawler.
#name ⇒ String
Name of the new crawler.
#recrawl_policy ⇒ Types::RecrawlPolicy
A policy that specifies whether to crawl the entire dataset again, or to crawl only folders that were added since the last crawler run.
#role ⇒ String
The IAM role or Amazon Resource Name (ARN) of an IAM role used by the new crawler to access customer resources.
#schedule ⇒ String
A cron expression used to specify the schedule (see Time-Based Schedules for Jobs and Crawlers). For example, to run something every day at 12:15 UTC, you would specify: cron(15 12 * * ? *).
#schema_change_policy ⇒ Types::SchemaChangePolicy
The policy for the crawler's update and deletion behavior.
#table_prefix ⇒ String
The table prefix used for catalog tables that are created.
#tags ⇒ Hash<String,String>
The tags to use with this crawler request.
#targets ⇒ Types::CrawlerTargets
A collection of targets to crawl.

Instance Attribute Details

#classifiers ⇒ `Array<String>`

A list of custom classifiers that the user has registered. By default, all built-in classifiers are included in a crawl, but these custom classifiers always override the default classifiers for a given classification.

Returns:

(Array<String>) —
A list of custom classifiers that the user has registered.

#configuration ⇒ `String`

Crawler configuration information. This versioned JSON string allows users to specify aspects of a crawler's behavior. For more information, see Configuring a Crawler.

Returns:

(String) —
Crawler configuration information.
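
Because the configuration is a JSON string rather than a nested hash, one way to produce it is to serialize a Ruby hash with the standard `json` library. The keys used here (`Version`, `CrawlerOutput`, `Partitions`, `AddOrUpdateBehavior`) are assumed from the crawler configuration examples in the Glue developer guide and should be checked against it before use.

```ruby
require "json"

# Assumed configuration keys; verify against "Configuring a Crawler".
crawler_configuration = JSON.generate(
  "Version" => 1.0,
  "CrawlerOutput" => {
    "Partitions" => { "AddOrUpdateBehavior" => "InheritFromTable" }
  }
)

# The request expects the serialized string, not the hash itself.
params_fragment = { configuration: crawler_configuration }
```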

#crawler_security_configuration ⇒ `String`

The name of the SecurityConfiguration structure to be used by this crawler.

Returns:

(String) —
The name of the SecurityConfiguration structure to be used by this crawler.

#database_name ⇒ `String`

The AWS Glue database where results are written, such as: arn:aws:daylight:us-east-1::database/sometable/*.

Returns:

(String) —
The AWS Glue database where results are written, such as: arn:aws:daylight:us-east-1::database/sometable/*.

#description ⇒ `String`

A description of the new crawler.

Returns:

(String) —
A description of the new crawler.

#name ⇒ `String`

Name of the new crawler.

Returns:

(String) —
Name of the new crawler.

#recrawl_policy ⇒ `Types::RecrawlPolicy`

A policy that specifies whether to crawl the entire dataset again, or to crawl only folders that were added since the last crawler run.

Returns:

(Types::RecrawlPolicy) —
A policy that specifies whether to crawl the entire dataset again, or to crawl only folders that were added since the last crawler run.

#role ⇒ `String`

The IAM role or Amazon Resource Name (ARN) of an IAM role used by the new crawler to access customer resources.

Returns:

(String) —
The IAM role or Amazon Resource Name (ARN) of an IAM role used by the new crawler to access customer resources.

#schedule ⇒ `String`

A cron expression used to specify the schedule (see Time-Based Schedules for Jobs and Crawlers). For example, to run something every day at 12:15 UTC, you would specify: cron(15 12 * * ? *).

Returns:

(String) —
A cron expression used to specify the schedule (see Time-Based Schedules for Jobs and Crawlers). For example, to run something every day at 12:15 UTC, you would specify: cron(15 12 * * ? *).
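
To illustrate the field order in the expression above, here is a small sketch that formats a daily schedule string. The helper `daily_cron` is hypothetical and not part of the SDK; it only covers the "every day at a fixed UTC time" case.

```ruby
# Hypothetical helper (not part of the SDK): formats a daily schedule
# in the cron(...) syntax shown above. Times are in UTC.
def daily_cron(hour, minute)
  raise ArgumentError, "hour must be 0..23" unless (0..23).cover?(hour)
  raise ArgumentError, "minute must be 0..59" unless (0..59).cover?(minute)

  # Fields: minutes hours day-of-month month day-of-week year
  "cron(#{minute} #{hour} * * ? *)"
end

schedule = daily_cron(12, 15)
params_fragment = { schedule: schedule }
```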

#schema_change_policy ⇒ `Types::SchemaChangePolicy`

The policy for the crawler's update and deletion behavior.

Returns:

(Types::SchemaChangePolicy) —
The policy for the crawler's update and deletion behavior.
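
The accepted values for this policy are listed in the hash shape in the Overview. A minimal sketch of building the policy hash while rejecting unknown values locally might look like the following; the helper `schema_change_policy` is hypothetical and not part of the SDK.

```ruby
# Accepted values, taken from the comments in the Overview hash shape.
UPDATE_BEHAVIORS = %w[LOG UPDATE_IN_DATABASE].freeze
DELETE_BEHAVIORS = %w[LOG DELETE_FROM_DATABASE DEPRECATE_IN_DATABASE].freeze

# Hypothetical helper (not part of the SDK): returns a policy hash or
# raises ArgumentError if either value is not in the accepted lists.
def schema_change_policy(update_behavior:, delete_behavior:)
  unless UPDATE_BEHAVIORS.include?(update_behavior)
    raise ArgumentError, "unknown update_behavior: #{update_behavior}"
  end
  unless DELETE_BEHAVIORS.include?(delete_behavior)
    raise ArgumentError, "unknown delete_behavior: #{delete_behavior}"
  end

  { update_behavior: update_behavior, delete_behavior: delete_behavior }
end

policy = schema_change_policy(
  update_behavior: "UPDATE_IN_DATABASE",
  delete_behavior: "DEPRECATE_IN_DATABASE"
)
```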

#table_prefix ⇒ `String`

The table prefix used for catalog tables that are created.

Returns:

(String) —
The table prefix used for catalog tables that are created.

#tags ⇒ `Hash<String,String>`

The tags to use with this crawler request. You may use tags to limit access to the crawler. For more information about tags in AWS Glue, see AWS Tags in AWS Glue in the developer guide.

Returns:

(Hash<String,String>) —
The tags to use with this crawler request.
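
Since the attribute is typed `Hash<String,String>`, tags collected elsewhere with symbol keys or non-string values may need converting first. The sketch below uses made-up tag names and values purely for illustration.

```ruby
# Hypothetical input using symbol keys and a non-string value.
raw_tags = { team: "data-platform", cost_center: 1234 }

# Convert to the Hash<String,String> shape shown in the Overview.
tags = raw_tags.to_h { |key, value| [key.to_s, value.to_s] }

params_fragment = { tags: tags }
```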

#targets ⇒ `Types::CrawlerTargets`

A collection of targets to crawl.

Returns:

(Types::CrawlerTargets) —
A collection of targets to crawl.
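
As an illustration of the nested `targets` structure, the following sketch wraps a list of S3 paths in the `s3_targets` shape from the Overview. The helper `s3_crawler_targets`, the bucket paths, and the exclusion pattern are all hypothetical.

```ruby
# Hypothetical helper (not part of the SDK): wraps S3 paths in the
# s3_targets structure, applying the same exclusion patterns to each.
def s3_crawler_targets(paths, exclusions: [])
  s3_targets = paths.map do |path|
    target = { path: path }
    target[:exclusions] = exclusions unless exclusions.empty?
    target
  end

  { s3_targets: s3_targets }
end

targets = s3_crawler_targets(
  ["s3://example-bucket/logs/", "s3://example-bucket/events/"],
  exclusions: ["**.tmp"]
)
```

Other target kinds (`jdbc_targets`, `mongo_db_targets`, `dynamo_db_targets`, `catalog_targets`) take different keys, so a helper like this would need a separate branch for each.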
