You are viewing documentation for version 2 of the AWS SDK for Ruby. Version 3 documentation can be found here.

Class: Aws::GlueDataBrew::Client

Inherits:

Seahorse::Client::Base

Object
Seahorse::Client::Base
Aws::GlueDataBrew::Client

Defined in: (unknown)

Overview

An API client for AWS Glue DataBrew. To construct a client, you need to configure a :region and :credentials.

gluedatabrew = Aws::GlueDataBrew::Client.new(
  region: region_name,
  credentials: credentials,
  # ...
)

See #initialize for a full list of supported configuration options.

Region

You can configure a default region in the following locations:

ENV['AWS_REGION']
Aws.config[:region]

Go here for a list of supported regions.

Credentials

Default credentials are loaded automatically from the following locations:

ENV['AWS_ACCESS_KEY_ID'] and ENV['AWS_SECRET_ACCESS_KEY']
Aws.config[:credentials]
The shared credentials ini file at ~/.aws/credentials (more information)
From an instance profile when running on EC2

You can also construct a credentials object from one of the following classes:

Alternatively, you can configure credentials with :access_key_id and :secret_access_key:

require 'yaml'

# load credentials from disk
creds = YAML.load(File.read('/path/to/secrets'))

Aws::GlueDataBrew::Client.new(
  access_key_id: creds['access_key_id'],
  secret_access_key: creds['secret_access_key']
)

Always load your credentials from outside your application. Avoid configuring credentials statically and never commit them to source control.
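
The lookup described above can be sketched in plain Ruby. This is only an illustration: the helper name credential_options, its env: and secrets_path: parameters, and the YAML file layout are assumptions made for this example and are not part of the SDK. The client call is left as a comment so that the sketch runs without the aws-sdk gem.

```ruby
require 'yaml'

# Hypothetical helper (not part of the SDK): returns the static credential
# options for a client, preferring environment variables and falling back to
# a YAML file that lives outside the application.
def credential_options(env: ENV, secrets_path: nil)
  key = env['AWS_ACCESS_KEY_ID']
  secret = env['AWS_SECRET_ACCESS_KEY']
  return { access_key_id: key, secret_access_key: secret } if key && secret

  unless secrets_path && File.exist?(secrets_path)
    raise ArgumentError, 'no credentials found in the environment or on disk'
  end

  # Assumed file layout: two top-level keys, access_key_id and secret_access_key.
  creds = YAML.safe_load(File.read(secrets_path))
  {
    access_key_id: creds.fetch('access_key_id'),
    secret_access_key: creds.fetch('secret_access_key')
  }
end

# With the SDK loaded, the result could be passed to the constructor:
#   Aws::GlueDataBrew::Client.new(credential_options.merge(region: 'us-east-1'))
```

Keeping the lookup in one helper means no key material appears in application code, which is in line with the advice above.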

Instance Attribute Summary

Attributes inherited from Seahorse::Client::Base

#config, #handlers

Constructor

#initialize(options = {}) ⇒ Aws::GlueDataBrew::Client constructor
Constructs an API client.

API Operations

#batch_delete_recipe_version(options = {}) ⇒ Types::BatchDeleteRecipeVersionResponse
Deletes one or more versions of a recipe at a time.

#create_dataset(options = {}) ⇒ Types::CreateDatasetResponse
Creates a new AWS Glue DataBrew dataset for this AWS account.

#create_profile_job(options = {}) ⇒ Types::CreateProfileJobResponse
Creates a new job to profile an AWS Glue DataBrew dataset that exists in the current AWS account.

#create_project(options = {}) ⇒ Types::CreateProjectResponse
Creates a new AWS Glue DataBrew project in the current AWS account.

#create_recipe(options = {}) ⇒ Types::CreateRecipeResponse
Creates a new AWS Glue DataBrew recipe for the current AWS account.

#create_recipe_job(options = {}) ⇒ Types::CreateRecipeJobResponse
Creates a new job for an existing AWS Glue DataBrew recipe in the current AWS account.
#create_schedule(options = {}) ⇒ Types::CreateScheduleResponse
Creates a new schedule for one or more AWS Glue DataBrew jobs.
#delete_dataset(options = {}) ⇒ Types::DeleteDatasetResponse
Deletes a dataset from AWS Glue DataBrew.

#delete_job(options = {}) ⇒ Types::DeleteJobResponse
Deletes the specified AWS Glue DataBrew job from the current AWS account.
#delete_project(options = {}) ⇒ Types::DeleteProjectResponse
Deletes an existing AWS Glue DataBrew project from the current AWS account.

#delete_recipe_version(options = {}) ⇒ Types::DeleteRecipeVersionResponse
Deletes a single version of an AWS Glue DataBrew recipe.

#delete_schedule(options = {}) ⇒ Types::DeleteScheduleResponse
Deletes the specified AWS Glue DataBrew schedule from the current AWS account.

#describe_dataset(options = {}) ⇒ Types::DescribeDatasetResponse
Returns the definition of a specific AWS Glue DataBrew dataset that is in the current AWS account.

#describe_job(options = {}) ⇒ Types::DescribeJobResponse
Returns the definition of a specific AWS Glue DataBrew job that is in the current AWS account.

#describe_project(options = {}) ⇒ Types::DescribeProjectResponse
Returns the definition of a specific AWS Glue DataBrew project that is in the current AWS account.

#describe_recipe(options = {}) ⇒ Types::DescribeRecipeResponse
Returns the definition of a specific AWS Glue DataBrew recipe that is in the current AWS account.

#describe_schedule(options = {}) ⇒ Types::DescribeScheduleResponse
Returns the definition of a specific AWS Glue DataBrew schedule that is in the current AWS account.

#list_datasets(options = {}) ⇒ Types::ListDatasetsResponse
Lists all of the AWS Glue DataBrew datasets for the current AWS account.

#list_job_runs(options = {}) ⇒ Types::ListJobRunsResponse
Lists all of the previous runs of a particular AWS Glue DataBrew job in the current AWS account.

#list_jobs(options = {}) ⇒ Types::ListJobsResponse
Lists the AWS Glue DataBrew jobs in the current AWS account.

#list_projects(options = {}) ⇒ Types::ListProjectsResponse
Lists all of the DataBrew projects in the current AWS account.

#list_recipe_versions(options = {}) ⇒ Types::ListRecipeVersionsResponse
Lists all of the versions of a particular AWS Glue DataBrew recipe in the current AWS account.

#list_recipes(options = {}) ⇒ Types::ListRecipesResponse
Lists all of the AWS Glue DataBrew recipes in the current AWS account.

#list_schedules(options = {}) ⇒ Types::ListSchedulesResponse
Lists the AWS Glue DataBrew schedules in the current AWS account.

#list_tags_for_resource(options = {}) ⇒ Types::ListTagsForResourceResponse
Lists all the tags for an AWS Glue DataBrew resource.
#publish_recipe(options = {}) ⇒ Types::PublishRecipeResponse
Publishes a new major version of an AWS Glue DataBrew recipe that exists in the current AWS account.

#send_project_session_action(options = {}) ⇒ Types::SendProjectSessionActionResponse
Performs a recipe step within an interactive AWS Glue DataBrew session that's currently open.

#start_job_run(options = {}) ⇒ Types::StartJobRunResponse
Runs an AWS Glue DataBrew job that exists in the current AWS account.

#start_project_session(options = {}) ⇒ Types::StartProjectSessionResponse
Creates an interactive session, enabling you to manipulate an AWS Glue DataBrew project.

#stop_job_run(options = {}) ⇒ Types::StopJobRunResponse
Stops the specified job from running in the current AWS account.

#tag_resource(options = {}) ⇒ Struct
Adds metadata tags to an AWS Glue DataBrew resource, such as a dataset, job, project, or recipe.

#untag_resource(options = {}) ⇒ Struct
Removes metadata tags from an AWS Glue DataBrew resource.

#update_dataset(options = {}) ⇒ Types::UpdateDatasetResponse
Modifies the definition of an existing AWS Glue DataBrew dataset in the current AWS account.

#update_profile_job(options = {}) ⇒ Types::UpdateProfileJobResponse
Modifies the definition of an existing AWS Glue DataBrew job in the current AWS account.

#update_project(options = {}) ⇒ Types::UpdateProjectResponse
Modifies the definition of an existing AWS Glue DataBrew project in the current AWS account.

#update_recipe(options = {}) ⇒ Types::UpdateRecipeResponse
Modifies the definition of the latest working version of an AWS Glue DataBrew recipe in the current AWS account.

#update_recipe_job(options = {}) ⇒ Types::UpdateRecipeJobResponse
Modifies the definition of an existing AWS Glue DataBrew recipe job in the current AWS account.

#update_schedule(options = {}) ⇒ Types::UpdateScheduleResponse
Modifies the definition of an existing AWS Glue DataBrew schedule in the current AWS account.

Instance Method Summary

#wait_until(waiter_name, params = {}) {|waiter| ... } ⇒ Boolean
Polls an API operation until a resource enters a desired state.
#waiter_names ⇒ Array<Symbol>
Returns the list of supported waiters.
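
The idea of polling until a resource enters a desired state can be illustrated with a generic loop. This is a sketch of the concept only, not the SDK's waiter implementation: the method name poll_until and its max_attempts: and delay: parameters are hypothetical.

```ruby
# Generic polling loop (not the SDK's waiter code). The block is called
# with the attempt index until it returns a truthy value or the attempts
# run out. Returns true on success and false otherwise.
def poll_until(max_attempts: 10, delay: 1)
  max_attempts.times do |attempt|
    return true if yield(attempt)

    # Sleep between attempts, but not after the final one.
    sleep(delay) if attempt < max_attempts - 1
  end
  false
end

# Example: the block stands in for an API call that checks a resource.
checks = 0
ready = poll_until(max_attempts: 5, delay: 0) do
  checks += 1
  checks >= 3 # pretend the resource is ready on the third check
end
puts "ready=#{ready} after #{checks} checks"
```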

Methods inherited from Seahorse::Client::Base

add_plugin, api, #build_request, clear_plugins, define, new, #operation, #operation_names, plugins, remove_plugin, set_api, set_plugins

Methods included from Seahorse::Client::HandlerBuilder

#handle, #handle_request, #handle_response

Constructor Details

#initialize(options = {}) ⇒ `Aws::GlueDataBrew::Client`

Constructs an API client.

Options Hash (options):

:access_key_id (String) —
Used to set credentials statically. See Plugins::RequestSigner for more details.
:active_endpoint_cache (Boolean) —
When set to true, a background thread polls for endpoints every 60 seconds (the default interval). Defaults to false. See Plugins::EndpointDiscovery for more details.
:convert_params (Boolean) — default: true —
When true, an attempt is made to coerce request parameters into the required types. See Plugins::ParamConverter for more details.
:credentials (required, Credentials) —
Your AWS credentials. The following locations will be searched in order for credentials:
- :access_key_id, :secret_access_key, and :session_token options
- ENV['AWS_ACCESS_KEY_ID'], ENV['AWS_SECRET_ACCESS_KEY']
- HOME/.aws/credentials shared credentials file
- EC2 instance profile credentials
See Plugins::RequestSigner for more details.
:disable_host_prefix_injection (Boolean) —
Set to true to stop the SDK from automatically adding a host prefix to the default service endpoint when available. See Plugins::EndpointPattern for more details.
:endpoint (String) —
A default endpoint is constructed from the :region. See Plugins::RegionalEndpoint for more details.
:endpoint_cache_max_entries (Integer) —
The maximum number of entries in the LRU cache that stores endpoint data for operations with endpoint discovery enabled. Defaults to 1000. See Plugins::EndpointDiscovery for more details.
:endpoint_cache_max_threads (Integer) —
The maximum number of threads used to poll for endpoints to be cached. Defaults to 10. See Plugins::EndpointDiscovery for more details.
:endpoint_cache_poll_interval (Integer) —
When :endpoint_discovery and :active_endpoint_cache are both enabled, use this option to configure the interval, in seconds, between requests that fetch endpoint information. Defaults to 60 seconds. See Plugins::EndpointDiscovery for more details.
:endpoint_discovery (Boolean) —
When set to true, endpoint discovery will be enabled for operations when available. Defaults to false. See Plugins::EndpointDiscovery for more details.
:http_continue_timeout (Float) — default: 1 —
See Seahorse::Client::Plugins::NetHttp for more details.
:http_idle_timeout (Integer) — default: 5 —
See Seahorse::Client::Plugins::NetHttp for more details.
:http_open_timeout (Integer) — default: 15 —
See Seahorse::Client::Plugins::NetHttp for more details.
:http_proxy (String) —
See Seahorse::Client::Plugins::NetHttp for more details.
:http_read_timeout (Integer) — default: 60 —
See Seahorse::Client::Plugins::NetHttp for more details.
:http_wire_trace (Boolean) — default: false —
See Seahorse::Client::Plugins::NetHttp for more details.
:log_level (Symbol) — default: :info —
The log level to send messages to the logger at. See Plugins::Logging for more details.
:log_formatter (Logging::LogFormatter) —
The log formatter. Defaults to Seahorse::Client::Logging::Formatter.default. See Plugins::Logging for more details.
:logger (Logger) — default: nil —
The Logger instance to send log messages to. If this option is not set, logging will be disabled. See Plugins::Logging for more details.
:profile (String) —
Used when loading credentials from the shared credentials file at HOME/.aws/credentials. When not specified, 'default' is used. See Plugins::RequestSigner for more details.
:raise_response_errors (Boolean) — default: true —
When true, response errors are raised. See Seahorse::Client::Plugins::RaiseResponseErrors for more details.
:region (required, String) —
The AWS region to connect to. The region is used to construct the client endpoint. Defaults to ENV['AWS_REGION']. Also checks AMAZON_REGION and AWS_DEFAULT_REGION. See Plugins::RegionalEndpoint for more details.
:retry_limit (Integer) — default: 3 —
The maximum number of times to retry failed requests. Only 500-level server errors and certain 400-level client errors are retried. Generally, these are throttling errors, data checksum errors, networking errors, timeout errors, and auth errors from expired credentials. See Plugins::RetryErrors for more details.
:secret_access_key (String) —
Used to set credentials statically. See Plugins::RequestSigner for more details.
:session_token (String) —
Used to set credentials statically. See Plugins::RequestSigner for more details.
:ssl_ca_bundle (String) —
See Seahorse::Client::Plugins::NetHttp for more details.
:ssl_ca_directory (String) —
See Seahorse::Client::Plugins::NetHttp for more details.
:ssl_ca_store (String) —
See Seahorse::Client::Plugins::NetHttp for more details.
:ssl_verify_peer (Boolean) — default: true —
See Seahorse::Client::Plugins::NetHttp for more details.
:stub_responses (Boolean) — default: false —
Causes the client to return stubbed responses. By default fake responses are generated and returned. You can specify the response data to return or errors to raise by calling ClientStubs#stub_responses. See ClientStubs for more information.

Please note: when response stubbing is enabled, no HTTP requests are made, and retries are disabled. See Plugins::StubResponses for more details.
:validate_params (Boolean) — default: true —
When true, request parameters are validated before sending the request. See Plugins::ParamValidator for more details.
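
Several of these options are commonly set together. The sketch below gathers them in one hash. The helper name client_options and every value in it are illustrative assumptions; only the option keys come from the list above. The constructor call is shown as a comment so that the sketch runs without the aws-sdk gem.

```ruby
require 'logger'

# Hypothetical helper (not part of the SDK): builds a constructor options
# hash and lets callers override individual entries.
def client_options(overrides = {})
  {
    region: 'us-east-1',          # illustrative region name
    retry_limit: 5,               # documented default is 3
    http_open_timeout: 5,         # seconds; documented default is 15
    http_read_timeout: 30,        # seconds; documented default is 60
    logger: Logger.new($stdout),  # logging stays disabled when no logger is set
    log_level: :debug,            # documented default is :info
    validate_params: true
  }.merge(overrides)
end

# In a test suite, stubbed responses avoid real HTTP requests:
#   client = Aws::GlueDataBrew::Client.new(client_options(stub_responses: true))
```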

Instance Method Details

#batch_delete_recipe_version(options = {}) ⇒ `Types::BatchDeleteRecipeVersionResponse`

Deletes one or more versions of a recipe at a time.

Examples:

Request syntax with placeholder values


resp = client.batch_delete_recipe_version({
  name: "RecipeName", # required
  recipe_versions: ["RecipeVersion"], # required
})

Response structure


resp.name #=> String
resp.errors #=> Array
resp.errors[0].error_code #=> String
resp.errors[0].error_message #=> String
resp.errors[0].recipe_version #=> String

Options Hash (options):

:name (required, String) —
The name of the recipe to be modified.
:recipe_versions (required, Array<String>) —
An array of version identifiers to be deleted.

Returns:

(Types::BatchDeleteRecipeVersionResponse) —
Returns a response object which responds to the following methods:
- #name => String
- #errors => Array<Types::RecipeVersionErrorDetail>

See Also:

AWS API Documentation
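
Because #errors reports failures per recipe version, a caller may want to map each failed version to its error. The following is a minimal sketch, assuming a response shaped like the structure above. The helper name failed_recipe_versions is hypothetical, and Struct stand-ins replace a live client so that the sketch runs without the SDK.

```ruby
# Hypothetical helper (not part of the SDK): maps each recipe version that
# could not be deleted to a short "code: message" string.
def failed_recipe_versions(resp)
  Array(resp.errors).each_with_object({}) do |err, failures|
    failures[err.recipe_version] = "#{err.error_code}: #{err.error_message}"
  end
end

# Stand-ins for the response types, used only for this illustration.
ErrorDetailStub = Struct.new(:error_code, :error_message, :recipe_version)
BatchDeleteStub = Struct.new(:name, :errors)

stub = BatchDeleteStub.new(
  'RecipeName',
  [ErrorDetailStub.new('ErrorCode', 'ErrorMessage', 'RecipeVersion')]
)
failed_recipe_versions(stub).each do |version, reason|
  puts "version #{version} was not deleted (#{reason})"
end
```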

#create_dataset(options = {}) ⇒ `Types::CreateDatasetResponse`

Creates a new AWS Glue DataBrew dataset for this AWS account.

Examples:

Request syntax with placeholder values


resp = client.create_dataset({
  name: "DatasetName", # required
  format_options: {
    json: {
      multi_line: false,
    },
    excel: {
      sheet_names: ["SheetName"],
      sheet_indexes: [1],
    },
  },
  input: { # required
    s3_input_definition: {
      bucket: "Bucket", # required
      key: "Key",
    },
    data_catalog_input_definition: {
      catalog_id: "CatalogId",
      database_name: "DatabaseName", # required
      table_name: "TableName", # required
      temp_directory: {
        bucket: "Bucket", # required
        key: "Key",
      },
    },
  },
  tags: {
    "TagKey" => "TagValue",
  },
})

Response structure


resp.name #=> String

Options Hash (options):

:name (required, String) —
The name of the dataset to be created.
:format_options (Types::FormatOptions) —
Options that define how Microsoft Excel input is to be interpreted by DataBrew.
:input (required, Types::Input) —
Information on how AWS Glue DataBrew can find data, in either the AWS Glue Data Catalog or Amazon S3.
:tags (Hash<String,String>) —
Metadata tags to apply to this dataset.

Returns:

(Types::CreateDatasetResponse) —
Returns a response object which responds to the following methods:
- #name => String

See Also:

AWS API Documentation

#create_profile_job(options = {}) ⇒ `Types::CreateProfileJobResponse`

Creates a new job to profile an AWS Glue DataBrew dataset that exists in the current AWS account.

Examples:

Request syntax with placeholder values


resp = client.create_profile_job({
  dataset_name: "DatasetName", # required
  encryption_key_arn: "EncryptionKeyArn",
  encryption_mode: "SSE-KMS", # accepts SSE-KMS, SSE-S3
  name: "JobName", # required
  log_subscription: "ENABLE", # accepts ENABLE, DISABLE
  max_capacity: 1,
  max_retries: 1,
  output_location: { # required
    bucket: "Bucket", # required
    key: "Key",
  },
  role_arn: "Arn", # required
  tags: {
    "TagKey" => "TagValue",
  },
  timeout: 1,
})

Response structure


resp.name #=> String

Options Hash (options):

:dataset_name (required, String) —
The name of the dataset that this job is to act upon.
:encryption_key_arn (String) —
The Amazon Resource Name (ARN) of an encryption key that is used to protect the job.
:encryption_mode (String) —
The encryption mode for the job, which can be one of the following:
- SSE-KMS - Server-side encryption with AWS KMS-managed keys.
- SSE-S3 - Server-side encryption with keys managed by Amazon S3.
:name (required, String) —
The name of the job to be created.
:log_subscription (String) —
A value that enables or disables Amazon CloudWatch logging for the current AWS account. If logging is enabled, CloudWatch writes one log stream for each job run.
:max_capacity (Integer) —
The maximum number of nodes that DataBrew can use when the job processes data.
:max_retries (Integer) —
The maximum number of times to retry the job after a job run fails.
:output_location (required, Types::S3Location) —
An Amazon S3 location (bucket name and object key) where DataBrew can read input data, or write output from a job.
:role_arn (required, String) —
The Amazon Resource Name (ARN) of the AWS Identity and Access Management (IAM) role to be assumed for this request.
:tags (Hash<String,String>) —
Metadata tags to apply to this job.
:timeout (Integer) —
The job's timeout in minutes. A job that attempts to run longer than this timeout period ends with a status of TIMEOUT.

Returns:

(Types::CreateProfileJobResponse) —
Returns a response object which responds to the following methods:
- #name => String

See Also:

AWS API Documentation

#create_project(options = {}) ⇒ `Types::CreateProjectResponse`

Creates a new AWS Glue DataBrew project in the current AWS account.

Examples:

Request syntax with placeholder values


resp = client.create_project({
  dataset_name: "DatasetName", # required
  name: "ProjectName", # required
  recipe_name: "RecipeName", # required
  sample: {
    size: 1,
    type: "FIRST_N", # required, accepts FIRST_N, LAST_N, RANDOM
  },
  role_arn: "Arn", # required
  tags: {
    "TagKey" => "TagValue",
  },
})

Response structure


resp.name #=> String

Options Hash (options):

:dataset_name (required, String) —
The name of the dataset to associate this project with.
:name (required, String) —
A unique name for the new project.
:recipe_name (required, String) —
The name of an existing recipe to associate with the project.
:sample (Types::Sample) —
Represents the sample size and sampling type for AWS Glue DataBrew to use for interactive data analysis.
:role_arn (required, String) —
The Amazon Resource Name (ARN) of the AWS Identity and Access Management (IAM) role to be assumed for this request.
:tags (Hash<String,String>) —
Metadata tags to apply to this project.

Returns:

(Types::CreateProjectResponse) —
Returns a response object which responds to the following methods:
- #name => String

See Also:

AWS API Documentation

#create_recipe(options = {}) ⇒ `Types::CreateRecipeResponse`

Creates a new AWS Glue DataBrew recipe for the current AWS account.

Examples:

Request syntax with placeholder values


resp = client.create_recipe({
  description: "RecipeDescription",
  name: "RecipeName", # required
  steps: [ # required
    {
      action: { # required
        operation: "Operation", # required
        parameters: {
          "ParameterName" => "ParameterValue",
        },
      },
      condition_expressions: [
        {
          condition: "Condition", # required
          value: "ConditionValue",
          target_column: "TargetColumn", # required
        },
      ],
    },
  ],
  tags: {
    "TagKey" => "TagValue",
  },
})

Response structure


resp.name #=> String

Options Hash (options):

:description (String) —
A description for the recipe.
:name (required, String) —
A unique name for the recipe.
:steps (required, Array<Types::RecipeStep>) —
An array containing the steps to be performed by the recipe. Each recipe step consists of one recipe action and (optionally) an array of condition expressions.
:tags (Hash<String,String>) —
Metadata tags to apply to this recipe.

Returns:

(Types::CreateRecipeResponse) —
Returns a response object which responds to the following methods:
- #name => String

See Also:

AWS API Documentation
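
Since each recipe step consists of one action and an optional array of condition expressions, the :steps array can be assembled with a small builder. This is a sketch under assumptions: the helper name recipe_step is hypothetical, and the operation, parameter, and condition values are the same placeholders used in the request syntax above, not real DataBrew operation names.

```ruby
# Hypothetical builder (not part of the SDK) for one entry of :steps.
# Optional parts are left out of the hash when they are empty.
def recipe_step(operation, parameters = {}, conditions = [])
  step = { action: { operation: operation } }
  step[:action][:parameters] = parameters unless parameters.empty?
  step[:condition_expressions] = conditions unless conditions.empty?
  step
end

steps = [
  recipe_step('Operation', { 'ParameterName' => 'ParameterValue' }),
  recipe_step('Operation', {}, [{ condition: 'Condition', target_column: 'TargetColumn' }])
]
request = { name: 'RecipeName', steps: steps }
puts request.inspect

# With the SDK loaded, the hash could be passed on unchanged:
#   resp = client.create_recipe(request)
```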

#create_recipe_job(options = {}) ⇒ `Types::CreateRecipeJobResponse`

Creates a new job for an existing AWS Glue DataBrew recipe in the current AWS account. You can create a standalone job using either a project, or a combination of a recipe and a dataset.

Examples:

Request syntax with placeholder values


resp = client.create_recipe_job({
  dataset_name: "DatasetName",
  encryption_key_arn: "EncryptionKeyArn",
  encryption_mode: "SSE-KMS", # accepts SSE-KMS, SSE-S3
  name: "JobName", # required
  log_subscription: "ENABLE", # accepts ENABLE, DISABLE
  max_capacity: 1,
  max_retries: 1,
  outputs: [ # required
    {
      compression_format: "GZIP", # accepts GZIP, LZ4, SNAPPY, BZIP2, DEFLATE, LZO, BROTLI, ZSTD, ZLIB
      format: "CSV", # accepts CSV, JSON, PARQUET, GLUEPARQUET, AVRO, ORC, XML
      partition_columns: ["ColumnName"],
      location: { # required
        bucket: "Bucket", # required
        key: "Key",
      },
      overwrite: false,
    },
  ],
  project_name: "ProjectName",
  recipe_reference: {
    name: "RecipeName", # required
    recipe_version: "RecipeVersion",
  },
  role_arn: "Arn", # required
  tags: {
    "TagKey" => "TagValue",
  },
  timeout: 1,
})

Response structure


resp.name #=> String

Options Hash (options):

:dataset_name (String) —
The name of the dataset that this job processes.
:encryption_key_arn (String) —
The Amazon Resource Name (ARN) of an encryption key that is used to protect the job.
:encryption_mode (String) —
The encryption mode for the job, which can be one of the following:
- SSE-KMS - Server-side encryption with AWS KMS-managed keys.
- SSE-S3 - Server-side encryption with keys managed by Amazon S3.
:name (required, String) —
A unique name for the job.
:log_subscription (String) —
A value that enables or disables Amazon CloudWatch logging for the current AWS account. If logging is enabled, CloudWatch writes one log stream for each job run.
:max_capacity (Integer) —
The maximum number of nodes that DataBrew can consume when the job processes data.
:max_retries (Integer) —
The maximum number of times to retry the job after a job run fails.
:outputs (required, Array<Types::Output>) —
One or more artifacts that represent the output from running the job.
:project_name (String) —
Either the name of an existing project, or a combination of a recipe and a dataset to associate with the recipe.
:recipe_reference (Types::RecipeReference) —
Represents all of the attributes of an AWS Glue DataBrew recipe.
:role_arn (required, String) —
The Amazon Resource Name (ARN) of the AWS Identity and Access Management (IAM) role to be assumed for this request.
:tags (Hash<String,String>) —
Metadata tags to apply to this job.
:timeout (Integer) —
The job's timeout in minutes. A job that attempts to run longer than this timeout period ends with a status of TIMEOUT.

Returns:

(Types::CreateRecipeJobResponse) —
Returns a response object which responds to the following methods:
- #name => String

See Also:

AWS API Documentation
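
A job is created from either a project, or a recipe combined with a dataset, so it can help to build that part of the request in one place. The sketch below is an assumption-laden illustration: the helper name recipe_job_source is hypothetical, and treating the two forms as mutually exclusive is this example's own choice rather than documented service behavior.

```ruby
# Hypothetical helper (not part of the SDK): returns the fragment of the
# create_recipe_job request that identifies what the job runs against.
def recipe_job_source(project_name: nil, dataset_name: nil, recipe_name: nil, recipe_version: nil)
  if project_name
    # Assumption of this sketch: the two forms are not mixed.
    if dataset_name || recipe_name
      raise ArgumentError, 'pass project_name alone, or dataset_name with recipe_name'
    end
    return { project_name: project_name }
  end

  unless dataset_name && recipe_name
    raise ArgumentError, 'provide project_name, or both dataset_name and recipe_name'
  end

  reference = { name: recipe_name }
  reference[:recipe_version] = recipe_version if recipe_version
  { dataset_name: dataset_name, recipe_reference: reference }
end

request = {
  name: 'JobName',
  role_arn: 'Arn',
  outputs: [{ location: { bucket: 'Bucket' } }]
}.merge(recipe_job_source(dataset_name: 'DatasetName', recipe_name: 'RecipeName'))
puts request.inspect

# With the SDK loaded: resp = client.create_recipe_job(request)
```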

#create_schedule(options = {}) ⇒ `Types::CreateScheduleResponse`

Creates a new schedule for one or more AWS Glue DataBrew jobs. Jobs can be run at a specific date and time, or at regular intervals.

Examples:

Request syntax with placeholder values


resp = client.create_schedule({
  job_names: ["JobName"],
  cron_expression: "CronExpression", # required
  tags: {
    "TagKey" => "TagValue",
  },
  name: "ScheduleName", # required
})

Response structure


resp.name #=> String

Options Hash (options):

:job_names (Array<String>) —
The name or names of one or more jobs to be run.
:cron_expression (required, String) —
The date or dates and time or times, in cron format, when the jobs are to be run.
:tags (Hash<String,String>) —
Metadata tags to apply to this schedule.
:name (required, String) —
A unique name for the schedule.

Returns:

(Types::CreateScheduleResponse) —
Returns a response object which responds to the following methods:
- #name => String

See Also:

AWS API Documentation

#delete_dataset(options = {}) ⇒ `Types::DeleteDatasetResponse`

Deletes a dataset from AWS Glue DataBrew.

Examples:

Request syntax with placeholder values


resp = client.delete_dataset({
  name: "DatasetName", # required
})

Response structure


resp.name #=> String

Options Hash (options):

:name (required, String) —
The name of the dataset to be deleted.

Returns:

(Types::DeleteDatasetResponse) —
Returns a response object which responds to the following methods:
- #name => String

See Also:

AWS API Documentation

#delete_job(options = {}) ⇒ `Types::DeleteJobResponse`

Deletes the specified AWS Glue DataBrew job from the current AWS account. The job can be for a recipe or for a profile.

Examples:

Request syntax with placeholder values


resp = client.delete_job({
  name: "JobName", # required
})

Response structure


resp.name #=> String

Options Hash (options):

:name (required, String) —
The name of the job to be deleted.

Returns:

(Types::DeleteJobResponse) —
Returns a response object which responds to the following methods:
- #name => String

See Also:

AWS API Documentation

#delete_project(options = {}) ⇒ `Types::DeleteProjectResponse`

Deletes an existing AWS Glue DataBrew project from the current AWS account.

Examples:

Request syntax with placeholder values


resp = client.delete_project({
  name: "ProjectName", # required
})

Response structure


resp.name #=> String

Options Hash (options):

:name (required, String) —
The name of the project to be deleted.

Returns:

(Types::DeleteProjectResponse) —
Returns a response object which responds to the following methods:
- #name => String

See Also:

AWS API Documentation

#delete_recipe_version(options = {}) ⇒ `Types::DeleteRecipeVersionResponse`

Deletes a single version of an AWS Glue DataBrew recipe.

Examples:

Request syntax with placeholder values


resp = client.delete_recipe_version({
  name: "RecipeName", # required
  recipe_version: "RecipeVersion", # required
})

Response structure


resp.name #=> String
resp.recipe_version #=> String

Options Hash (options):

:name (required, String) —
The name of the recipe to be deleted.
:recipe_version (required, String) —
The version of the recipe to be deleted.

Returns:

(Types::DeleteRecipeVersionResponse) —
Returns a response object which responds to the following methods:
- #name => String
- #recipe_version => String

See Also:

AWS API Documentation

#delete_schedule(options = {}) ⇒ `Types::DeleteScheduleResponse`

Deletes the specified AWS Glue DataBrew schedule from the current AWS account.

Examples:

Request syntax with placeholder values


resp = client.delete_schedule({
  name: "ScheduleName", # required
})

Response structure


resp.name #=> String

Options Hash (options):

:name (required, String) —
The name of the schedule to be deleted.

Returns:

(Types::DeleteScheduleResponse) —
Returns a response object which responds to the following methods:
- #name => String

See Also:

AWS API Documentation

#describe_dataset(options = {}) ⇒ `Types::DescribeDatasetResponse`

Returns the definition of a specific AWS Glue DataBrew dataset that is in the current AWS account.

Examples:

Request syntax with placeholder values


resp = client.describe_dataset({
  name: "DatasetName", # required
})

Response structure


resp.created_by #=> String
resp.create_date #=> Time
resp.name #=> String
resp.format_options.json.multi_line #=> true/false
resp.format_options.excel.sheet_names #=> Array
resp.format_options.excel.sheet_names[0] #=> String
resp.format_options.excel.sheet_indexes #=> Array
resp.format_options.excel.sheet_indexes[0] #=> Integer
resp.input.s3_input_definition.bucket #=> String
resp.input.s3_input_definition.key #=> String
resp.input.data_catalog_input_definition.catalog_id #=> String
resp.input.data_catalog_input_definition.database_name #=> String
resp.input.data_catalog_input_definition.table_name #=> String
resp.input.data_catalog_input_definition.temp_directory.bucket #=> String
resp.input.data_catalog_input_definition.temp_directory.key #=> String
resp.last_modified_date #=> Time
resp.last_modified_by #=> String
resp.source #=> String, one of "S3", "DATA-CATALOG"
resp.tags #=> Hash
resp.tags["TagKey"] #=> String
resp.resource_arn #=> String

Options Hash (options):

:name (required, String) —
The name of the dataset to be described.

Returns:

(Types::DescribeDatasetResponse) —
Returns a response object which responds to the following methods:
- #created_by => String
- #create_date => Time
- #name => String
- #format_options => Types::FormatOptions
- #input => Types::Input
- #last_modified_date => Time
- #last_modified_by => String
- #source => String
- #tags => Hash<String,String>
- #resource_arn => String

See Also:

AWS API Documentation
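
The #source value indicates which branch of #input is populated, so a caller can turn the response into a single readable location. The following is a minimal sketch, assuming the response structure above. The helper name dataset_location and the Struct stand-ins are hypothetical and exist only so that the example runs without the SDK.

```ruby
# Hypothetical helper (not part of the SDK): describes where a dataset's
# data lives, based on the "S3" / "DATA-CATALOG" source value.
def dataset_location(resp)
  case resp.source
  when 'S3'
    s3 = resp.input.s3_input_definition
    # :key is optional, so it is dropped from the path when nil.
    "s3://#{[s3.bucket, s3.key].compact.join('/')}"
  when 'DATA-CATALOG'
    catalog = resp.input.data_catalog_input_definition
    "#{catalog.database_name}.#{catalog.table_name}"
  else
    raise ArgumentError, "unexpected source: #{resp.source.inspect}"
  end
end

# Stand-ins for the response types, used only for this illustration.
S3InputStub = Struct.new(:bucket, :key)
CatalogInputStub = Struct.new(:database_name, :table_name)
InputStub = Struct.new(:s3_input_definition, :data_catalog_input_definition)
DatasetStub = Struct.new(:source, :input)

stub = DatasetStub.new('S3', InputStub.new(S3InputStub.new('Bucket', 'Key'), nil))
puts dataset_location(stub)
```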

#describe_job(options = {}) ⇒ `Types::DescribeJobResponse`

Returns the definition of a specific AWS Glue DataBrew job that is in the current AWS account.

Examples:

Request syntax with placeholder values


resp = client.describe_job({
  name: "JobName", # required
})

Response structure


resp.create_date #=> Time
resp.created_by #=> String
resp.dataset_name #=> String
resp.encryption_key_arn #=> String
resp.encryption_mode #=> String, one of "SSE-KMS", "SSE-S3"
resp.name #=> String
resp.type #=> String, one of "PROFILE", "RECIPE"
resp.last_modified_by #=> String
resp.last_modified_date #=> Time
resp.log_subscription #=> String, one of "ENABLE", "DISABLE"
resp.max_capacity #=> Integer
resp.max_retries #=> Integer
resp.outputs #=> Array
resp.outputs[0].compression_format #=> String, one of "GZIP", "LZ4", "SNAPPY", "BZIP2", "DEFLATE", "LZO", "BROTLI", "ZSTD", "ZLIB"
resp.outputs[0].format #=> String, one of "CSV", "JSON", "PARQUET", "GLUEPARQUET", "AVRO", "ORC", "XML"
resp.outputs[0].partition_columns #=> Array
resp.outputs[0].partition_columns[0] #=> String
resp.outputs[0].location.bucket #=> String
resp.outputs[0].location.key #=> String
resp.outputs[0].overwrite #=> true/false
resp.project_name #=> String
resp.recipe_reference.name #=> String
resp.recipe_reference.recipe_version #=> String
resp.resource_arn #=> String
resp.role_arn #=> String
resp.tags #=> Hash
resp.tags["TagKey"] #=> String
resp.timeout #=> Integer

Options Hash (options):

:name (required, String) —
The name of the job to be described.

Returns:

(Types::DescribeJobResponse) —
Returns a response object which responds to the following methods:
- #create_date => Time
- #created_by => String
- #dataset_name => String
- #encryption_key_arn => String
- #encryption_mode => String
- #name => String
- #type => String
- #last_modified_by => String
- #last_modified_date => Time
- #log_subscription => String
- #max_capacity => Integer
- #max_retries => Integer
- #outputs => Array<Types::Output>
- #project_name => String
- #recipe_reference => Types::RecipeReference
- #resource_arn => String
- #role_arn => String
- #tags => Hash<String,String>
- #timeout => Integer

See Also:

AWS API Documentation
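
The #outputs array can be flattened into a short report of where a job writes its results. This is a sketch only: the helper name job_output_summary is hypothetical, and the Struct stand-ins mirror just the fields of the response structure above that the example reads.

```ruby
# Hypothetical helper (not part of the SDK): one hash per job output,
# holding the destination URI, the format, and the overwrite flag.
def job_output_summary(resp)
  Array(resp.outputs).map do |output|
    path = [output.location.bucket, output.location.key].compact.join('/')
    { uri: "s3://#{path}", format: output.format, overwrite: output.overwrite }
  end
end

# Stand-ins for the response types, used only for this illustration.
LocationStub = Struct.new(:bucket, :key)
OutputStub = Struct.new(:format, :location, :overwrite)
JobStub = Struct.new(:name, :outputs)

stub = JobStub.new('JobName', [OutputStub.new('CSV', LocationStub.new('Bucket', 'Key'), false)])
job_output_summary(stub).each do |entry|
  puts "#{entry[:uri]} (format: #{entry[:format]}, overwrite: #{entry[:overwrite]})"
end
```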

#describe_project(options = {}) ⇒ `Types::DescribeProjectResponse`

Returns the definition of a specific AWS Glue DataBrew project that is in the current AWS account.

Examples:

Request syntax with placeholder values


resp = client.describe_project({
  name: "ProjectName", # required
})

Response structure


resp.create_date #=> Time
resp.created_by #=> String
resp.dataset_name #=> String
resp.last_modified_date #=> Time
resp.last_modified_by #=> String
resp.name #=> String
resp.recipe_name #=> String
resp.resource_arn #=> String
resp.sample.size #=> Integer
resp.sample.type #=> String, one of "FIRST_N", "LAST_N", "RANDOM"
resp.role_arn #=> String
resp.tags #=> Hash
resp.tags["TagKey"] #=> String
resp.session_status #=> String, one of "ASSIGNED", "FAILED", "INITIALIZING", "PROVISIONING", "READY", "RECYCLING", "ROTATING", "TERMINATED", "TERMINATING", "UPDATING"
resp.opened_by #=> String
resp.open_date #=> Time

Options Hash (options):

:name (required, String) —
The name of the project to be described.

Returns:

(Types::DescribeProjectResponse) —
Returns a response object which responds to the following methods:
- #create_date => Time
- #created_by => String
- #dataset_name => String
- #last_modified_date => Time
- #last_modified_by => String
- #name => String
- #recipe_name => String
- #resource_arn => String
- #sample => Types::Sample
- #role_arn => String
- #tags => Hash<String,String>
- #session_status => String
- #opened_by => String
- #open_date => Time

See Also:

AWS API Documentation

#describe_recipe(options = {}) ⇒ `Types::DescribeRecipeResponse`

Returns the definition of a specific AWS Glue DataBrew recipe that is in the current AWS account.

Examples:

Request syntax with placeholder values


resp = client.describe_recipe({
  name: "RecipeName", # required
  recipe_version: "RecipeVersion",
})

Response structure


resp.created_by #=> String
resp.create_date #=> Time
resp.last_modified_by #=> String
resp.last_modified_date #=> Time
resp.project_name #=> String
resp.published_by #=> String
resp.published_date #=> Time
resp.description #=> String
resp.name #=> String
resp.steps #=> Array
resp.steps[0].action.operation #=> String
resp.steps[0].action.parameters #=> Hash
resp.steps[0].action.parameters["ParameterName"] #=> String
resp.steps[0].condition_expressions #=> Array
resp.steps[0].condition_expressions[0].condition #=> String
resp.steps[0].condition_expressions[0].value #=> String
resp.steps[0].condition_expressions[0].target_column #=> String
resp.tags #=> Hash
resp.tags["TagKey"] #=> String
resp.resource_arn #=> String
resp.recipe_version #=> String

Options Hash (options):

:name (required, String) —
The name of the recipe to be described.
:recipe_version (String) —
The recipe version identifier. If this parameter isn't specified, then the latest published version is returned.

Returns:

(Types::DescribeRecipeResponse) —
Returns a response object which responds to the following methods:
- #created_by => String
- #create_date => Time
- #last_modified_by => String
- #last_modified_date => Time
- #project_name => String
- #published_by => String
- #published_date => Time
- #description => String
- #name => String
- #steps => Array<Types::RecipeStep>
- #tags => Hash<String,String>
- #resource_arn => String
- #recipe_version => String

See Also:

AWS API Documentation
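
As a rough illustration of walking the response structure above, the sketch below turns the `steps` of a describe_recipe response into one line per step. The helper name `summarize_steps` is hypothetical and not part of the SDK; it only relies on the `steps[n].action.operation` and `steps[n].action.parameters` accessors listed above.

```ruby
# Hypothetical helper (not part of the SDK): returns one human-readable
# line per recipe step, e.g. "1. OPERATION(param=value)".
# `recipe` is assumed to be a describe_recipe response, or any object that
# exposes #steps with the same nested accessors.
def summarize_steps(recipe)
  recipe.steps.each_with_index.map do |step, index|
    # parameters may be nil when a step takes no parameters
    params = (step.action.parameters || {}).map { |k, v| "#{k}=#{v}" }.join(', ')
    "#{index + 1}. #{step.action.operation}(#{params})"
  end
end

# Usage (requires the aws-sdk gem and valid credentials):
#   resp = client.describe_recipe(name: "RecipeName")
#   puts summarize_steps(resp)
```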

#describe_schedule(options = {}) ⇒ `Types::DescribeScheduleResponse`

Returns the definition of a specific AWS Glue DataBrew schedule that is in the current AWS account.

Examples:

Request syntax with placeholder values


resp = client.describe_schedule({
  name: "ScheduleName", # required
})

Response structure


resp.create_date #=> Time
resp.created_by #=> String
resp.job_names #=> Array
resp.job_names[0] #=> String
resp.last_modified_by #=> String
resp.last_modified_date #=> Time
resp.resource_arn #=> String
resp.cron_expression #=> String
resp.tags #=> Hash
resp.tags["TagKey"] #=> String
resp.name #=> String

Options Hash (options):

:name (required, String) —
The name of the schedule to be described.

Returns:

(Types::DescribeScheduleResponse) —
Returns a response object which responds to the following methods:
- #create_date => Time
- #created_by => String
- #job_names => Array<String>
- #last_modified_by => String
- #last_modified_date => Time
- #resource_arn => String
- #cron_expression => String
- #tags => Hash<String,String>
- #name => String

See Also:

AWS API Documentation

#list_datasets(options = {}) ⇒ `Types::ListDatasetsResponse`

Lists all of the AWS Glue DataBrew datasets for the current AWS account.

Examples:

Request syntax with placeholder values


resp = client.list_datasets({
  max_results: 1,
  next_token: "NextToken",
})

Response structure


resp.datasets #=> Array
resp.datasets[0].account_id #=> String
resp.datasets[0].created_by #=> String
resp.datasets[0].create_date #=> Time
resp.datasets[0].name #=> String
resp.datasets[0].format_options.json.multi_line #=> true/false
resp.datasets[0].format_options.excel.sheet_names #=> Array
resp.datasets[0].format_options.excel.sheet_names[0] #=> String
resp.datasets[0].format_options.excel.sheet_indexes #=> Array
resp.datasets[0].format_options.excel.sheet_indexes[0] #=> Integer
resp.datasets[0].input.s3_input_definition.bucket #=> String
resp.datasets[0].input.s3_input_definition.key #=> String
resp.datasets[0].input.data_catalog_input_definition.catalog_id #=> String
resp.datasets[0].input.data_catalog_input_definition.database_name #=> String
resp.datasets[0].input.data_catalog_input_definition.table_name #=> String
resp.datasets[0].input.data_catalog_input_definition.temp_directory.bucket #=> String
resp.datasets[0].input.data_catalog_input_definition.temp_directory.key #=> String
resp.datasets[0].last_modified_date #=> Time
resp.datasets[0].last_modified_by #=> String
resp.datasets[0].source #=> String, one of "S3", "DATA-CATALOG"
resp.datasets[0].tags #=> Hash
resp.datasets[0].tags["TagKey"] #=> String
resp.datasets[0].resource_arn #=> String
resp.next_token #=> String

Options Hash (options):

:max_results (Integer) —
The maximum number of results to return in this request.
:next_token (String) —
A token generated by DataBrew that specifies where to continue pagination if a previous request was truncated. To get the next set of pages, pass in the NextToken value from the response object of the previous page call.

Returns:

(Types::ListDatasetsResponse) —
Returns a response object which responds to the following methods:
- #datasets => Array<Types::Dataset>
- #next_token => String

See Also:

AWS API Documentation
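
The `next_token` handling described above can be sketched as a simple loop. This is a minimal sketch, not the SDK's own pagination support: the helper name `all_datasets` and its `page_size` argument are assumptions made for illustration.

```ruby
# Hypothetical helper: collects every dataset by passing each response's
# next_token back into the following request until no token is returned.
# `client` is assumed to be an Aws::GlueDataBrew::Client, or any object
# that responds to #list_datasets in the same way.
def all_datasets(client, page_size: 50)
  datasets = []
  token = nil
  loop do
    params = { max_results: page_size }
    params[:next_token] = token if token
    resp = client.list_datasets(params)
    datasets.concat(resp.datasets)
    token = resp.next_token
    break if token.nil? || token.empty?
  end
  datasets
end

# Usage (requires the aws-sdk gem and valid credentials):
#   all_datasets(client).each { |d| puts "#{d.name} (#{d.source})" }
```

The same loop shape should apply to the other list operations in this class that accept `:next_token`, with the result accessor swapped accordingly.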

#list_job_runs(options = {}) ⇒ `Types::ListJobRunsResponse`

Lists all of the previous runs of a particular AWS Glue DataBrew job in the current AWS account.

Examples:

Request syntax with placeholder values


resp = client.list_job_runs({
  name: "JobName", # required
  max_results: 1,
  next_token: "NextToken",
})

Response structure


resp.job_runs #=> Array
resp.job_runs[0].attempt #=> Integer
resp.job_runs[0].completed_on #=> Time
resp.job_runs[0].dataset_name #=> String
resp.job_runs[0].error_message #=> String
resp.job_runs[0].execution_time #=> Integer
resp.job_runs[0].job_name #=> String
resp.job_runs[0].run_id #=> String
resp.job_runs[0].state #=> String, one of "STARTING", "RUNNING", "STOPPING", "STOPPED", "SUCCEEDED", "FAILED", "TIMEOUT"
resp.job_runs[0].log_subscription #=> String, one of "ENABLE", "DISABLE"
resp.job_runs[0].log_group_name #=> String
resp.job_runs[0].outputs #=> Array
resp.job_runs[0].outputs[0].compression_format #=> String, one of "GZIP", "LZ4", "SNAPPY", "BZIP2", "DEFLATE", "LZO", "BROTLI", "ZSTD", "ZLIB"
resp.job_runs[0].outputs[0].format #=> String, one of "CSV", "JSON", "PARQUET", "GLUEPARQUET", "AVRO", "ORC", "XML"
resp.job_runs[0].outputs[0].partition_columns #=> Array
resp.job_runs[0].outputs[0].partition_columns[0] #=> String
resp.job_runs[0].outputs[0].location.bucket #=> String
resp.job_runs[0].outputs[0].location.key #=> String
resp.job_runs[0].outputs[0].overwrite #=> true/false
resp.job_runs[0].recipe_reference.name #=> String
resp.job_runs[0].recipe_reference.recipe_version #=> String
resp.job_runs[0].started_by #=> String
resp.job_runs[0].started_on #=> Time
resp.next_token #=> String

Options Hash (options):

:name (required, String) —
The name of the job.
:max_results (Integer) —
The maximum number of results to return in this request.
:next_token (String) —
A token generated by AWS Glue DataBrew that specifies where to continue pagination if a previous request was truncated. To get the next set of pages, pass in the NextToken value from the response object of the previous page call.

Returns:

(Types::ListJobRunsResponse) —
Returns a response object which responds to the following methods:
- #job_runs => Array<Types::JobRun>
- #next_token => String

See Also:

AWS API Documentation

#list_jobs(options = {}) ⇒ `Types::ListJobsResponse`

Lists the AWS Glue DataBrew jobs in the current AWS account.

Examples:

Request syntax with placeholder values


resp = client.list_jobs({
  dataset_name: "DatasetName",
  max_results: 1,
  next_token: "NextToken",
  project_name: "ProjectName",
})

Response structure


resp.jobs #=> Array
resp.jobs[0].account_id #=> String
resp.jobs[0].created_by #=> String
resp.jobs[0].create_date #=> Time
resp.jobs[0].dataset_name #=> String
resp.jobs[0].encryption_key_arn #=> String
resp.jobs[0].encryption_mode #=> String, one of "SSE-KMS", "SSE-S3"
resp.jobs[0].name #=> String
resp.jobs[0].type #=> String, one of "PROFILE", "RECIPE"
resp.jobs[0].last_modified_by #=> String
resp.jobs[0].last_modified_date #=> Time
resp.jobs[0].log_subscription #=> String, one of "ENABLE", "DISABLE"
resp.jobs[0].max_capacity #=> Integer
resp.jobs[0].max_retries #=> Integer
resp.jobs[0].outputs #=> Array
resp.jobs[0].outputs[0].compression_format #=> String, one of "GZIP", "LZ4", "SNAPPY", "BZIP2", "DEFLATE", "LZO", "BROTLI", "ZSTD", "ZLIB"
resp.jobs[0].outputs[0].format #=> String, one of "CSV", "JSON", "PARQUET", "GLUEPARQUET", "AVRO", "ORC", "XML"
resp.jobs[0].outputs[0].partition_columns #=> Array
resp.jobs[0].outputs[0].partition_columns[0] #=> String
resp.jobs[0].outputs[0].location.bucket #=> String
resp.jobs[0].outputs[0].location.key #=> String
resp.jobs[0].outputs[0].overwrite #=> true/false
resp.jobs[0].project_name #=> String
resp.jobs[0].recipe_reference.name #=> String
resp.jobs[0].recipe_reference.recipe_version #=> String
resp.jobs[0].resource_arn #=> String
resp.jobs[0].role_arn #=> String
resp.jobs[0].timeout #=> Integer
resp.jobs[0].tags #=> Hash
resp.jobs[0].tags["TagKey"] #=> String
resp.next_token #=> String

Options Hash (options):

:dataset_name (String) —
The name of a dataset. Using this parameter indicates to return only those jobs that act on the specified dataset.
:max_results (Integer) —
The maximum number of results to return in this request.
:next_token (String) —
A token generated by DataBrew that specifies where to continue pagination if a previous request was truncated. To get the next set of pages, pass in the NextToken value from the response object of the previous page call.
:project_name (String) —
The name of a project. Using this parameter indicates to return only those jobs that are associated with the specified project.

Returns:

(Types::ListJobsResponse) —
Returns a response object which responds to the following methods:
- #jobs => Array<Types::Job>
- #next_token => String

See Also:

AWS API Documentation

#list_projects(options = {}) ⇒ `Types::ListProjectsResponse`

Lists all of the DataBrew projects in the current AWS account.

Examples:

Request syntax with placeholder values


resp = client.list_projects({
  next_token: "NextToken",
  max_results: 1,
})

Response structure


resp.projects #=> Array
resp.projects[0].account_id #=> String
resp.projects[0].create_date #=> Time
resp.projects[0].created_by #=> String
resp.projects[0].dataset_name #=> String
resp.projects[0].last_modified_date #=> Time
resp.projects[0].last_modified_by #=> String
resp.projects[0].name #=> String
resp.projects[0].recipe_name #=> String
resp.projects[0].resource_arn #=> String
resp.projects[0].sample.size #=> Integer
resp.projects[0].sample.type #=> String, one of "FIRST_N", "LAST_N", "RANDOM"
resp.projects[0].tags #=> Hash
resp.projects[0].tags["TagKey"] #=> String
resp.projects[0].role_arn #=> String
resp.projects[0].opened_by #=> String
resp.projects[0].open_date #=> Time
resp.next_token #=> String

Options Hash (options):

:next_token (String) —
A pagination token that can be used in a subsequent request.
:max_results (Integer) —
The maximum number of results to return in this request.

Returns:

(Types::ListProjectsResponse) —
Returns a response object which responds to the following methods:
- #projects => Array<Types::Project>
- #next_token => String

See Also:

AWS API Documentation

#list_recipe_versions(options = {}) ⇒ `Types::ListRecipeVersionsResponse`

Lists all of the versions of a particular AWS Glue DataBrew recipe in the current AWS account.

Examples:

Request syntax with placeholder values


resp = client.list_recipe_versions({
  max_results: 1,
  next_token: "NextToken",
  name: "RecipeName", # required
})

Response structure


resp.next_token #=> String
resp.recipes #=> Array
resp.recipes[0].created_by #=> String
resp.recipes[0].create_date #=> Time
resp.recipes[0].last_modified_by #=> String
resp.recipes[0].last_modified_date #=> Time
resp.recipes[0].project_name #=> String
resp.recipes[0].published_by #=> String
resp.recipes[0].published_date #=> Time
resp.recipes[0].description #=> String
resp.recipes[0].name #=> String
resp.recipes[0].resource_arn #=> String
resp.recipes[0].steps #=> Array
resp.recipes[0].steps[0].action.operation #=> String
resp.recipes[0].steps[0].action.parameters #=> Hash
resp.recipes[0].steps[0].action.parameters["ParameterName"] #=> String
resp.recipes[0].steps[0].condition_expressions #=> Array
resp.recipes[0].steps[0].condition_expressions[0].condition #=> String
resp.recipes[0].steps[0].condition_expressions[0].value #=> String
resp.recipes[0].steps[0].condition_expressions[0].target_column #=> String
resp.recipes[0].tags #=> Hash
resp.recipes[0].tags["TagKey"] #=> String
resp.recipes[0].recipe_version #=> String

Options Hash (options):

:max_results (Integer) —
The maximum number of results to return in this request.
:next_token (String) —
A pagination token that can be used in a subsequent request.
:name (required, String) —
The name of the recipe for which to return version information.

Returns:

(Types::ListRecipeVersionsResponse) —
Returns a response object which responds to the following methods:
- #next_token => String
- #recipes => Array<Types::Recipe>

See Also:

AWS API Documentation

#list_recipes(options = {}) ⇒ `Types::ListRecipesResponse`

Lists all of the AWS Glue DataBrew recipes in the current AWS account.

Examples:

Request syntax with placeholder values


resp = client.list_recipes({
  max_results: 1,
  next_token: "NextToken",
  recipe_version: "RecipeVersion",
})

Response structure


resp.recipes #=> Array
resp.recipes[0].created_by #=> String
resp.recipes[0].create_date #=> Time
resp.recipes[0].last_modified_by #=> String
resp.recipes[0].last_modified_date #=> Time
resp.recipes[0].project_name #=> String
resp.recipes[0].published_by #=> String
resp.recipes[0].published_date #=> Time
resp.recipes[0].description #=> String
resp.recipes[0].name #=> String
resp.recipes[0].resource_arn #=> String
resp.recipes[0].steps #=> Array
resp.recipes[0].steps[0].action.operation #=> String
resp.recipes[0].steps[0].action.parameters #=> Hash
resp.recipes[0].steps[0].action.parameters["ParameterName"] #=> String
resp.recipes[0].steps[0].condition_expressions #=> Array
resp.recipes[0].steps[0].condition_expressions[0].condition #=> String
resp.recipes[0].steps[0].condition_expressions[0].value #=> String
resp.recipes[0].steps[0].condition_expressions[0].target_column #=> String
resp.recipes[0].tags #=> Hash
resp.recipes[0].tags["TagKey"] #=> String
resp.recipes[0].recipe_version #=> String
resp.next_token #=> String

Options Hash (options):

:max_results (Integer) —
The maximum number of results to return in this request.
:next_token (String) —
A pagination token that can be used in a subsequent request.
:recipe_version (String) —
A version identifier. Using this parameter indicates to return only those recipes that have this version identifier.

Returns:

(Types::ListRecipesResponse) —
Returns a response object which responds to the following methods:
- #recipes => Array<Types::Recipe>
- #next_token => String

See Also:

AWS API Documentation

#list_schedules(options = {}) ⇒ `Types::ListSchedulesResponse`

Lists the AWS Glue DataBrew schedules in the current AWS account.

Examples:

Request syntax with placeholder values


resp = client.list_schedules({
  job_name: "JobName",
  max_results: 1,
  next_token: "NextToken",
})

Response structure


resp.schedules #=> Array
resp.schedules[0].account_id #=> String
resp.schedules[0].created_by #=> String
resp.schedules[0].create_date #=> Time
resp.schedules[0].job_names #=> Array
resp.schedules[0].job_names[0] #=> String
resp.schedules[0].last_modified_by #=> String
resp.schedules[0].last_modified_date #=> Time
resp.schedules[0].resource_arn #=> String
resp.schedules[0].cron_expression #=> String
resp.schedules[0].tags #=> Hash
resp.schedules[0].tags["TagKey"] #=> String
resp.schedules[0].name #=> String
resp.next_token #=> String

Options Hash (options):

:job_name (String) —
The name of the job that these schedules apply to.
:max_results (Integer) —
The maximum number of results to return in this request.
:next_token (String) —
A pagination token that can be used in a subsequent request.

Returns:

(Types::ListSchedulesResponse) —
Returns a response object which responds to the following methods:
- #schedules => Array<Types::Schedule>
- #next_token => String

See Also:

AWS API Documentation

#list_tags_for_resource(options = {}) ⇒ `Types::ListTagsForResourceResponse`

Lists all the tags for an AWS Glue DataBrew resource.

Examples:

Request syntax with placeholder values


resp = client.list_tags_for_resource({
  resource_arn: "Arn", # required
})

Response structure


resp.tags #=> Hash
resp.tags["TagKey"] #=> String

Options Hash (options):

:resource_arn (required, String) —
The Amazon Resource Name (ARN) string that uniquely identifies the DataBrew resource.

Returns:

(Types::ListTagsForResourceResponse) —
Returns a response object which responds to the following methods:
- #tags => Hash<String,String>

See Also:

AWS API Documentation

#publish_recipe(options = {}) ⇒ `Types::PublishRecipeResponse`

Publishes a new major version of an AWS Glue DataBrew recipe that exists in the current AWS account.

Examples:

Request syntax with placeholder values


resp = client.publish_recipe({
  description: "RecipeDescription",
  name: "RecipeName", # required
})

Response structure


resp.name #=> String

Options Hash (options):

:description (String) —
A description of the recipe to be published, for this version of the recipe.
:name (required, String) —
The name of the recipe to be published.

Returns:

(Types::PublishRecipeResponse) —
Returns a response object which responds to the following methods:
- #name => String

See Also:

AWS API Documentation

#send_project_session_action(options = {}) ⇒ `Types::SendProjectSessionActionResponse`

Performs a recipe step within an interactive AWS Glue DataBrew session that's currently open.

Examples:

Request syntax with placeholder values


resp = client.send_project_session_action({
  preview: false,
  name: "ProjectName", # required
  recipe_step: {
    action: { # required
      operation: "Operation", # required
      parameters: {
        "ParameterName" => "ParameterValue",
      },
    },
    condition_expressions: [
      {
        condition: "Condition", # required
        value: "ConditionValue",
        target_column: "TargetColumn", # required
      },
    ],
  },
  step_index: 1,
  client_session_id: "ClientSessionId",
  view_frame: {
    start_column_index: 1, # required
    column_range: 1,
    hidden_columns: ["ColumnName"],
  },
})

Response structure


resp.result #=> String
resp.name #=> String
resp.action_id #=> Integer

Options Hash (options):

:preview (Boolean) —
Returns the result of the recipe step, without applying it. The result isn't added to the view frame stack.
:name (required, String) —
The name of the project to apply the action to.
:recipe_step (Types::RecipeStep) —
Represents a single step to be performed in an AWS Glue DataBrew recipe.
:step_index (Integer) —
The index from which to preview a step. This index is used to preview the result of steps that have already been applied, so that the resulting view frame is from earlier in the view frame stack.
:client_session_id (String) —
A unique identifier for an interactive session that's currently open and ready for work. The action will be performed on this session.
:view_frame (Types::ViewFrame) —
Represents the data being transformed during an AWS Glue DataBrew project session.

Returns:

(Types::SendProjectSessionActionResponse) —
Returns a response object which responds to the following methods:
- #result => String
- #name => String
- #action_id => Integer

See Also:

AWS API Documentation
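
The nested `:recipe_step` hash is easy to get wrong, so a small builder can help. The sketch below assembles the hash in the shape shown in the request syntax and checks the fields marked as required there. The helper names `build_recipe_step` and `preview_step` are hypothetical, and the operation and parameter values in the usage comment are illustrative only.

```ruby
# Hypothetical helper: builds a :recipe_step hash and raises if a field
# marked "required" in the request syntax is missing.
def build_recipe_step(operation:, parameters: {}, conditions: [])
  raise ArgumentError, 'operation is required' if operation.to_s.empty?
  conditions.each do |c|
    unless c[:condition] && c[:target_column]
      raise ArgumentError, 'each condition needs :condition and :target_column'
    end
  end
  step = { action: { operation: operation, parameters: parameters } }
  step[:condition_expressions] = conditions unless conditions.empty?
  step
end

# Hypothetical helper: sends the step with preview: true, so the result is
# returned without the step being applied.
def preview_step(client, project_name, step)
  client.send_project_session_action(name: project_name, preview: true, recipe_step: step)
end

# Usage (illustrative operation and parameter names; requires an open session):
#   step = build_recipe_step(operation: "UPPER_CASE",
#                            parameters: { "sourceColumn" => "city" })
#   preview_step(client, "ProjectName", step).result
```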

#start_job_run(options = {}) ⇒ `Types::StartJobRunResponse`

Runs an AWS Glue DataBrew job that exists in the current AWS account.

Examples:

Request syntax with placeholder values


resp = client.start_job_run({
  name: "JobName", # required
})

Response structure


resp.run_id #=> String

Options Hash (options):

:name (required, String) —
The name of the job to be run.

Returns:

(Types::StartJobRunResponse) —
Returns a response object which responds to the following methods:
- #run_id => String

See Also:

AWS API Documentation
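
Because start_job_run only returns a `run_id`, callers typically need to poll for the outcome. The sketch below does that with list_job_runs, treating "STOPPED", "SUCCEEDED", "FAILED", and "TIMEOUT" from the documented `state` values as terminal. The helper name and the `delay`, `max_attempts`, and `sleeper` arguments are assumptions for illustration, and the code assumes a freshly started run appears on the first page of results.

```ruby
# Hypothetical helper: starts a job, then polls #list_job_runs until the new
# run reaches a terminal state. Returns the final state string, or nil if
# max_attempts polls pass without the run finishing.
def run_job_and_wait(client, job_name, delay: 30, max_attempts: 40, sleeper: Kernel.method(:sleep))
  terminal_states = %w[STOPPED SUCCEEDED FAILED TIMEOUT]
  run_id = client.start_job_run(name: job_name).run_id
  max_attempts.times do
    # Assumption: only the first page is inspected.
    runs = client.list_job_runs(name: job_name).job_runs
    run = runs.find { |r| r.run_id == run_id }
    return run.state if run && terminal_states.include?(run.state)
    sleeper.call(delay)
  end
  nil
end

# Usage (requires the aws-sdk gem and valid credentials):
#   state = run_job_and_wait(client, "JobName")
#   warn "job ended with #{state}" unless state == "SUCCEEDED"
```

The `sleeper` argument exists only so the wait can be replaced in tests; in normal use the default `Kernel#sleep` applies.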

#start_project_session(options = {}) ⇒ `Types::StartProjectSessionResponse`

Creates an interactive session, enabling you to manipulate an AWS Glue DataBrew project.

Examples:

Request syntax with placeholder values


resp = client.start_project_session({
  name: "ProjectName", # required
  assume_control: false,
})

Response structure


resp.name #=> String
resp.client_session_id #=> String

Options Hash (options):

:name (required, String) —
The name of the project to act upon.
:assume_control (Boolean) —
A value that, if true, enables you to take control of a session, even if a different client is currently accessing the project.

Returns:

(Types::StartProjectSessionResponse) —
Returns a response object which responds to the following methods:
- #name => String
- #client_session_id => String

See Also:

AWS API Documentation
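
A session may not be usable the moment start_project_session returns. One way to handle this, sketched below, is to poll describe_project until `session_status` reports "READY". The helper name, the timing arguments, and the choice of "FAILED", "TERMINATED", and "TERMINATING" as give-up states are assumptions, not documented behavior.

```ruby
# Hypothetical helper: starts a session and waits for the project's
# session_status to become "READY". Returns the client_session_id, or nil
# if max_attempts polls pass first.
def open_session(client, project_name, delay: 5, max_attempts: 24, sleeper: Kernel.method(:sleep))
  session = client.start_project_session(name: project_name, assume_control: false)
  max_attempts.times do
    status = client.describe_project(name: project_name).session_status
    return session.client_session_id if status == 'READY'
    # Assumption: these statuses mean the session will not become ready.
    raise "session entered #{status}" if %w[FAILED TERMINATED TERMINATING].include?(status)
    sleeper.call(delay)
  end
  nil
end

# Usage (requires the aws-sdk gem and valid credentials):
#   session_id = open_session(client, "ProjectName")
```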

#stop_job_run(options = {}) ⇒ `Types::StopJobRunResponse`

Stops the specified job from running in the current AWS account.

Examples:

Request syntax with placeholder values


resp = client.stop_job_run({
  name: "JobName", # required
  run_id: "JobRunId", # required
})

Response structure


resp.run_id #=> String

Options Hash (options):

:name (required, String) —
The name of the job to be stopped.
:run_id (required, String) —
The ID of the job run to be stopped.

Returns:

(Types::StopJobRunResponse) —
Returns a response object which responds to the following methods:
- #run_id => String

See Also:

AWS API Documentation

#tag_resource(options = {}) ⇒ `Struct`

Adds metadata tags to an AWS Glue DataBrew resource, such as a dataset, job, project, or recipe.

Examples:

Request syntax with placeholder values


resp = client.tag_resource({
  resource_arn: "Arn", # required
  tags: { # required
    "TagKey" => "TagValue",
  },
})

Options Hash (options):

:resource_arn (required, String) —
The DataBrew resource to which tags should be added. The value for this parameter is an Amazon Resource Name (ARN). For DataBrew, you can tag a dataset, a job, a project, or a recipe.
:tags (required, Hash<String,String>) —
One or more tags to be assigned to the resource.

Returns:

(Struct) —
Returns an empty response.

See Also:

AWS API Documentation

#untag_resource(options = {}) ⇒ `Struct`

Removes metadata tags from an AWS Glue DataBrew resource.

Examples:

Request syntax with placeholder values


resp = client.untag_resource({
  resource_arn: "Arn", # required
  tag_keys: ["TagKey"], # required
})

Options Hash (options):

:resource_arn (required, String) —
A DataBrew resource from which you want to remove a tag or tags. The value for this parameter is an Amazon Resource Name (ARN).
:tag_keys (required, Array<String>) —
The tag keys (names) of one or more tags to be removed.

Returns:

(Struct) —
Returns an empty response.

See Also:

AWS API Documentation
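
The three tagging operations can be combined to bring a resource's tags in line with a desired set. The following is a minimal sketch under that idea; the helper name `sync_tags` is hypothetical, and it only uses list_tags_for_resource, tag_resource, and untag_resource with the parameters documented above.

```ruby
# Hypothetical helper: makes the tags on `arn` match `desired`
# (a Hash<String,String>). Returns a summary of what was changed.
def sync_tags(client, arn, desired)
  current = client.list_tags_for_resource(resource_arn: arn).tags || {}
  to_set = desired.reject { |key, value| current[key] == value } # new or changed tags
  to_remove = current.keys - desired.keys                        # tags no longer wanted
  client.tag_resource(resource_arn: arn, tags: to_set) unless to_set.empty?
  client.untag_resource(resource_arn: arn, tag_keys: to_remove) unless to_remove.empty?
  { set: to_set, removed: to_remove }
end

# Usage (requires the aws-sdk gem and valid credentials):
#   sync_tags(client, resource_arn, "env" => "prod", "team" => "data")
```

Skipping the calls when there is nothing to change avoids sending requests with an empty `:tags` hash or `:tag_keys` array, both of which are marked required.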

#update_dataset(options = {}) ⇒ `Types::UpdateDatasetResponse`

Modifies the definition of an existing AWS Glue DataBrew dataset in the current AWS account.

Examples:

Request syntax with placeholder values


resp = client.update_dataset({
  name: "DatasetName", # required
  format_options: {
    json: {
      multi_line: false,
    },
    excel: {
      sheet_names: ["SheetName"],
      sheet_indexes: [1],
    },
  },
  input: { # required
    s3_input_definition: {
      bucket: "Bucket", # required
      key: "Key",
    },
    data_catalog_input_definition: {
      catalog_id: "CatalogId",
      database_name: "DatabaseName", # required
      table_name: "TableName", # required
      temp_directory: {
        bucket: "Bucket", # required
        key: "Key",
      },
    },
  },
})

Response structure


resp.name #=> String

Options Hash (options):

:name (required, String) —
The name of the dataset to be updated.
:format_options (Types::FormatOptions) —
Options that define how Microsoft Excel input is to be interpreted by DataBrew.
:input (required, Types::Input) —
Information on how AWS Glue DataBrew can find data, in either the AWS Glue Data Catalog or Amazon S3.

Returns:

(Types::UpdateDatasetResponse) —
Returns a response object which responds to the following methods:
- #name => String

See Also:

AWS API Documentation

#update_profile_job(options = {}) ⇒ `Types::UpdateProfileJobResponse`

Modifies the definition of an existing AWS Glue DataBrew job in the current AWS account.

Examples:

Request syntax with placeholder values


resp = client.update_profile_job({
  encryption_key_arn: "EncryptionKeyArn",
  encryption_mode: "SSE-KMS", # accepts SSE-KMS, SSE-S3
  name: "JobName", # required
  log_subscription: "ENABLE", # accepts ENABLE, DISABLE
  max_capacity: 1,
  max_retries: 1,
  output_location: { # required
    bucket: "Bucket", # required
    key: "Key",
  },
  role_arn: "Arn", # required
  timeout: 1,
})

Response structure


resp.name #=> String

Options Hash (options):

:encryption_key_arn (String) —
The Amazon Resource Name (ARN) of an encryption key that is used to protect the job.
:encryption_mode (String) —
The encryption mode for the job, which can be one of the following:
- SSE-KMS - Server-side encryption with AWS KMS-managed keys.
- SSE-S3 - Server-side encryption with keys managed by Amazon S3.
:name (required, String) —
The name of the job to be updated.
:log_subscription (String) —
A value that enables or disables Amazon CloudWatch logging for the current AWS account. If logging is enabled, CloudWatch writes one log stream for each job run.
:max_capacity (Integer) —
The maximum number of nodes that DataBrew can use when the job processes data.
:max_retries (Integer) —
The maximum number of times to retry the job after a job run fails.
:output_location (required, Types::S3Location) —
An Amazon S3 location (bucket name and object key) where DataBrew can read input data, or write output from a job.
:role_arn (required, String) —
The Amazon Resource Name (ARN) of the AWS Identity and Access Management (IAM) role to be assumed for this request.
:timeout (Integer) —
The job's timeout in minutes. A job that attempts to run longer than this timeout period ends with a status of TIMEOUT.

Returns:

(Types::UpdateProfileJobResponse) —
Returns a response object which responds to the following methods:
- #name => String

See Also:

AWS API Documentation

#update_project(options = {}) ⇒ `Types::UpdateProjectResponse`

Modifies the definition of an existing AWS Glue DataBrew project in the current AWS account.

Examples:

Request syntax with placeholder values


resp = client.update_project({
  sample: {
    size: 1,
    type: "FIRST_N", # required, accepts FIRST_N, LAST_N, RANDOM
  },
  role_arn: "Arn", # required
  name: "ProjectName", # required
})

Response structure


resp.last_modified_date #=> Time
resp.name #=> String

Options Hash (options):

:sample (Types::Sample) —
Represents the sample size and sampling type for AWS Glue DataBrew to use for interactive data analysis.
:role_arn (required, String) —
The Amazon Resource Name (ARN) of the IAM role to be assumed for this request.
:name (required, String) —
The name of the project to be updated.

Returns:

(Types::UpdateProjectResponse) —
Returns a response object which responds to the following methods:
- #last_modified_date => Time
- #name => String

See Also:

AWS API Documentation

#update_recipe(options = {}) ⇒ `Types::UpdateRecipeResponse`

Modifies the definition of the latest working version of an AWS Glue DataBrew recipe in the current AWS account.

Examples:

Request syntax with placeholder values


resp = client.update_recipe({
  description: "RecipeDescription",
  name: "RecipeName", # required
  steps: [
    {
      action: { # required
        operation: "Operation", # required
        parameters: {
          "ParameterName" => "ParameterValue",
        },
      },
      condition_expressions: [
        {
          condition: "Condition", # required
          value: "ConditionValue",
          target_column: "TargetColumn", # required
        },
      ],
    },
  ],
})

Response structure


resp.name #=> String

Options Hash (options):

:description (String) —
A description of the recipe.
:name (required, String) —
The name of the recipe to be updated.
:steps (Array<Types::RecipeStep>) —
One or more steps to be performed by the recipe. Each step consists of an action, and the conditions under which the action should succeed.

Returns:

(Types::UpdateRecipeResponse) —
Returns a response object which responds to the following methods:
- #name => String

See Also:

AWS API Documentation
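
Since update_recipe changes only the latest working version, a common follow-up is to publish it with publish_recipe. The sketch below chains the two calls; the helper name `update_and_publish_recipe` is hypothetical, and reusing the same description for both calls is an assumption made for brevity.

```ruby
# Hypothetical helper: updates the working version of a recipe, then
# publishes it as a new version. Returns the recipe name reported by
# #publish_recipe.
def update_and_publish_recipe(client, name, steps, description: nil)
  update_params = { name: name, steps: steps }
  update_params[:description] = description if description
  client.update_recipe(update_params)

  publish_params = { name: name }
  publish_params[:description] = description if description
  client.publish_recipe(publish_params).name
end

# Usage (illustrative step values; requires the aws-sdk gem and credentials):
#   steps = [{ action: { operation: "UPPER_CASE",
#                        parameters: { "sourceColumn" => "city" } } }]
#   update_and_publish_recipe(client, "RecipeName", steps, description: "Uppercase city")
```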

#update_recipe_job(options = {}) ⇒ `Types::UpdateRecipeJobResponse`

Modifies the definition of an existing AWS Glue DataBrew recipe job in the current AWS account.

Examples:

Request syntax with placeholder values


resp = client.update_recipe_job({
  encryption_key_arn: "EncryptionKeyArn",
  encryption_mode: "SSE-KMS", # accepts SSE-KMS, SSE-S3
  name: "JobName", # required
  log_subscription: "ENABLE", # accepts ENABLE, DISABLE
  max_capacity: 1,
  max_retries: 1,
  outputs: [ # required
    {
      compression_format: "GZIP", # accepts GZIP, LZ4, SNAPPY, BZIP2, DEFLATE, LZO, BROTLI, ZSTD, ZLIB
      format: "CSV", # accepts CSV, JSON, PARQUET, GLUEPARQUET, AVRO, ORC, XML
      partition_columns: ["ColumnName"],
      location: { # required
        bucket: "Bucket", # required
        key: "Key",
      },
      overwrite: false,
    },
  ],
  role_arn: "Arn", # required
  timeout: 1,
})

Response structure


resp.name #=> String

Options Hash (options):

:encryption_key_arn (String) —
The Amazon Resource Name (ARN) of an encryption key that is used to protect the job.
:encryption_mode (String) —
The encryption mode for the job, which can be one of the following:
- SSE-KMS - Server-side encryption with AWS KMS-managed keys.
- SSE-S3 - Server-side encryption with keys managed by Amazon S3.
:name (required, String) —
The name of the job to update.
:log_subscription (String) —
A value that enables or disables Amazon CloudWatch logging for the current AWS account. If logging is enabled, CloudWatch writes one log stream for each job run.
:max_capacity (Integer) —
The maximum number of nodes that DataBrew can consume when the job processes data.
:max_retries (Integer) —
The maximum number of times to retry the job after a job run fails.
:outputs (required, Array<Types::Output>) —
One or more artifacts that represent the output from running the job.
:role_arn (required, String) —
The Amazon Resource Name (ARN) of the AWS Identity and Access Management (IAM) role to be assumed for this request.
:timeout (Integer) —
The job's timeout in minutes. A job that attempts to run longer than this timeout period ends with a status of TIMEOUT.

Returns:

(Types::UpdateRecipeJobResponse) —
Returns a response object which responds to the following methods:
- #name => String

See Also:

AWS API Documentation

#update_schedule(options = {}) ⇒ `Types::UpdateScheduleResponse`

Modifies the definition of an existing AWS Glue DataBrew schedule in the current AWS account.

Examples:

Request syntax with placeholder values


resp = client.update_schedule({
  job_names: ["JobName"],
  cron_expression: "CronExpression", # required
  name: "ScheduleName", # required
})

Response structure


resp.name #=> String
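
Since :job_names is optional while :cron_expression and :name are required, a small wrapper can make that distinction explicit. The sketch below is one possible shape; the helper name reschedule is hypothetical, and the cron string is passed through unchanged because its exact syntax is defined by the service rather than by this client. client is assumed to respond to #update_schedule.

```ruby
# Hypothetical helper, not part of the SDK: updates a schedule and returns
# the schedule name from the response.
def reschedule(client, schedule_name, cron_expression, job_names = nil)
  # Fail early on an obviously empty expression instead of sending it.
  if cron_expression.to_s.strip.empty?
    raise ArgumentError, "cron_expression must not be empty"
  end

  params = { name: schedule_name, cron_expression: cron_expression }
  # :job_names is optional; accept a single name or an array of names.
  params[:job_names] = Array(job_names) unless job_names.nil?

  client.update_schedule(params).name
end
```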

Options Hash (options):

:job_names (Array<String>) —
The name or names of one or more jobs to be run for this schedule.
:cron_expression (required, String) —
The date or dates and time or times, in cron format, when the jobs are to be run.
:name (required, String) —
The name of the schedule to update.

Returns:

(Types::UpdateScheduleResponse) —
Returns a response object which responds to the following methods:
- #name => String

See Also:

AWS API Documentation

#wait_until(waiter_name, params = {}) {|waiter| ... } ⇒ `Boolean`

Waiters poll an API operation until a resource enters a desired state.

Basic Usage

Waiters poll until they succeed, fail by entering a terminal state, or reach the maximum number of attempts.

# polls in a loop, sleeping between attempts
client.wait_until(waiter_name, params)

Configuration

You can configure the maximum number of polling attempts, and the delay (in seconds) between each polling attempt. You configure waiters by passing a block to #wait_until:

# poll for ~25 seconds
client.wait_until(...) do |w|
  w.max_attempts = 5
  w.delay = 5
end
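
To make the interaction between max_attempts and delay concrete, here is a plain-Ruby sketch of a polling loop with the same two settings. It is an illustration only and not the SDK's implementation: the name poll_until and the sleeper parameter are hypothetical, and where a real waiter raises an error after too many attempts this sketch simply returns false.

```ruby
# Illustrative polling loop (hypothetical, not the SDK's code).
# The block is called once per attempt and should return true once the
# desired state has been reached.
def poll_until(max_attempts:, delay:, sleeper: ->(seconds) { sleep(seconds) })
  attempts = 0
  loop do
    attempts += 1
    return true if yield(attempts)                       # desired state reached
    # A nil max_attempts disables the limit, as in the callback example.
    return false if max_attempts && attempts >= max_attempts
    sleeper.call(delay)                                  # pause before the next attempt
  end
end
```

In this sketch the loop sleeps only between attempts, so five attempts with a delay of five seconds involve four pauses plus the time each check takes.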

Callbacks

You can be notified before each polling attempt and before each delay. Throwing :success or :failure from one of these callbacks terminates the waiter.

started_at = Time.now
client.wait_until(...) do |w|

  # disable max attempts
  w.max_attempts = nil

  # poll for 1 hour, instead of a number of attempts
  w.before_wait do |attempts, response|
    throw :failure if Time.now - started_at > 3600
  end

end
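
The callback example relies on Ruby's throw, which unwinds to the nearest matching catch. How the SDK handles these throws internally is not shown here, so the following is only a plain-Ruby sketch of the mechanism: a loop wrapped in catch blocks that a callback can leave early. The method run_with_callback and its return symbols are hypothetical.

```ruby
# Illustrative only (hypothetical, not the SDK's code): shows how a callback
# can end a loop by throwing :success or :failure.
def run_with_callback(before_wait, attempts_limit: 10)
  catch(:success) do
    catch(:failure) do
      1.upto(attempts_limit) do |attempt|
        before_wait.call(attempt)   # may throw :success or :failure
      end
      return :exhausted             # the callback never threw
    end
    return :failed                  # reached only after throw :failure
  end
  :succeeded                        # reached only after throw :success
end
```

A callback such as `->(attempt) { throw :failure if attempt > 3 }` stops the loop on the fourth call, in the same way the time-based check above stops a waiter after one hour.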

Handling Errors

When a waiter is successful, it returns true. When a waiter fails, it raises an error. All errors raised extend from Waiters::Errors::WaiterFailed.

begin
  client.wait_until(...)
rescue Aws::Waiters::Errors::WaiterFailed
  # resource did not enter the desired state in time
end

Parameters:

waiter_name (Symbol) —
The name of the waiter. See #waiter_names for a full list of supported waiters.
params (Hash) (defaults to: {}) —
Additional request parameters. See #waiter_names for a list of supported waiters and the request each one calls. The called request determines the list of accepted parameters.

Yield Parameters:

waiter (Waiters::Waiter) —
Yields a Waiter object that can be configured prior to waiting.

Returns:

(Boolean) —
Returns true if the waiter was successful.

Raises:

(Errors::FailureStateError) —
Raised when the waiter terminates because the waiter has entered a state that it will not transition out of, preventing success.
(Errors::TooManyAttemptsError) —
Raised when the configured maximum number of attempts have been made, and the waiter is not yet successful.
(Errors::UnexpectedError) —
Raised when an unexpected error is encountered while polling for a resource.
(Errors::NoSuchWaiterError) —
Raised when you request to wait for an unknown state.

#waiter_names ⇒ `Array<Symbol>`

Returns the list of supported waiters. The following table lists the supported waiters and the client method they call:

Waiter Name	Client Method	Default Delay:	Default Max Attempts:

Returns:

(Array<Symbol>) —
the list of supported waiters.
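
Because #wait_until raises when asked for an unknown waiter, it can be useful to check #waiter_names first. The sketch below shows one way to do that; the helper name wait_if_supported is hypothetical, and client is assumed to respond to #waiter_names and #wait_until as described above.

```ruby
# Hypothetical helper, not part of the SDK: only waits when the client
# reports that it supports the requested waiter.
def wait_if_supported(client, waiter_name, params = {})
  # Skip the call entirely if the waiter is not in the supported list.
  return false unless client.waiter_names.include?(waiter_name)

  client.wait_until(waiter_name, params)
end
```

Returning false for an unsupported waiter is a design choice of this sketch; callers that prefer a hard failure can let #wait_until raise instead.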
