You are viewing documentation for version 2 of the AWS SDK for Ruby.

Class: Aws::SageMaker::Types::TransformJobDefinition

Inherits:
Struct
  • Object
Defined in:
(unknown)

Overview

Note:

When passing TransformJobDefinition as input to an Aws::Client method, you can use a vanilla Hash:

{
  max_concurrent_transforms: 1,
  max_payload_in_mb: 1,
  batch_strategy: "MultiRecord", # accepts MultiRecord, SingleRecord
  environment: {
    "TransformEnvironmentKey" => "TransformEnvironmentValue",
  },
  transform_input: { # required
    data_source: { # required
      s3_data_source: { # required
        s3_data_type: "ManifestFile", # required, accepts ManifestFile, S3Prefix, AugmentedManifestFile
        s3_uri: "S3Uri", # required
      },
    },
    content_type: "ContentType",
    compression_type: "None", # accepts None, Gzip
    split_type: "None", # accepts None, Line, RecordIO, TFRecord
  },
  transform_output: { # required
    s3_output_path: "S3Uri", # required
    accept: "Accept",
    assemble_with: "None", # accepts None, Line
    kms_key_id: "KmsKeyId",
  },
  transform_resources: { # required
    instance_type: "ml.m4.xlarge", # required, accepts ml.m4.xlarge, ml.m4.2xlarge, ml.m4.4xlarge, ml.m4.10xlarge, ml.m4.16xlarge, ml.c4.xlarge, ml.c4.2xlarge, ml.c4.4xlarge, ml.c4.8xlarge, ml.p2.xlarge, ml.p2.8xlarge, ml.p2.16xlarge, ml.p3.2xlarge, ml.p3.8xlarge, ml.p3.16xlarge, ml.c5.xlarge, ml.c5.2xlarge, ml.c5.4xlarge, ml.c5.9xlarge, ml.c5.18xlarge, ml.m5.large, ml.m5.xlarge, ml.m5.2xlarge, ml.m5.4xlarge, ml.m5.12xlarge, ml.m5.24xlarge
    instance_count: 1, # required
    volume_kms_key_id: "KmsKeyId",
  },
}

Defines the input needed to run a transform job using the inference specification defined in the algorithm.
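As a rough illustration of the `# required` annotations in the hash above, the sketch below builds a definition containing only the required keys and checks it with a small helper. The helper `missing_required_paths`, the constants, and the `s3://example-bucket/...` URIs are hypothetical and are not part of the AWS SDK; the SDK performs its own parameter validation when a request is made.

```ruby
# Hypothetical constant: key paths marked "# required" in the hash above.
REQUIRED_PATHS = [
  [:transform_input, :data_source, :s3_data_source, :s3_data_type],
  [:transform_input, :data_source, :s3_data_source, :s3_uri],
  [:transform_output, :s3_output_path],
  [:transform_resources, :instance_type],
  [:transform_resources, :instance_count],
].freeze

# A definition with only the required keys. The bucket name is a placeholder.
MINIMAL_DEFINITION = {
  transform_input: {
    data_source: {
      s3_data_source: {
        s3_data_type: "S3Prefix",
        s3_uri: "s3://example-bucket/input/",
      },
    },
  },
  transform_output: { s3_output_path: "s3://example-bucket/output/" },
  transform_resources: { instance_type: "ml.m4.xlarge", instance_count: 1 },
}.freeze

# Hypothetical helper (not an SDK method): returns the required key paths
# whose value is absent from the given definition hash.
def missing_required_paths(definition)
  REQUIRED_PATHS.select { |path| definition.dig(*path).nil? }
end

missing = missing_required_paths(MINIMAL_DEFINITION)
puts(missing.empty? ? "all required keys present" : "missing: #{missing.inspect}")
```

Optional keys such as `batch_strategy` or `environment` can be merged into the same hash before it is passed to a client method.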

Instance Attribute Details

#batch_strategy ⇒ String

A string that determines the number of records included in a single mini-batch.

SingleRecord means only one record is used per mini-batch. MultiRecord means a mini-batch contains as many records as can fit within the MaxPayloadInMB limit.

Possible values:

  • MultiRecord
  • SingleRecord

Returns:

  • (String)

    A string that determines the number of records included in a single mini-batch.
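The difference between the two strategies can be sketched in plain Ruby. This is only an illustration of the description above, not SageMaker's actual batching logic: the function `mini_batches` and its parameters are hypothetical, records are modeled as strings, and the payload limit is given in bytes rather than MB to keep the example small.

```ruby
# Illustrative only (not an SDK or service API): groups records into
# mini-batches according to the batch strategy described above.
def mini_batches(records, strategy:, max_payload_bytes:)
  case strategy
  when "SingleRecord"
    # One record per mini-batch.
    records.map { |record| [record] }
  when "MultiRecord"
    batches = []
    current = []
    current_size = 0
    records.each do |record|
      # Start a new mini-batch when adding this record would exceed the limit.
      # A record larger than the limit still gets a batch of its own here.
      if !current.empty? && current_size + record.bytesize > max_payload_bytes
        batches << current
        current = []
        current_size = 0
      end
      current << record
      current_size += record.bytesize
    end
    batches << current unless current.empty?
    batches
  else
    raise ArgumentError, "unknown batch strategy: #{strategy}"
  end
end

records = %w[aaaa bbbb cc d]
p mini_batches(records, strategy: "SingleRecord", max_payload_bytes: 8)
p mini_batches(records, strategy: "MultiRecord", max_payload_bytes: 8)
```

With an 8-byte limit, the MultiRecord call packs the first two records together and the last two together, while SingleRecord yields four one-record batches.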

#environment ⇒ Hash<String,String>

The environment variables to set in the Docker container. The map supports up to 16 key-value entries.

Returns:

  • (Hash<String,String>)

    The environment variables to set in the Docker container.
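A client-side check of the 16-entry limit might look like the sketch below. The helper `validate_environment!` and the constant are hypothetical and not part of the AWS SDK; the service enforces its own limits regardless of any local check.

```ruby
# Limit taken from the attribute description above.
MAX_ENVIRONMENT_ENTRIES = 16

# Hypothetical helper (not an SDK method): raises ArgumentError if the
# environment map is not a Hash of String keys and values, or if it has
# more than 16 entries. Returns the hash unchanged otherwise.
def validate_environment!(environment)
  raise ArgumentError, "environment must be a Hash" unless environment.is_a?(Hash)

  if environment.size > MAX_ENVIRONMENT_ENTRIES
    raise ArgumentError,
          "environment has #{environment.size} entries, maximum is #{MAX_ENVIRONMENT_ENTRIES}"
  end

  invalid = environment.reject { |key, value| key.is_a?(String) && value.is_a?(String) }
  raise ArgumentError, "non-String entries: #{invalid.keys.inspect}" unless invalid.empty?

  environment
end

validate_environment!("LOG_LEVEL" => "info", "BATCH_MODE" => "true")
```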

#max_concurrent_transforms ⇒ Integer

The maximum number of parallel requests that can be sent to each instance in a transform job. The default value is 1.

Returns:

  • (Integer)

    The maximum number of parallel requests that can be sent to each instance in a transform job.

#max_payload_in_mb ⇒ Integer

The maximum payload size allowed, in MB. A payload is the data portion of a record (without metadata).

Returns:

  • (Integer)

    The maximum payload size allowed, in MB.

#transform_input ⇒ Types::TransformInput

A description of the input source and the way the transform job consumes it.

Returns:

  • (Types::TransformInput)

    A description of the input source and the way the transform job consumes it.

#transform_output ⇒ Types::TransformOutput

Identifies the Amazon S3 location where you want Amazon SageMaker to save the results from the transform job.

Returns:

  • (Types::TransformOutput)

    Identifies the Amazon S3 location where you want Amazon SageMaker to save the results from the transform job.

#transform_resources ⇒ Types::TransformResources

Identifies the ML compute instances for the transform job.

Returns: