Class: Aws::SageMaker::Types::TrainingJobDefinition

Inherits:
Struct
  • Object
Defined in:
gems/aws-sdk-sagemaker/lib/aws-sdk-sagemaker/types.rb

Overview

Note:

When making an API call, you may pass TrainingJobDefinition data as a hash:

{
  training_input_mode: "Pipe", # required, accepts Pipe, File, FastFile
  hyper_parameters: {
    "HyperParameterKey" => "HyperParameterValue",
  },
  input_data_config: [ # required
    {
      channel_name: "ChannelName", # required
      data_source: { # required
        s3_data_source: {
          s3_data_type: "ManifestFile", # required, accepts ManifestFile, S3Prefix, AugmentedManifestFile
          s3_uri: "S3Uri", # required
          s3_data_distribution_type: "FullyReplicated", # accepts FullyReplicated, ShardedByS3Key
          attribute_names: ["AttributeName"],
        },
        file_system_data_source: {
          file_system_id: "FileSystemId", # required
          file_system_access_mode: "rw", # required, accepts rw, ro
          file_system_type: "EFS", # required, accepts EFS, FSxLustre
          directory_path: "DirectoryPath", # required
        },
      },
      content_type: "ContentType",
      compression_type: "None", # accepts None, Gzip
      record_wrapper_type: "None", # accepts None, RecordIO
      input_mode: "Pipe", # accepts Pipe, File, FastFile
      shuffle_config: {
        seed: 1, # required
      },
    },
  ],
  output_data_config: { # required
    kms_key_id: "KmsKeyId",
    s3_output_path: "S3Uri", # required
  },
  resource_config: { # required
    instance_type: "ml.m4.xlarge", # required, accepts ml.m4.xlarge, ml.m4.2xlarge, ml.m4.4xlarge, ml.m4.10xlarge, ml.m4.16xlarge, ml.g4dn.xlarge, ml.g4dn.2xlarge, ml.g4dn.4xlarge, ml.g4dn.8xlarge, ml.g4dn.12xlarge, ml.g4dn.16xlarge, ml.m5.large, ml.m5.xlarge, ml.m5.2xlarge, ml.m5.4xlarge, ml.m5.12xlarge, ml.m5.24xlarge, ml.c4.xlarge, ml.c4.2xlarge, ml.c4.4xlarge, ml.c4.8xlarge, ml.p2.xlarge, ml.p2.8xlarge, ml.p2.16xlarge, ml.p3.2xlarge, ml.p3.8xlarge, ml.p3.16xlarge, ml.p3dn.24xlarge, ml.p4d.24xlarge, ml.c5.xlarge, ml.c5.2xlarge, ml.c5.4xlarge, ml.c5.9xlarge, ml.c5.18xlarge, ml.c5n.xlarge, ml.c5n.2xlarge, ml.c5n.4xlarge, ml.c5n.9xlarge, ml.c5n.18xlarge
    instance_count: 1, # required
    volume_size_in_gb: 1, # required
    volume_kms_key_id: "KmsKeyId",
  },
  stopping_condition: { # required
    max_runtime_in_seconds: 1,
    max_wait_time_in_seconds: 1,
  },
}

Defines the input needed to run a training job using the algorithm.
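The hash above marks five top-level keys as required. As a rough illustration, a definition hash could be checked for those keys before it is passed to an API call. This is only a sketch: `REQUIRED_KEYS` and `missing_required_keys` are hypothetical names, not part of the SDK, and the bucket and values are placeholders.

```ruby
# Top-level keys marked "# required" in the hash shown above.
REQUIRED_KEYS = %i[
  training_input_mode
  input_data_config
  output_data_config
  resource_config
  stopping_condition
].freeze

# Hypothetical helper (not part of aws-sdk-sagemaker): returns the required
# top-level keys that are absent from a TrainingJobDefinition hash.
def missing_required_keys(definition)
  REQUIRED_KEYS.reject { |key| definition.key?(key) }
end

# Placeholder values for illustration only.
definition = {
  training_input_mode: "File",
  input_data_config: [
    {
      channel_name: "train",
      data_source: {
        s3_data_source: {
          s3_data_type: "S3Prefix",
          s3_uri: "s3://example-bucket/train/",
        },
      },
    },
  ],
  output_data_config: { s3_output_path: "s3://example-bucket/output/" },
  resource_config: {
    instance_type: "ml.m5.xlarge",
    instance_count: 1,
    volume_size_in_gb: 50,
  },
}

puts missing_required_keys(definition).inspect
```

The SDK performs its own parameter validation when a request is built, so a check like this is mainly useful for failing early in application code.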

Constant Summary

SENSITIVE = []

Instance Attribute Details

#hyper_parameters ⇒ Hash<String,String>

The hyperparameters used for the training job.

Returns:

  • (Hash<String,String>)
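Because the attribute is typed as Hash<String,String>, numeric or symbolic settings need to be converted to strings first. A minimal sketch of one way to do that, where `stringify_hyper_parameters` is a hypothetical helper and not an SDK method:

```ruby
# Hypothetical helper: converts every key and value to a String so the
# result matches the Hash<String,String> shape of hyper_parameters.
def stringify_hyper_parameters(params)
  params.to_h { |key, value| [key.to_s, value.to_s] }
end

hyper_parameters = stringify_hyper_parameters(epochs: 10, learning_rate: 0.01)
puts hyper_parameters.inspect
```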


# File 'gems/aws-sdk-sagemaker/lib/aws-sdk-sagemaker/types.rb', line 34013

class TrainingJobDefinition < Struct.new(
  :training_input_mode,
  :hyper_parameters,
  :input_data_config,
  :output_data_config,
  :resource_config,
  :stopping_condition)
  SENSITIVE = []
  include Aws::Structure
end

#input_data_config ⇒ Array<Types::Channel>

An array of Channel objects, each of which specifies an input source.

Returns:

  • (Array<Types::Channel>)



#output_data_config ⇒ Types::OutputDataConfig

The path to the S3 bucket where you want to store model artifacts. Amazon SageMaker creates subfolders for the artifacts.




#resource_config ⇒ Types::ResourceConfig

The resources, including the ML compute instances and ML storage volumes, to use for model training.




#stopping_condition ⇒ Types::StoppingCondition

Specifies a limit to how long a model training job can run. It also specifies how long a managed Spot training job has to complete. When the job reaches the time limit, Amazon SageMaker ends the training job. Use this API to cap model training costs.

To stop a job, Amazon SageMaker sends the algorithm the SIGTERM signal, which delays job termination for 120 seconds. Algorithms can use this 120-second window to save the model artifacts.
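A training script written in Ruby could use the 120-second window roughly as follows. This is a sketch under stated assumptions, not how any particular algorithm container is implemented: `train_with_graceful_stop`, the block-per-step interface, and the JSON artifact layout are all hypothetical.

```ruby
require "json"

# Hypothetical training loop: runs up to max_steps, stops early when SIGTERM
# arrives, and always writes an artifact file before returning.
def train_with_graceful_stop(artifact_path, max_steps: 1_000)
  stop_requested = false
  # Keep the handler trivial: only set a flag, and do the saving in the loop.
  previous = Signal.trap("TERM") { stop_requested = true }

  step = 0
  while step < max_steps && !stop_requested
    step += 1
    yield step if block_given? # one unit of training work
  end

  # Save whatever has been learned so far (placeholder artifact contents).
  File.write(
    artifact_path,
    JSON.generate("last_step" => step, "interrupted" => stop_requested)
  )
  step
ensure
  # Restore whichever handler was installed before this method ran.
  Signal.trap("TERM", previous || "DEFAULT")
end
```

Setting a flag in the handler and saving from the main loop avoids doing slow or lock-dependent work inside the signal handler itself.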




#training_input_mode ⇒ String

The training input mode that the algorithm supports. For more information about input modes, see Algorithms.

Pipe mode

If an algorithm supports Pipe mode, Amazon SageMaker streams data directly from Amazon S3 to the container.

File mode

If an algorithm supports File mode, SageMaker downloads the training data from S3 to the provisioned ML storage volume, and mounts the directory to the Docker volume for the training container.

You must provision the ML storage volume with sufficient capacity to accommodate the data downloaded from S3. In addition to the training data, the ML storage volume also stores the output model. The algorithm container uses the ML storage volume to also store intermediate information, if any.

For distributed algorithms, training data is distributed uniformly. Training duration is predictable if the input data objects are approximately the same size. SageMaker does not split the files any further for model training. If the object sizes are skewed, training is not optimal: the data distribution is skewed as well, so one host in the training cluster can become overloaded and turn into a bottleneck.
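The effect of skewed object sizes can be estimated before a job is launched. The sketch below assumes whole objects are assigned to hosts round-robin. That assignment rule is an assumption made for illustration, since this page does not describe the actual strategy. `host_loads` and `imbalance` are hypothetical helpers.

```ruby
# Assumption: objects are handed out to hosts round-robin and never split.
# Returns the total number of bytes each host would receive.
def host_loads(object_sizes, host_count)
  loads = Array.new(host_count, 0)
  object_sizes.each_with_index { |size, index| loads[index % host_count] += size }
  loads
end

# Ratio of the busiest host's load to the mean load; 1.0 means perfectly even.
def imbalance(loads)
  mean = loads.sum.to_f / loads.size
  loads.max / mean
end

even_loads   = host_loads([100, 100, 100, 100], 2)
skewed_loads = host_loads([900, 50, 30, 20], 2)

puts "even:   #{even_loads.inspect} imbalance=#{imbalance(even_loads).round(2)}"
puts "skewed: #{skewed_loads.inspect} imbalance=#{imbalance(skewed_loads).round(2)}"
```

A ratio well above 1.0 suggests repartitioning the input into objects of similar size before training.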

FastFile mode

If an algorithm supports FastFile mode, SageMaker streams data directly from S3 to the container with no code changes, and provides file system access to the data. Users can author their training script to interact with these files as if they were stored on disk.

FastFile mode works best when the data is read sequentially. Augmented manifest files aren't supported. The startup time is lower when there are fewer files in the S3 bucket provided.

Returns:

  • (String)
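The constraints described above can be turned into a small pre-flight check. This is a sketch only: `INPUT_MODES` and `input_mode_problems` are hypothetical names, the check looks only at the job-level `training_input_mode` (not at a per-channel `input_mode`), and it covers only the accepted values and the augmented manifest restriction stated on this page.

```ruby
# Values listed as accepted for training_input_mode.
INPUT_MODES = %w[Pipe File FastFile].freeze

# Hypothetical helper: returns human-readable problems found in a
# TrainingJobDefinition hash, or an empty array if none are found.
def input_mode_problems(definition)
  problems = []
  mode = definition[:training_input_mode]
  unless INPUT_MODES.include?(mode)
    problems << "unknown training_input_mode: #{mode.inspect}"
  end

  if mode == "FastFile"
    Array(definition[:input_data_config]).each do |channel|
      data_type = channel.dig(:data_source, :s3_data_source, :s3_data_type)
      next unless data_type == "AugmentedManifestFile"

      problems << "channel #{channel[:channel_name]} uses AugmentedManifestFile, " \
                  "which FastFile mode does not support"
    end
  end
  problems
end
```

Returning a list of messages instead of raising lets a caller report every problem at once.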

