You are viewing documentation for version 2 of the AWS SDK for Ruby.
Class: Aws::SageMaker::Types::Channel
Inherits: Struct
  - Object
  - Struct
  - Aws::SageMaker::Types::Channel
Defined in: (unknown)
Overview
When passing Channel as input to an Aws::Client method, you can use a vanilla Hash:
{
  channel_name: "ChannelName", # required
  data_source: { # required
    s3_data_source: {
      s3_data_type: "ManifestFile", # required, accepts ManifestFile, S3Prefix, AugmentedManifestFile
      s3_uri: "S3Uri", # required
      s3_data_distribution_type: "FullyReplicated", # accepts FullyReplicated, ShardedByS3Key
      attribute_names: ["AttributeName"],
    },
    file_system_data_source: {
      file_system_id: "FileSystemId", # required
      file_system_access_mode: "rw", # required, accepts rw, ro
      file_system_type: "EFS", # required, accepts EFS, FSxLustre
      directory_path: "DirectoryPath", # required
    },
  },
  content_type: "ContentType",
  compression_type: "None", # accepts None, Gzip
  record_wrapper_type: "None", # accepts None, RecordIO
  input_mode: "Pipe", # accepts Pipe, File
  shuffle_config: {
    seed: 1, # required
  },
}
A channel is a named input source that training algorithms can consume.
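As a sketch, a channel can be built as a plain Hash and passed in the :input_data_config array of Aws::SageMaker::Client#create_training_job. The channel name, bucket, and content type below are illustrative placeholders, not values from this page:

```ruby
# Build a training channel as a vanilla Hash. "train", the S3 URI, and
# "text/csv" are placeholder assumptions for illustration only.
train_channel = {
  channel_name: "train",
  data_source: {
    s3_data_source: {
      s3_data_type: "S3Prefix",
      s3_uri: "s3://example-bucket/training-data/",
      s3_data_distribution_type: "FullyReplicated",
    },
  },
  content_type: "text/csv",
  input_mode: "File",
}

# In real use this Hash would be passed to the client, e.g.:
#   client = Aws::SageMaker::Client.new
#   client.create_training_job(input_data_config: [train_channel], ...)
```

Because the SDK accepts vanilla Hashes, no Types constructors are needed; the client validates the shape against the Channel structure at call time.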
Instance Attribute Summary
- #channel_name ⇒ String
  The name of the channel.
- #compression_type ⇒ String
  If training data is compressed, the compression type.
- #content_type ⇒ String
  The MIME type of the data.
- #data_source ⇒ Types::DataSource
  The location of the channel data.
- #input_mode ⇒ String
  (Optional) The input mode to use for the data channel in a training job.
- #record_wrapper_type ⇒ String
  Specify RecordIO as the value when input data is in raw format but the training algorithm requires the RecordIO format.
- #shuffle_config ⇒ Types::ShuffleConfig
  A configuration for a shuffle option for input data in a channel.
Instance Attribute Details
#channel_name ⇒ String
The name of the channel.
#compression_type ⇒ String
If training data is compressed, the compression type. The default value is None. CompressionType is used only in Pipe input mode. In File mode, leave this field unset or set it to None.
Possible values:
- None
- Gzip
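A minimal sketch of a channel that streams gzipped data, assuming Pipe input mode (the channel name and S3 URI are hypothetical):

```ruby
# CompressionType applies only in Pipe input mode; in File mode this field
# would be left unset or set to "None". Values below are placeholders.
gzip_channel = {
  channel_name: "train",
  data_source: {
    s3_data_source: {
      s3_data_type: "S3Prefix",
      s3_uri: "s3://example-bucket/compressed-data/",
    },
  },
  compression_type: "Gzip",
  input_mode: "Pipe",
}
```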
#content_type ⇒ String
The MIME type of the data.
#data_source ⇒ Types::DataSource
The location of the channel data.
#input_mode ⇒ String
(Optional) The input mode to use for the data channel in a training job. If you don't set a value for InputMode, Amazon SageMaker uses the value set for TrainingInputMode. Use this parameter to override the TrainingInputMode setting in an AlgorithmSpecification request when you have a channel that needs a different input mode from the training job's general setting. To download the data from Amazon Simple Storage Service (Amazon S3) to the provisioned ML storage volume, and mount the directory to a Docker volume, use File input mode. To stream data directly from Amazon S3 to the container, choose Pipe input mode. To use a model for incremental training, choose File input mode.
Possible values:
- Pipe
- File
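A sketch of the per-channel override described above: the job-wide TrainingInputMode is File, while one channel (the hypothetical "eval" channel below, with a placeholder training image and URI) opts into Pipe mode:

```ruby
# Job-wide setting: all channels default to File mode. The ECR image URI
# is a placeholder, not a real image.
algorithm_specification = {
  training_image: "123456789012.dkr.ecr.us-east-1.amazonaws.com/example:latest",
  training_input_mode: "File",
}

# This channel's :input_mode overrides the job-wide "File" setting,
# so only this channel streams data directly from S3.
eval_channel = {
  channel_name: "eval",
  data_source: {
    s3_data_source: {
      s3_data_type: "S3Prefix",
      s3_uri: "s3://example-bucket/eval-data/",
    },
  },
  input_mode: "Pipe",
}
```

Channels that omit :input_mode simply inherit the TrainingInputMode value.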
#record_wrapper_type ⇒ String
Specify RecordIO as the value when input data is in raw format but the training algorithm requires the RecordIO format. In this case, Amazon SageMaker wraps each individual S3 object in a RecordIO record. If the input data is already in RecordIO format, you don't need to set this attribute. For more information, see Create a Dataset Using RecordIO.
In File mode, leave this field unset or set it to None.
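A sketch of a channel whose raw S3 objects are wrapped into RecordIO records by SageMaker (channel name and URI are placeholders):

```ruby
# Each raw S3 object under the prefix is wrapped in a RecordIO record.
# If the data were already RecordIO, :record_wrapper_type would be omitted.
recordio_channel = {
  channel_name: "train",
  data_source: {
    s3_data_source: {
      s3_data_type: "S3Prefix",
      s3_uri: "s3://example-bucket/raw-images/",
    },
  },
  record_wrapper_type: "RecordIO",
}
```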
#shuffle_config ⇒ Types::ShuffleConfig
A configuration for a shuffle option for input data in a channel. If you use S3Prefix for S3DataType, this shuffles the results of the S3 key prefix matches. If you use ManifestFile, the order of the S3 object references in the ManifestFile is shuffled. If you use AugmentedManifestFile, the order of the JSON lines in the AugmentedManifestFile is shuffled. The shuffling order is determined using the Seed value.
For Pipe input mode, shuffling is done at the start of every epoch. With large datasets, this ensures that the order of the training data is different for each epoch, which helps reduce bias and possible overfitting. In a multi-node training job, when ShuffleConfig is combined with an S3DataDistributionType of ShardedByS3Key, the data is shuffled across nodes so that the content sent to a particular node on the first epoch might be sent to a different node on the second epoch.
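A sketch of a sharded channel with per-epoch shuffling, as described above (channel name, URI, and seed are illustrative):

```ruby
# S3Prefix + ShuffleConfig: the S3 key-prefix matches are shuffled each
# epoch, ordered by the required :seed. Combined with ShardedByS3Key,
# shards may land on different nodes between epochs.
shuffled_channel = {
  channel_name: "train",
  data_source: {
    s3_data_source: {
      s3_data_type: "S3Prefix",
      s3_uri: "s3://example-bucket/training-data/",
      s3_data_distribution_type: "ShardedByS3Key",
    },
  },
  shuffle_config: {
    seed: 1, # required when shuffle_config is present
  },
}
```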