BlazingText Hyperparameters - Amazon SageMaker

BlazingText Hyperparameters

When you start a training job with a CreateTrainingJob request, you specify a training algorithm. You can also specify algorithm-specific hyperparameters as string-to-string maps. The hyperparameters for the BlazingText algorithm depend on which mode you use: Word2Vec (unsupervised) or Text Classification (supervised).
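Because hyperparameters are passed as string-to-string maps, numeric and Boolean values have to be converted to strings before they go into the request. The following is a minimal sketch of that conversion in plain Python. The helper name `to_string_map` is hypothetical and is not part of any SageMaker SDK.

```python
def to_string_map(hyperparameters):
    """Convert every key and value to a string.

    Hypothetical helper for illustration only; it is not a SageMaker API.
    """
    return {str(name): str(value) for name, value in hyperparameters.items()}


# Values taken from the hyperparameter tables on this page.
word2vec_params = to_string_map({
    "mode": "skipgram",
    "epochs": 5,
    "learning_rate": 0.05,
    "subwords": False,
})
print(word2vec_params)
```

Python renders `False` as the string `"False"`, which matches the Boolean spelling listed under the valid values on this page.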

Word2Vec Hyperparameters

The following table lists the hyperparameters for the BlazingText Word2Vec training algorithm provided by Amazon SageMaker.

Parameter Name Description
mode

The Word2Vec architecture used for training.

Required

Valid values: batch_skipgram, skipgram, or cbow

batch_size

The size of each batch when mode is set to batch_skipgram. Set to a number between 10 and 20.

Optional

Valid values: Positive integer

Default value: 11

buckets

The number of hash buckets to use for subwords.

Optional

Valid values: Positive integer

Default value: 2000000

epochs

The number of complete passes through the training data.

Optional

Valid values: Positive integer

Default value: 5

evaluation

Whether the trained model is evaluated using the WordSimilarity-353 Test.

Optional

Valid values: (Boolean) True or False

Default value: True

learning_rate

The step size used for parameter updates.

Optional

Valid values: Positive float

Default value: 0.05

min_char

The minimum number of characters to use for subwords/character n-grams.

Optional

Valid values: Positive integer

Default value: 3

min_count

Words that appear fewer than min_count times are discarded.

Optional

Valid values: Non-negative integer

Default value: 5

max_char

The maximum number of characters to use for subwords/character n-grams.

Optional

Valid values: Positive integer

Default value: 6

negative_samples

The number of negative samples for the negative sample sharing strategy.

Optional

Valid values: Positive integer

Default value: 5

sampling_threshold

The threshold for the occurrence of words. Words whose frequency in the training data exceeds this threshold are randomly down-sampled.

Optional

Valid values: Positive fraction. The recommended range is (0, 1e-3].

Default value: 0.0001

subwords

Whether to learn subword embeddings or not.

Optional

Valid values: (Boolean) True or False

Default value: False

vector_dim

The dimension of the word vectors that the algorithm learns.

Optional

Valid values: Positive integer

Default value: 100

window_size

The size of the context window. The context window is the number of words surrounding the target word used for training.

Optional

Valid values: Positive integer

Default value: 5
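To make the interaction between min_char, max_char, and buckets more concrete, the sketch below enumerates the character n-grams of a word and maps each one to a hash bucket. It is an illustration under stated assumptions, not BlazingText's implementation: the function names are hypothetical, the n-grams come from a plain sliding window with no word-boundary markers, and CRC32 is only a stand-in for whatever hash function the algorithm actually uses.

```python
import zlib


def char_ngrams(word, min_char=3, max_char=6):
    """Return every character n-gram of word with min_char <= n <= max_char.

    Assumption: a plain sliding window over the characters.
    """
    ngrams = []
    for n in range(min_char, max_char + 1):
        for start in range(len(word) - n + 1):
            ngrams.append(word[start:start + n])
    return ngrams


def bucket_index(ngram, buckets=2000000):
    """Map an n-gram to one of `buckets` slots.

    Assumption: CRC32 stands in for the real (unspecified) hash function.
    """
    return zlib.crc32(ngram.encode("utf-8")) % buckets


ngrams = char_ngrams("where")
indices = [bucket_index(g) for g in ngrams]
print(ngrams)  # ['whe', 'her', 'ere', 'wher', 'here', 'where']
```

With the default values, a word shorter than min_char characters yields no n-grams in this sketch, and distinct n-grams can collide in the same bucket, which is why a larger buckets value reduces collisions at the cost of memory.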

Text Classification Hyperparameters

The following table lists the hyperparameters for the Text Classification training algorithm provided by Amazon SageMaker.

Note

Although some of the parameters are common between the Text Classification and Word2Vec modes, they might have different meanings depending on the context.

Parameter Name Description
mode

The training mode.

Required

Valid values: supervised

buckets

The number of hash buckets to use for word n-grams.

Optional

Valid values: Positive integer

Default value: 2000000

early_stopping

Whether to stop training if validation accuracy doesn't improve for the number of epochs specified by patience. A validation channel is required if early stopping is used.

Optional

Valid values: (Boolean) True or False

Default value: False

epochs

The maximum number of complete passes through the training data.

Optional

Valid values: Positive integer

Default value: 5

learning_rate

The step size used for parameter updates.

Optional

Valid values: Positive float

Default value: 0.05

min_count

Words that appear fewer than min_count times are discarded.

Optional

Valid values: Non-negative integer

Default value: 5

min_epochs

The minimum number of epochs to train before early stopping logic is invoked.

Optional

Valid values: Positive integer

Default value: 5

patience

The number of epochs to wait before applying early stopping when no progress is made on the validation set. Used only when early_stopping is True.

Optional

Valid values: Positive integer

Default value: 4

vector_dim

The dimension of the embedding layer.

Optional

Valid values: Positive integer

Default value: 100

word_ngrams

The number of word n-gram features to use.

Optional

Valid values: Positive integer

Default value: 2
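The way early_stopping, epochs, min_epochs, and patience interact can be sketched as a small simulation over a list of per-epoch validation accuracies. This is one plausible reading of the descriptions above, not the algorithm's actual code: the function name is hypothetical, and the sketch assumes that epochs without improvement are counted from the first epoch but only acted on once min_epochs has been reached.

```python
def epochs_until_stop(val_accuracies, epochs=5, min_epochs=5, patience=4):
    """Return how many epochs would run under the assumed early-stopping rule.

    Hypothetical helper: training stops once at least min_epochs epochs
    have run and validation accuracy has not improved for `patience`
    consecutive epochs; otherwise it runs for at most `epochs` epochs.
    """
    best = float("-inf")
    epochs_without_improvement = 0
    completed = 0
    for epoch, accuracy in enumerate(val_accuracies[:epochs], start=1):
        completed = epoch
        if accuracy > best:
            best = accuracy
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
        if epoch >= min_epochs and epochs_without_improvement >= patience:
            break
    return completed


# Made-up accuracies, for illustration only.
history = [0.70, 0.72, 0.71, 0.71, 0.70, 0.69, 0.72, 0.71]
print(epochs_until_stop(history, epochs=10, min_epochs=3, patience=2))  # 4
```

Under this reading, the default values (epochs 5, min_epochs 5, patience 4) never stop training early, because the early-stopping check first becomes active at the final epoch. Raising epochs above min_epochs is what gives the rule room to act.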