GetMLTransform - AWS Glue

GetMLTransform

Gets an AWS Glue machine learning transform artifact and all its corresponding metadata. Machine learning transforms are a special type of transform that learn the details of the transformation to perform from examples provided by humans. AWS Glue saves these transforms, and you can retrieve their metadata by calling GetMLTransform.

Request Syntax

{ "TransformId": "string" }

Request Parameters

For information about the parameters that are common to all actions, see Common Parameters.

The request accepts the following data in JSON format.

TransformId

The unique identifier of the transform, generated at the time that the transform was created.

Type: String

Length Constraints: Minimum length of 1. Maximum length of 255.

Pattern: [\u0020-\uD7FF\uE000-\uFFFD\uD800\uDC00-\uDBFF\uDFFF\t]*

Required: Yes
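The constraints on TransformId can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: the helper name build_request is hypothetical, the length check counts Python code points (the service may count differently), and the surrogate-pair range in the documented pattern is assumed to mean the supplementary planes U+10000 to U+10FFFF.

```python
import json
import re

# Python translation of the documented pattern. The surrogate-pair range in
# the original is written here as U+10000..U+10FFFF, which is an assumption
# about the pattern's intent.
_TRANSFORM_ID_RE = re.compile(
    r"[\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF\t]*"
)


def build_request(transform_id: str) -> str:
    """Return the JSON request body, or raise ValueError if the ID is invalid."""
    if not isinstance(transform_id, str):
        raise ValueError("TransformId must be a string")
    if not 1 <= len(transform_id) <= 255:
        raise ValueError("TransformId must be 1 to 255 characters long")
    if _TRANSFORM_ID_RE.fullmatch(transform_id) is None:
        raise ValueError("TransformId contains characters outside the documented pattern")
    return json.dumps({"TransformId": transform_id})
```

Failing early on an empty or over-long identifier avoids a round trip that would otherwise be expected to end in InvalidInputException.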

Response Syntax

{
   "CreatedOn": number,
   "Description": "string",
   "EvaluationMetrics": {
      "FindMatchesMetrics": {
         "AreaUnderPRCurve": number,
         "ColumnImportances": [
            {
               "ColumnName": "string",
               "Importance": number
            }
         ],
         "ConfusionMatrix": {
            "NumFalseNegatives": number,
            "NumFalsePositives": number,
            "NumTrueNegatives": number,
            "NumTruePositives": number
         },
         "F1": number,
         "Precision": number,
         "Recall": number
      },
      "TransformType": "string"
   },
   "GlueVersion": "string",
   "InputRecordTables": [
      {
         "AdditionalOptions": {
            "string" : "string"
         },
         "CatalogId": "string",
         "ConnectionName": "string",
         "DatabaseName": "string",
         "TableName": "string"
      }
   ],
   "LabelCount": number,
   "LastModifiedOn": number,
   "MaxCapacity": number,
   "MaxRetries": number,
   "Name": "string",
   "NumberOfWorkers": number,
   "Parameters": {
      "FindMatchesParameters": {
         "AccuracyCostTradeoff": number,
         "EnforceProvidedLabels": boolean,
         "PrecisionRecallTradeoff": number,
         "PrimaryKeyColumnName": "string"
      },
      "TransformType": "string"
   },
   "Role": "string",
   "Schema": [
      {
         "DataType": "string",
         "Name": "string"
      }
   ],
   "Status": "string",
   "Timeout": number,
   "TransformEncryption": {
      "MlUserDataEncryption": {
         "KmsKeyId": "string",
         "MlUserDataEncryptionMode": "string"
      },
      "TaskRunSecurityConfigurationName": "string"
   },
   "TransformId": "string",
   "WorkerType": "string"
}

Response Elements

If the action is successful, the service sends back an HTTP 200 response.

The following data is returned in JSON format by the service.
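As an illustration of reading the response, the sketch below calls the operation through a client object and picks out a few top-level fields. It assumes the client exposes get_ml_transform(TransformId=...), as the boto3 Glue client does; the function name summarize_transform and the choice of fields are hypothetical.

```python
from typing import Any, Dict


def summarize_transform(client: Any, transform_id: str) -> Dict[str, Any]:
    """Call GetMLTransform and pick out a few top-level response fields.

    `client` is assumed to expose get_ml_transform(TransformId=...), as the
    boto3 Glue client does. Fields are read with .get() because a response
    may omit elements that are not set.
    """
    response = client.get_ml_transform(TransformId=transform_id)
    return {
        "id": response.get("TransformId"),
        "name": response.get("Name"),
        "status": response.get("Status"),
        "type": response.get("Parameters", {}).get("TransformType"),
        "label_count": response.get("LabelCount"),
    }
```

With boto3 installed and credentials configured, a call would look like summarize_transform(boto3.client("glue"), "your-transform-id"), where the identifier is a placeholder.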

CreatedOn

The date and time when the transform was created.

Type: Timestamp

Description

A description of the transform.

Type: String

Length Constraints: Minimum length of 0. Maximum length of 2048.

Pattern: [\u0020-\uD7FF\uE000-\uFFFD\uD800\uDC00-\uDBFF\uDFFF\r\n\t]*

EvaluationMetrics

The latest evaluation metrics.

Type: EvaluationMetrics object
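The response syntax shows that FindMatchesMetrics carries both a ConfusionMatrix and the Precision, Recall, and F1 values. The sketch below recomputes those three values from the confusion matrix counts using the standard textbook definitions. Whether the service derives its reported values in exactly this way is an assumption, and the function name is hypothetical.

```python
from typing import Dict


def metrics_from_confusion_matrix(cm: Dict[str, int]) -> Dict[str, float]:
    """Compute precision, recall, and F1 from ConfusionMatrix counts."""
    tp = cm["NumTruePositives"]
    fp = cm["NumFalsePositives"]
    fn = cm["NumFalseNegatives"]
    # Guard against empty denominators; 0.0 is a convention chosen here.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"Precision": precision, "Recall": recall, "F1": f1}
```

Comparing the recomputed values with the reported ones can serve as a quick sanity check when inspecting a transform's evaluation metrics.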

GlueVersion

This value determines which version of AWS Glue this machine learning transform is compatible with. Glue 1.0 is recommended for most customers. If the value is not set, the Glue compatibility defaults to Glue 0.9. For more information, see AWS Glue Versions in the developer guide.

Type: String

Length Constraints: Minimum length of 1. Maximum length of 255.

Pattern: ^\w+\.\w+$

InputRecordTables

A list of AWS Glue table definitions used by the transform.

Type: Array of GlueTable objects

Array Members: Minimum number of 0 items. Maximum number of 10 items.

LabelCount

The number of labels available for this transform.

Type: Integer

LastModifiedOn

The date and time when the transform was last modified.

Type: Timestamp
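CreatedOn and LastModifiedOn appear as numbers in the response syntax. The sketch below assumes that a raw JSON response encodes them as seconds since the Unix epoch and converts them to timezone-aware datetimes. SDKs such as boto3 typically perform this conversion already, in which case the helper is unnecessary. The function names are hypothetical.

```python
from datetime import datetime, timezone


def parse_timestamp(epoch_seconds: float) -> datetime:
    """Convert an epoch-seconds value to an aware UTC datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


def seconds_since_creation(created_on: float, last_modified_on: float) -> float:
    """Seconds elapsed between creation and the last modification."""
    delta = parse_timestamp(last_modified_on) - parse_timestamp(created_on)
    return delta.total_seconds()
```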

MaxCapacity

The number of AWS Glue data processing units (DPUs) that are allocated to task runs for this transform. You can allocate from 2 to 100 DPUs; the default is 10. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, see the AWS Glue pricing page.

When the WorkerType field is set to a value other than Standard, the MaxCapacity field is set automatically and becomes read-only.

Type: Double
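The DPU definition above translates directly into arithmetic. The following sketch converts a MaxCapacity value into total vCPUs and memory, using the 4 vCPUs and 16 GB per DPU and the 2 to 100 range stated in the description. The function name and the returned keys are hypothetical.

```python
from typing import Dict

VCPUS_PER_DPU = 4        # from the DPU definition above
MEMORY_GB_PER_DPU = 16   # from the DPU definition above


def dpu_resources(max_capacity: float) -> Dict[str, float]:
    """Total compute implied by a MaxCapacity value expressed in DPUs."""
    if not 2 <= max_capacity <= 100:
        raise ValueError("MaxCapacity must be between 2 and 100 DPUs")
    return {
        "vcpus": max_capacity * VCPUS_PER_DPU,
        "memory_gb": max_capacity * MEMORY_GB_PER_DPU,
    }
```

For the default of 10 DPUs this gives 40 vCPUs and 160 GB of memory.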

MaxRetries

The maximum number of times to retry a task for this transform after a task run fails.

Type: Integer

Name

The unique name given to the transform when it was created.

Type: String

Length Constraints: Minimum length of 1. Maximum length of 255.

Pattern: [\u0020-\uD7FF\uE000-\uFFFD\uD800\uDC00-\uDBFF\uDFFF\t]*

NumberOfWorkers

The number of workers of the defined WorkerType that are allocated when this task runs.

Type: Integer

Parameters

The configuration parameters that are specific to the algorithm used.

Type: TransformParameters object

Role

The name or Amazon Resource Name (ARN) of the IAM role with the required permissions.

Type: String

Schema

The Map<Column, Type> object that represents the schema that this transform accepts. The schema has an upper bound of 100 columns.

Type: Array of SchemaColumn objects

Array Members: Maximum number of 100 items.

Status

The last known status of the transform, which indicates whether the transform can be used. One of NOT_READY, READY, or DELETING.

Type: String

Valid Values: NOT_READY | READY | DELETING
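A caller that needs a usable transform can poll the Status value. The sketch below is one possible approach: it takes a caller-supplied function that returns the current status string, for example a wrapper around GetMLTransform. It assumes that NOT_READY can later become READY and that DELETING never does. The function name, attempt count, and delay are hypothetical choices.

```python
import time
from typing import Callable


def wait_until_ready(
    get_status: Callable[[], str],
    attempts: int = 10,
    delay_seconds: float = 5.0,
) -> bool:
    """Poll until the status is READY; return False if that never happens."""
    for attempt in range(attempts):
        status = get_status()
        if status == "READY":
            return True
        if status == "DELETING":
            # Assumed terminal for this purpose: the transform is going away.
            return False
        if attempt < attempts - 1:
            time.sleep(delay_seconds)
    return False
```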

Timeout

The timeout for a task run for this transform in minutes. This is the maximum time that a task run for this transform can consume resources before it is terminated and enters TIMEOUT status. The default is 2,880 minutes (48 hours).

Type: Integer

Valid Range: Minimum value of 1.
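Because Timeout is expressed in minutes, converting it to a duration type can make deadline arithmetic less error-prone. This is a small sketch using the documented default of 2,880 minutes and minimum of 1; the function name is hypothetical.

```python
from datetime import timedelta

DEFAULT_TIMEOUT_MINUTES = 2880  # documented default (48 hours)


def timeout_as_timedelta(timeout_minutes: int = DEFAULT_TIMEOUT_MINUTES) -> timedelta:
    """Convert a Timeout value in minutes to a timedelta."""
    if timeout_minutes < 1:
        raise ValueError("Timeout must be at least 1 minute")
    return timedelta(minutes=timeout_minutes)
```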

TransformEncryption

The encryption-at-rest settings of the transform that apply to accessing user data. Machine learning transforms can access user data encrypted in Amazon S3 using KMS.

Type: TransformEncryption object

TransformId

The unique identifier of the transform, generated at the time that the transform was created.

Type: String

Length Constraints: Minimum length of 1. Maximum length of 255.

Pattern: [\u0020-\uD7FF\uE000-\uFFFD\uD800\uDC00-\uDBFF\uDFFF\t]*

WorkerType

The type of predefined worker that is allocated when this task runs. Accepts a value of Standard, G.1X, or G.2X.

  • For the Standard worker type, each worker provides 4 vCPUs, 16 GB of memory, and a 50 GB disk, with 2 executors per worker.

  • For the G.1X worker type, each worker provides 4 vCPUs, 16 GB of memory, and a 64 GB disk, with 1 executor per worker.

  • For the G.2X worker type, each worker provides 8 vCPUs, 32 GB of memory, and a 128 GB disk, with 1 executor per worker.

Type: String

Valid Values: Standard | G.1X | G.2X | G.025X | G.4X | G.8X | Z.2X
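The per-worker figures listed above can be combined with NumberOfWorkers to estimate the total resources of a task run. The sketch below covers only the three worker types whose specifications are given in this section; the other valid values are not described here, so the helper rejects them rather than guessing. The names WORKER_SPECS and fleet_resources are hypothetical.

```python
from typing import Dict

# Per-worker figures copied from the list above.
WORKER_SPECS: Dict[str, Dict[str, int]] = {
    "Standard": {"vcpus": 4, "memory_gb": 16, "disk_gb": 50, "executors": 2},
    "G.1X": {"vcpus": 4, "memory_gb": 16, "disk_gb": 64, "executors": 1},
    "G.2X": {"vcpus": 8, "memory_gb": 32, "disk_gb": 128, "executors": 1},
}


def fleet_resources(worker_type: str, number_of_workers: int) -> Dict[str, int]:
    """Total resources for number_of_workers workers of the given type."""
    spec = WORKER_SPECS.get(worker_type)
    if spec is None:
        raise ValueError(f"No specification listed for worker type {worker_type!r}")
    if number_of_workers < 1:
        raise ValueError("number_of_workers must be at least 1")
    return {key: value * number_of_workers for key, value in spec.items()}
```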

Errors

For information about the errors that are common to all actions, see Common Errors.

EntityNotFoundException

A specified entity does not exist.

HTTP Status Code: 400

InternalServiceException

An internal service error occurred.

HTTP Status Code: 500

InvalidInputException

The input provided was not valid.

HTTP Status Code: 400

OperationTimeoutException

The operation timed out.

HTTP Status Code: 400
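The errors above can be handled by name. The sketch below maps each documented error to its HTTP status code and adds a retry hint. The retry classification is an assumption made for illustration, not something stated in this reference, and the function name describe_error is hypothetical.

```python
from typing import Dict, Optional, Union

# HTTP status codes as documented in the Errors section.
HTTP_STATUS_BY_ERROR: Dict[str, int] = {
    "EntityNotFoundException": 400,
    "InternalServiceException": 500,
    "InvalidInputException": 400,
    "OperationTimeoutException": 400,
}

# Assumption: server-side and timeout failures may succeed on retry,
# while a missing entity or invalid input will not.
_ASSUMED_RETRYABLE = {"InternalServiceException", "OperationTimeoutException"}


def describe_error(error_code: str) -> Dict[str, Union[str, bool, Optional[int]]]:
    """Return the documented HTTP status and an assumed retry hint."""
    return {
        "code": error_code,
        "http_status": HTTP_STATUS_BY_ERROR.get(error_code),
        "retryable": error_code in _ASSUMED_RETRYABLE,
    }
```

With boto3, the error name is typically available from a caught botocore ClientError as err.response["Error"]["Code"].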

See Also

For more information about using this API in one of the language-specific AWS SDKs, see the following: