You are viewing documentation for version 2 of the AWS SDK for Ruby. Version 3 documentation can be found here.

Class: Aws::MachineLearning::Types::CreateDataSourceFromRedshiftInput

Inherits:
Struct
  • Object
Defined in:
(unknown)

Overview

Note:

When passing CreateDataSourceFromRedshiftInput as input to an Aws::Client method, you can use a vanilla Hash:

{
  data_source_id: "EntityId", # required
  data_source_name: "EntityName",
  data_spec: { # required
    database_information: { # required
      database_name: "RedshiftDatabaseName", # required
      cluster_identifier: "RedshiftClusterIdentifier", # required
    },
    select_sql_query: "RedshiftSelectSqlQuery", # required
    database_credentials: { # required
      username: "RedshiftDatabaseUsername", # required
      password: "RedshiftDatabasePassword", # required
    },
    s3_staging_location: "S3Url", # required
    data_rearrangement: "DataRearrangement",
    data_schema: "DataSchema",
    data_schema_uri: "S3Url",
  },
  role_arn: "RoleARN", # required
  compute_statistics: false,
}
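
As an illustration, the Hash above could be assembled in plain Ruby before being handed to the client. This is a minimal sketch: the helper name `redshift_data_source_params` is hypothetical, and every identifier, bucket, query, and ARN is a placeholder. The commented-out call assumes the version 2 `aws-sdk` gem is installed and credentials are configured.

```ruby
# Hypothetical helper: assembles the request Hash described above.
# Every value is a placeholder and does not refer to a real resource.
def redshift_data_source_params(password:)
  {
    data_source_id: "example-ds-001",
    data_source_name: "Example Redshift data source",
    data_spec: {
      database_information: {
        database_name: "exampledb",
        cluster_identifier: "example-cluster",
      },
      select_sql_query: "SELECT * FROM example_table",
      database_credentials: {
        username: "example_user",
        password: password,
      },
      s3_staging_location: "s3://example-bucket/staging/",
      data_schema_uri: "s3://example-bucket/schema.json",
    },
    role_arn: "arn:aws:iam::123456789012:role/ExampleAmazonMLRole",
    compute_statistics: true,
  }
end

# Read the password from the environment rather than hard-coding it.
# REDSHIFT_PASSWORD is an assumed variable name.
params = redshift_data_source_params(
  password: ENV.fetch("REDSHIFT_PASSWORD", "placeholder")
)

# With the gem installed and credentials configured, the Hash can be
# passed directly to the client:
#
#   require "aws-sdk"
#   client = Aws::MachineLearning::Client.new(region: "us-east-1")
#   client.create_data_source_from_redshift(params)
```

Keeping the password out of the source file is a design choice of this sketch, not a requirement of the API.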

Instance Attribute Summary

Instance Attribute Details

#compute_statistics ⇒ Boolean

Whether Amazon ML computes statistics for the DataSource. The statistics are generated from the observation data referenced by the DataSource. Amazon ML uses the statistics internally during MLModel training. Set this parameter to true if the DataSource will be used for MLModel training.

Returns:

  • (Boolean)

    Whether Amazon ML computes statistics for the DataSource.

#data_source_id ⇒ String

A user-supplied ID that uniquely identifies the DataSource.

Returns:

  • (String)

    A user-supplied ID that uniquely identifies the DataSource.

#data_source_name ⇒ String

A user-supplied name or description of the DataSource.

Returns:

  • (String)

    A user-supplied name or description of the DataSource.

#data_spec ⇒ Types::RedshiftDataSpec

The data specification of an Amazon Redshift DataSource:

  • DatabaseInformation -

    • DatabaseName - The name of the Amazon Redshift database.

    • ClusterIdentifier - The unique ID for the Amazon Redshift cluster.

  • DatabaseCredentials - The AWS Identity and Access Management (IAM) credentials that are used to connect to the Amazon Redshift database.

  • SelectSqlQuery - The query that is used to retrieve the observation data for the DataSource.

  • S3StagingLocation - The Amazon Simple Storage Service (Amazon S3) location for staging Amazon Redshift data. The data retrieved from Amazon Redshift using the SelectSqlQuery query is stored in this location.

  • DataSchemaUri - The Amazon S3 location of the DataSchema.

  • DataSchema - A JSON string representing the schema. This is not required if DataSchemaUri is specified.

  • DataRearrangement - A JSON string that represents the splitting and rearrangement requirements for the DataSource.

    Sample - "{\"splitting\":{\"percentBegin\":10,\"percentEnd\":60}}"
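
Because DataRearrangement is a JSON string and not a nested Hash, it may be easier to generate it than to escape it by hand. A minimal sketch using Ruby's standard `json` library follows. The helper name `data_rearrangement` is hypothetical, and the values are taken from the sample above.

```ruby
require "json"

# Hypothetical helper: serializes a percentage split into the JSON string
# form shown in the sample above.
def data_rearrangement(percent_begin, percent_end)
  JSON.generate(
    { splitting: { percentBegin: percent_begin, percentEnd: percent_end } }
  )
end

rearrangement = data_rearrangement(10, 60)
# rearrangement is now the String {"splitting":{"percentBegin":10,"percentEnd":60}}
# and can be used as the value of :data_rearrangement inside :data_spec.
```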

Returns:

  • (Types::RedshiftDataSpec)

    The data specification of an Amazon Redshift DataSource.

#role_arn ⇒ String

A fully specified role Amazon Resource Name (ARN). Amazon ML assumes the role on behalf of the user to create the following:

  • A security group to allow Amazon ML to execute the SelectSqlQuery query on an Amazon Redshift cluster

  • An Amazon S3 bucket policy to grant Amazon ML read/write permissions on the S3StagingLocation
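
To illustrate what a fully specified role ARN looks like, the sketch below applies a loose client-side shape check before a request is sent. The pattern is an assumption based on the general IAM role ARN layout (`arn:<partition>:iam::<account-id>:role/<name>`). The constant `ROLE_ARN_PATTERN` and the method `plausible_role_arn?` are hypothetical names, and the service performs its own validation regardless.

```ruby
# Assumed shape of an IAM role ARN:
#   arn:<partition>:iam::<12-digit account id>:role/<optional path and role name>
ROLE_ARN_PATTERN = %r{\Aarn:aws[a-z-]*:iam::\d{12}:role/.+\z}

# Hypothetical pre-flight check; returns true only for Strings that match
# the assumed pattern. It does not confirm that the role exists.
def plausible_role_arn?(value)
  value.is_a?(String) && ROLE_ARN_PATTERN.match?(value)
end

full_arn_ok  = plausible_role_arn?("arn:aws:iam::123456789012:role/ExampleAmazonMLRole")
bare_name_ok = plausible_role_arn?("ExampleAmazonMLRole")
```

Here a full ARN passes the check, while a bare role name does not.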

Returns:

  • (String)

    A fully specified role Amazon Resource Name (ARN).