You are viewing documentation for version 2 of the AWS SDK for Ruby. Version 3 documentation can be found here.

Class: Aws::Glue::Types::StorageDescriptor

Inherits: Struct

Object > Struct > Aws::Glue::Types::StorageDescriptor

Overview

Note:

When passing StorageDescriptor as input to an Aws::Client method, you can use a vanilla Hash:

{
  columns: [
    {
      name: "NameString", # required
      type: "ColumnTypeString",
      comment: "CommentString",
      parameters: {
        "KeyString" => "ParametersMapValue",
      },
    },
  ],
  location: "LocationString",
  input_format: "FormatString",
  output_format: "FormatString",
  compressed: false,
  number_of_buckets: 1,
  serde_info: {
    name: "NameString",
    serialization_library: "NameString",
    parameters: {
      "KeyString" => "ParametersMapValue",
    },
  },
  bucket_columns: ["NameString"],
  sort_columns: [
    {
      column: "NameString", # required
      sort_order: 1, # required
    },
  ],
  parameters: {
    "KeyString" => "ParametersMapValue",
  },
  skewed_info: {
    skewed_column_names: ["NameString"],
    skewed_column_values: ["ColumnValuesString"],
    skewed_column_value_location_maps: {
      "ColumnValuesString" => "ColumnValuesString",
    },
  },
  stored_as_sub_directories: false,
  schema_reference: {
    schema_id: {
      schema_arn: "GlueResourceArn",
      schema_name: "SchemaRegistryNameString",
      registry_name: "SchemaRegistryNameString",
    },
    schema_version_id: "SchemaVersionIdString",
    schema_version_number: 1,
  },
}

Describes the physical storage of table data.
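
As an illustration of the hash form shown above, the following sketch fills in a descriptor for comma-delimited text files. The helper name `csv_storage_descriptor`, the bucket path, and the column names are hypothetical and not part of the SDK; the Hive class names are common choices for delimited text and are given as an assumption, not as values this page prescribes.

```ruby
# Hypothetical helper (not part of the SDK): builds a StorageDescriptor-shaped
# Hash for comma-delimited text files. Only keys listed on this page are used.
def csv_storage_descriptor(location, columns)
  {
    # Each column needs a name; type is optional but usually supplied.
    columns: columns.map { |name, type| { name: name, type: type } },
    location: location,
    # Assumed Hive classes for plain delimited text.
    input_format: "org.apache.hadoop.mapred.TextInputFormat",
    output_format: "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat",
    compressed: false,
    serde_info: {
      serialization_library: "org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe",
      parameters: { "field.delim" => "," },
    },
    stored_as_sub_directories: false,
  }
end

# Illustrative values only.
descriptor = csv_storage_descriptor(
  "s3://example-bucket/sales/",
  { "order_id" => "bigint", "amount" => "double" }
)
puts descriptor[:columns].inspect
```

A hash like this would typically be nested under the `storage_descriptor` key of a table definition when calling a Glue client method that accepts one.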

Instance Attribute Summary

#bucket_columns ⇒ Array<String>
A list of reducer grouping columns, clustering columns, and bucketing columns in the table.
#columns ⇒ Array<Types::Column>
A list of the Columns in the table.
#compressed ⇒ Boolean
True if the data in the table is compressed, or False if not.
#input_format ⇒ String
The input format: SequenceFileInputFormat (binary), or TextInputFormat, or a custom format.
#location ⇒ String
The physical location of the table.
#number_of_buckets ⇒ Integer
The number of buckets. Must be specified if the table contains any dimension columns.
#output_format ⇒ String
The output format: SequenceFileOutputFormat (binary), or IgnoreKeyTextOutputFormat, or a custom format.
#parameters ⇒ Hash<String,String>
The user-supplied properties in key-value form.
#schema_reference ⇒ Types::SchemaReference
An object that references a schema stored in the AWS Glue Schema Registry.
#serde_info ⇒ Types::SerDeInfo
The serialization/deserialization (SerDe) information.
#skewed_info ⇒ Types::SkewedInfo
The information about values that appear frequently in a column (skewed values).
#sort_columns ⇒ Array<Types::Order>
A list specifying the sort order of each bucket in the table.
#stored_as_sub_directories ⇒ Boolean
True if the table data is stored in subdirectories, or False if not.

Instance Attribute Details

#bucket_columns ⇒ `Array<String>`

A list of reducer grouping columns, clustering columns, and bucketing columns in the table.

Returns:

(Array<String>) —
A list of reducer grouping columns, clustering columns, and bucketing columns in the table.

#columns ⇒ `Array<Types::Column>`

A list of the Columns in the table.

Returns:

(Array<Types::Column>) —
A list of the Columns in the table.

#compressed ⇒ `Boolean`

True if the data in the table is compressed, or False if not.

Returns:

(Boolean) —
True if the data in the table is compressed, or False if not.

#input_format ⇒ `String`

The input format: SequenceFileInputFormat (binary), or TextInputFormat, or a custom format.

Returns:

(String) —
The input format: SequenceFileInputFormat (binary), or TextInputFormat, or a custom format.

#location ⇒ `String`

The physical location of the table. By default, this takes the form of the warehouse location, followed by the database location in the warehouse, followed by the table name.

Returns:

(String) —
The physical location of the table.
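
The default form described above (warehouse location, then database location, then table name) can be sketched as a simple path join. The helper `default_table_location` is hypothetical, and the `/` separator and sample values are assumptions for illustration; the service itself decides the actual default.

```ruby
# Hypothetical helper (not part of the SDK): joins the three parts named in the
# description, in order, using "/" as an assumed separator.
def default_table_location(warehouse_location, database_location, table_name)
  parts = [warehouse_location.sub(%r{/+\z}, "")]           # drop trailing slashes
  parts << database_location.gsub(%r{\A/+|/+\z}, "")       # drop surrounding slashes
  parts << table_name
  parts.join("/")
end

# Illustrative values only.
location = default_table_location("s3://example-bucket/warehouse/", "sales.db", "orders")
puts location
```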

#number_of_buckets ⇒ `Integer`

The number of buckets. Must be specified if the table contains any dimension columns.

Returns:

(Integer) —
The number of buckets. Must be specified if the table contains any dimension columns.

#output_format ⇒ `String`

The output format: SequenceFileOutputFormat (binary), or IgnoreKeyTextOutputFormat, or a custom format.

Returns:

(String) —
The output format: SequenceFileOutputFormat (binary), or IgnoreKeyTextOutputFormat, or a custom format.

#parameters ⇒ `Hash<String,String>`

The user-supplied properties in key-value form.

Returns:

(Hash<String,String>) —
The user-supplied properties in key-value form.

#schema_reference ⇒ `Types::SchemaReference`

An object that references a schema stored in the AWS Glue Schema Registry.

When creating a table, you can pass an empty list of columns for the schema, and instead use a schema reference.

Returns:

(Types::SchemaReference) —
An object that references a schema stored in the AWS Glue Schema Registry.
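
A minimal sketch of the pattern described above: the column list is left empty and the schema is identified by a reference instead. The helper name and the registry, schema, and location values are hypothetical. Identifying the schema by name plus registry name, rather than by `schema_arn`, is an assumption made here for brevity.

```ruby
# Hypothetical helper (not part of the SDK): builds a descriptor whose schema
# comes from the Schema Registry instead of an inline column list.
def schema_backed_descriptor(location, registry_name, schema_name, version_number)
  {
    columns: [], # intentionally empty; the schema reference supplies the columns
    location: location,
    schema_reference: {
      schema_id: {
        schema_name: schema_name,
        registry_name: registry_name,
      },
      schema_version_number: version_number,
    },
  }
end

# Illustrative values only.
schema_descriptor = schema_backed_descriptor(
  "s3://example-bucket/events/", "example-registry", "example-schema", 1
)
puts schema_descriptor[:schema_reference].inspect
```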

#serde_info ⇒ `Types::SerDeInfo`

The serialization/deserialization (SerDe) information.

Returns:

(Types::SerDeInfo) —
The serialization/deserialization (SerDe) information.

#skewed_info ⇒ `Types::SkewedInfo`

The information about values that appear frequently in a column (skewed values).

Returns:

(Types::SkewedInfo) —
The information about values that appear frequently in a column (skewed values).

#sort_columns ⇒ `Array<Types::Order>`

A list specifying the sort order of each bucket in the table.

Returns:

(Array<Types::Order>) —
A list specifying the sort order of each bucket in the table.
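
The bucketing attributes (`bucket_columns`, `number_of_buckets`, `sort_columns`) are usually set together. The sketch below shows one way to assemble them. The helper names and sample columns are hypothetical, and the mapping of `sort_order` to 1 for ascending and 0 for descending is an assumption based on the Glue API's `Order` type that should be checked against that type's documentation.

```ruby
# Assumed encoding of sort_order: 1 = ascending, 0 = descending.
def sort_order_value(direction)
  { asc: 1, desc: 0 }.fetch(direction)
end

# Hypothetical helper (not part of the SDK): builds the bucketing-related keys
# of a StorageDescriptor-shaped Hash.
def bucketed_descriptor(bucket_columns, number_of_buckets, sort_spec)
  {
    bucket_columns: bucket_columns,
    number_of_buckets: number_of_buckets,
    # Each entry needs both :column and :sort_order.
    sort_columns: sort_spec.map do |column, direction|
      { column: column, sort_order: sort_order_value(direction) }
    end,
  }
end

# Illustrative values only.
bucketed = bucketed_descriptor(["customer_id"], 8, { "created_at" => :desc })
puts bucketed.inspect
```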

#stored_as_sub_directories ⇒ `Boolean`

True if the table data is stored in subdirectories, or False if not.

Returns:

(Boolean) —
True if the table data is stored in subdirectories, or False if not.
