You are viewing documentation for version 2 of the AWS SDK for Ruby. Version 3 documentation is published separately.

Class: Aws::Firehose::Types::DataFormatConversionConfiguration

Inherits:
Struct
  • Object
Defined in:
(unknown)

Overview

Note:

When passing DataFormatConversionConfiguration as input to an Aws::Client method, you can use a vanilla Hash:

{
  schema_configuration: {
    role_arn: "NonEmptyStringWithoutWhitespace",
    catalog_id: "NonEmptyStringWithoutWhitespace",
    database_name: "NonEmptyStringWithoutWhitespace",
    table_name: "NonEmptyStringWithoutWhitespace",
    region: "NonEmptyStringWithoutWhitespace",
    version_id: "NonEmptyStringWithoutWhitespace",
  },
  input_format_configuration: {
    deserializer: {
      open_x_json_ser_de: {
        convert_dots_in_json_keys_to_underscores: false,
        case_insensitive: false,
        column_to_json_key_mappings: {
          "NonEmptyStringWithoutWhitespace" => "NonEmptyString",
        },
      },
      hive_json_ser_de: {
        timestamp_formats: ["NonEmptyString"],
      },
    },
  },
  output_format_configuration: {
    serializer: {
      parquet_ser_de: {
        block_size_bytes: 1,
        page_size_bytes: 1,
        compression: "UNCOMPRESSED", # accepts UNCOMPRESSED, GZIP, SNAPPY
        enable_dictionary_compression: false,
        max_padding_bytes: 1,
        writer_version: "V1", # accepts V1, V2
      },
      orc_ser_de: {
        stripe_size_bytes: 1,
        block_size_bytes: 1,
        row_index_stride: 1,
        enable_padding: false,
        padding_tolerance: 1.0,
        compression: "NONE", # accepts NONE, ZLIB, SNAPPY
        bloom_filter_columns: ["NonEmptyStringWithoutWhitespace"],
        bloom_filter_false_positive_probability: 1.0,
        dictionary_key_threshold: 1.0,
        format_version: "V0_11", # accepts V0_11, V0_12
      },
    },
  },
  enabled: false,
}

Specifies that you want Kinesis Data Firehose to convert data from the JSON format to the Parquet or ORC format before writing it to Amazon S3. Kinesis Data Firehose uses the serializer and deserializer that you specify, in addition to the column information from the AWS Glue table, to deserialize your input data from JSON and then serialize it to the Parquet or ORC format. For more information, see Kinesis Data Firehose Record Format Conversion.
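
As an illustration of the structure described above, here is a minimal sketch of a Hash that converts JSON input to Snappy-compressed Parquet. The role ARN, database name, table name, and region are hypothetical placeholders, not values taken from this documentation. Such a Hash would typically be nested under the extended S3 destination configuration of a create_delivery_stream call; that call is omitted here so the snippet runs without AWS credentials.

```ruby
# A sketch only: every identifier value below is a placeholder assumption.
conversion_config = {
  enabled: true,
  schema_configuration: {
    role_arn: "arn:aws:iam::123456789012:role/example-firehose-role", # hypothetical role
    database_name: "example_db",     # hypothetical AWS Glue database
    table_name: "example_table",     # hypothetical AWS Glue table
    region: "us-east-1",             # hypothetical region
  },
  input_format_configuration: {
    deserializer: {
      # Read incoming JSON with the OpenX JSON SerDe.
      open_x_json_ser_de: {
        convert_dots_in_json_keys_to_underscores: true,
      },
    },
  },
  output_format_configuration: {
    serializer: {
      # Write Parquet; compression accepts UNCOMPRESSED, GZIP, SNAPPY.
      parquet_ser_de: {
        compression: "SNAPPY",
      },
    },
  },
}

# Inspect the chosen output compression.
puts conversion_config.dig(:output_format_configuration, :serializer, :parquet_ser_de, :compression)
```

Only one serializer (parquet_ser_de) is populated in this sketch because the output is a single format; the orc_ser_de key is simply left out.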

Instance Attribute Details

#enabled ⇒ Boolean

Defaults to true. Set it to false if you want to disable format conversion while preserving the configuration details.

Returns:

  • (Boolean)

    Defaults to true.
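
The following sketch shows one way to disable conversion while keeping the rest of the configuration, as described for #enabled. It assumes the configuration is held as a plain Hash; the table and database names are hypothetical placeholders.

```ruby
# Hypothetical existing configuration (values are placeholders).
existing_config = {
  enabled: true,
  schema_configuration: { database_name: "example_db", table_name: "example_table" },
  input_format_configuration: { deserializer: { hive_json_ser_de: {} } },
  output_format_configuration: { serializer: { orc_ser_de: { compression: "ZLIB" } } },
}

# Hash#merge returns a new Hash, so the original stays untouched and
# every configuration detail other than :enabled is carried over as-is.
disabled_config = existing_config.merge(enabled: false)

puts disabled_config[:enabled]
```

Re-enabling conversion later is then a matter of merging enabled: true back in, with no need to rebuild the nested settings.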

#input_format_configuration ⇒ Types::InputFormatConfiguration

Specifies the deserializer that you want Kinesis Data Firehose to use to convert the format of your data from JSON. This parameter is required if Enabled is set to true.

Returns:

  • (Types::InputFormatConfiguration)

    Specifies the deserializer that you want Kinesis Data Firehose to use to convert the format of your data from JSON.

#output_format_configuration ⇒ Types::OutputFormatConfiguration

Specifies the serializer that you want Kinesis Data Firehose to use to convert the format of your data to the Parquet or ORC format. This parameter is required if Enabled is set to true.

Returns:

  • (Types::OutputFormatConfiguration)

    Specifies the serializer that you want Kinesis Data Firehose to use to convert the format of your data to the Parquet or ORC format.
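
Several attributes of this type are documented as required when Enabled is set to true. A client-side check for that rule could be sketched as follows. The helper name missing_conversion_keys and the constant REQUIRED_WHEN_ENABLED are hypothetical and not part of the SDK; the sketch also assumes an omitted :enabled key is treated as true, matching the documented default.

```ruby
# Hypothetical constant: the keys this page documents as required when enabled.
REQUIRED_WHEN_ENABLED = %i[
  schema_configuration
  input_format_configuration
  output_format_configuration
].freeze

# Hypothetical helper, not an SDK method: returns the required keys that
# are absent (or nil) in the given configuration Hash.
def missing_conversion_keys(config)
  # :enabled defaults to true when omitted.
  return [] unless config.fetch(:enabled, true)

  REQUIRED_WHEN_ENABLED.select { |key| config[key].nil? }
end

partial = { enabled: true, schema_configuration: { table_name: "example_table" } }
p missing_conversion_keys(partial)
```

Running a check like this before calling the service can surface an incomplete configuration earlier than a service-side validation error would.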

#schema_configuration ⇒ Types::SchemaConfiguration

Specifies the AWS Glue Data Catalog table that contains the column information. This parameter is required if Enabled is set to true.

Returns: