You are viewing documentation for version 2 of the AWS SDK for Ruby. Version 3 documentation can be found here.
Class: Aws::Firehose::Types::DataFormatConversionConfiguration
- Inherits: Struct
  - Object
  - Struct
  - Aws::Firehose::Types::DataFormatConversionConfiguration
- Defined in: (unknown)
Overview
When passing DataFormatConversionConfiguration as input to an Aws::Client method, you can use a vanilla Hash:
{
  schema_configuration: {
    role_arn: "NonEmptyStringWithoutWhitespace",
    catalog_id: "NonEmptyStringWithoutWhitespace",
    database_name: "NonEmptyStringWithoutWhitespace",
    table_name: "NonEmptyStringWithoutWhitespace",
    region: "NonEmptyStringWithoutWhitespace",
    version_id: "NonEmptyStringWithoutWhitespace",
  },
  input_format_configuration: {
    deserializer: {
      open_x_json_ser_de: {
        convert_dots_in_json_keys_to_underscores: false,
        case_insensitive: false,
        column_to_json_key_mappings: {
          "NonEmptyStringWithoutWhitespace" => "NonEmptyString",
        },
      },
      hive_json_ser_de: {
        timestamp_formats: ["NonEmptyString"],
      },
    },
  },
  output_format_configuration: {
    serializer: {
      parquet_ser_de: {
        block_size_bytes: 1,
        page_size_bytes: 1,
        compression: "UNCOMPRESSED", # accepts UNCOMPRESSED, GZIP, SNAPPY
        enable_dictionary_compression: false,
        max_padding_bytes: 1,
        writer_version: "V1", # accepts V1, V2
      },
      orc_ser_de: {
        stripe_size_bytes: 1,
        block_size_bytes: 1,
        row_index_stride: 1,
        enable_padding: false,
        padding_tolerance: 1.0,
        compression: "NONE", # accepts NONE, ZLIB, SNAPPY
        bloom_filter_columns: ["NonEmptyStringWithoutWhitespace"],
        bloom_filter_false_positive_probability: 1.0,
        dictionary_key_threshold: 1.0,
        format_version: "V0_11", # accepts V0_11, V0_12
      },
    },
  },
  enabled: false,
}
Specifies that you want Kinesis Data Firehose to convert data from the JSON format to the Parquet or ORC format before writing it to Amazon S3. Kinesis Data Firehose uses the serializer and deserializer that you specify, in addition to the column information from the AWS Glue table, to deserialize your input data from JSON and then serialize it to the Parquet or ORC format. For more information, see Kinesis Data Firehose Record Format Conversion.
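As a rough illustration of the Hash form described above, the sketch below builds a minimal configuration that deserializes JSON with the OpenX JSON SerDe and serializes to Snappy-compressed Parquet. The helper name build_conversion_config, the ARN, and the database, table, and region values are hypothetical placeholders, not part of the SDK. The commented-out request at the end assumes the Hash is nested under extended_s3_destination_configuration when calling create_delivery_stream; check the client documentation for the other required destination settings.

```ruby
# Hypothetical helper (not part of the SDK): returns the plain Hash form of
# DataFormatConversionConfiguration for JSON -> Parquet conversion.
def build_conversion_config(role_arn:, database_name:, table_name:, region:)
  {
    enabled: true,
    # AWS Glue table that supplies the column information.
    schema_configuration: {
      role_arn: role_arn,
      database_name: database_name,
      table_name: table_name,
      region: region,
    },
    # Deserializer used to read the incoming JSON records.
    input_format_configuration: {
      deserializer: { open_x_json_ser_de: { case_insensitive: true } },
    },
    # Serializer used to write the converted records.
    output_format_configuration: {
      serializer: { parquet_ser_de: { compression: "SNAPPY" } },
    },
  }
end

# All values below are placeholders chosen for illustration.
config = build_conversion_config(
  role_arn: "arn:aws:iam::123456789012:role/example-firehose-role",
  database_name: "example_db",
  table_name: "example_table",
  region: "us-east-1"
)

# With the aws-sdk gem loaded, the Hash would typically be passed as
# (assumption: other required destination settings are supplied too):
#
#   Aws::Firehose::Client.new.create_delivery_stream(
#     delivery_stream_name: "example-stream",
#     extended_s3_destination_configuration: {
#       data_format_conversion_configuration: config,
#       # role_arn, bucket_arn, and other destination settings go here
#     }
#   )
```

Only one deserializer and one serializer are set here; the full Hash above lists every accepted key.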
Instance Attribute Summary
- #enabled ⇒ Boolean
  Defaults to true.
- #input_format_configuration ⇒ Types::InputFormatConfiguration
  Specifies the deserializer that you want Kinesis Data Firehose to use to convert the format of your data from JSON.
- #output_format_configuration ⇒ Types::OutputFormatConfiguration
  Specifies the serializer that you want Kinesis Data Firehose to use to convert the format of your data to the Parquet or ORC format.
- #schema_configuration ⇒ Types::SchemaConfiguration
  Specifies the AWS Glue Data Catalog table that contains the column information.
Instance Attribute Details
#enabled ⇒ Boolean
Defaults to true. Set it to false if you want to disable format
conversion while preserving the configuration details.
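A minimal sketch of that toggle, assuming the configuration is held as a plain Hash: the hypothetical helper disable_conversion (not an SDK method) returns a copy with enabled set to false and every other key left untouched, so the settings can be re-enabled later.

```ruby
# Hypothetical helper: switch conversion off without discarding the
# schema, input, or output settings. Hash#merge returns a new Hash,
# so the original configuration is not modified.
def disable_conversion(config)
  config.merge(enabled: false)
end

# Placeholder configuration for illustration only.
existing = {
  enabled: true,
  schema_configuration: { database_name: "example_db", table_name: "example_table" },
}

disabled = disable_conversion(existing)
```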
#input_format_configuration ⇒ Types::InputFormatConfiguration
Specifies the deserializer that you want Kinesis Data Firehose to use to
convert the format of your data from JSON. This parameter is required if
Enabled is set to true.
#output_format_configuration ⇒ Types::OutputFormatConfiguration
Specifies the serializer that you want Kinesis Data Firehose to use to
convert the format of your data to the Parquet or ORC format. This
parameter is required if Enabled is set to true.
#schema_configuration ⇒ Types::SchemaConfiguration
Specifies the AWS Glue Data Catalog table that contains the column
information. This parameter is required if Enabled is set to true.
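The "required if Enabled is set to true" rule shared by the three attributes above can be checked locally before sending a request. The following is only a sketch of such a pre-check; the helper missing_conversion_keys is hypothetical, and the service performs its own validation regardless.

```ruby
# Hypothetical pre-check: lists the keys that are required but absent.
# enabled is treated as true when omitted, matching its documented default.
def missing_conversion_keys(config)
  required = %i[
    schema_configuration
    input_format_configuration
    output_format_configuration
  ]
  return [] unless config.fetch(:enabled, true)

  required.select { |key| config[key].nil? }
end

# Conversion disabled: nothing is required.
none_missing = missing_conversion_keys(enabled: false)

# Conversion enabled with only the schema supplied.
some_missing = missing_conversion_keys(enabled: true, schema_configuration: {})
```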