You are viewing documentation for version 2 of the AWS SDK for Ruby. Version 3 documentation can be found here.

Class: Aws::Firehose::Types::ParquetSerDe

Inherits:
  Struct
    • Object
Defined in:
(unknown)

Overview

Note:

When passing ParquetSerDe as input to an Aws::Client method, you can use a vanilla Hash:

{
  block_size_bytes: 1,
  page_size_bytes: 1,
  compression: "UNCOMPRESSED", # accepts UNCOMPRESSED, GZIP, SNAPPY
  enable_dictionary_compression: false,
  max_padding_bytes: 1,
  writer_version: "V1", # accepts V1, V2
}

A serializer to use for converting data to the Parquet format before storing it in Amazon S3. For more information, see Apache Parquet.
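The hash above is nested inside an extended S3 destination's data format conversion settings when configuring a delivery stream. A minimal sketch, using plain Ruby hashes only (no client call) and writing the byte sizes as explicit MiB arithmetic; the defaults shown are the ones documented below:

```ruby
MIB = 1024 * 1024

# ParquetSerDe settings, spelled out with their documented defaults.
parquet_ser_de = {
  block_size_bytes: 256 * MIB,         # default HDFS block size (min 64 MiB)
  page_size_bytes: 1 * MIB,            # default Parquet page size (min 64 KiB)
  compression: "SNAPPY",               # default compression codec
  enable_dictionary_compression: false,
  max_padding_bytes: 0,                # default: no padding
  writer_version: "V1",                # default writer version
}

# The serializer is selected inside the destination's
# output_format_configuration:
output_format_configuration = {
  serializer: { parquet_ser_de: parquet_ser_de }
}
```

This `output_format_configuration` hash would then be passed under `data_format_conversion_configuration` in the extended S3 destination configuration of a `create_delivery_stream` or `update_destination` call.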

Instance Attribute Summary

Instance Attribute Details

#block_size_bytes ⇒ Integer

The Hadoop Distributed File System (HDFS) block size. This is useful if you intend to copy the data from Amazon S3 to HDFS before querying. The default is 256 MiB and the minimum is 64 MiB. Kinesis Data Firehose uses this value for padding calculations.

Returns:

  • (Integer)

    The Hadoop Distributed File System (HDFS) block size.
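As a sketch of the stated bounds, a hypothetical helper (not part of the SDK) that applies the documented 256 MiB default and 64 MiB minimum:

```ruby
MIB = 1024 * 1024
MIN_BLOCK_SIZE     = 64 * MIB   # documented minimum
DEFAULT_BLOCK_SIZE = 256 * MIB  # documented default

# Returns a block size honoring the documented constraints:
# nil falls back to the default; anything below the minimum is raised to it.
def parquet_block_size(requested = nil)
  return DEFAULT_BLOCK_SIZE if requested.nil?
  [requested, MIN_BLOCK_SIZE].max
end
```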

#compression ⇒ String

The compression code to use over data blocks. The possible values are UNCOMPRESSED, SNAPPY, and GZIP, with the default being SNAPPY. Use SNAPPY for higher decompression speed. Use GZIP if the compression ratio is more important than speed.

Possible values:

  • UNCOMPRESSED
  • GZIP
  • SNAPPY

Returns:

  • (String)

    The compression code to use over data blocks.
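The guidance above maps directly to a choice of codec. A hypothetical selector (illustration only, not an SDK method) encoding that recommendation:

```ruby
# Maps a tuning goal to the codec recommended in this section:
# SNAPPY for decompression speed (the default), GZIP when compression
# ratio matters more than speed.
def parquet_compression(priority)
  case priority
  when :speed then "SNAPPY"
  when :ratio then "GZIP"
  when :none  then "UNCOMPRESSED"
  else raise ArgumentError, "unknown priority: #{priority.inspect}"
  end
end
```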

#enable_dictionary_compression ⇒ Boolean

Indicates whether to enable dictionary compression.

Returns:

  • (Boolean)

    Indicates whether to enable dictionary compression.

#max_padding_bytes ⇒ Integer

The maximum amount of padding to apply. This is useful if you intend to copy the data from Amazon S3 to HDFS before querying. The default is 0.

Returns:

  • (Integer)

    The maximum amount of padding to apply.

#page_size_bytes ⇒ Integer

The Parquet page size. Column chunks are divided into pages. A page is conceptually an indivisible unit (in terms of compression and encoding). The minimum value is 64 KiB and the default is 1 MiB.

Returns:

  • (Integer)

    The Parquet page size.

#writer_version ⇒ String

Indicates the version of row format to output. The possible values are V1 and V2. The default is V1.

Possible values:

  • V1
  • V2

Returns:

  • (String)

    Indicates the version of row format to output.