You are viewing documentation for version 2 of the AWS SDK for Ruby. Version 3 documentation can be found here.

Class: Aws::Firehose::Types::ParquetSerDe

Inherits:
  Struct
    • Object
Defined in:
(unknown)

Overview

Note:

When passing ParquetSerDe as input to an Aws::Client method, you can use a vanilla Hash:

{
  block_size_bytes: 1,
  page_size_bytes: 1,
  compression: "UNCOMPRESSED", # accepts UNCOMPRESSED, GZIP, SNAPPY
  enable_dictionary_compression: false,
  max_padding_bytes: 1,
  writer_version: "V1", # accepts V1, V2
}

A serializer to use for converting data to the Parquet format before storing it in Amazon S3. For more information, see Apache Parquet.
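The hash above is nested inside an extended S3 destination's data format conversion settings when configuring a delivery stream. A minimal sketch, using plain Ruby hashes only (no client call) and writing the byte sizes as explicit MiB arithmetic; the defaults shown are the ones documented below:

```ruby
MIB = 1024 * 1024

# ParquetSerDe settings, spelled out with their documented defaults.
parquet_ser_de = {
  block_size_bytes: 256 * MIB,         # default HDFS block size (min 64 MiB)
  page_size_bytes: 1 * MIB,            # default Parquet page size (min 64 KiB)
  compression: "SNAPPY",               # default compression codec
  enable_dictionary_compression: false,
  max_padding_bytes: 0,                # default: no padding
  writer_version: "V1",                # default writer version
}

# The serializer is selected inside the destination's
# output_format_configuration:
output_format_configuration = {
  serializer: { parquet_ser_de: parquet_ser_de }
}
```

This `output_format_configuration` hash would then be passed under `data_format_conversion_configuration` in the extended S3 destination configuration of a `create_delivery_stream` or `update_destination` call.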

Instance Attribute Summary

Instance Attribute Details

#block_size_bytes ⇒ Integer

The Hadoop Distributed File System (HDFS) block size. This is useful if you intend to copy the data from Amazon S3 to HDFS before querying. The default is 256 MiB and the minimum is 64 MiB. Kinesis Data Firehose uses this value for padding calculations.

Returns:

  • (Integer)

    The Hadoop Distributed File System (HDFS) block size.
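As a sketch of the stated bounds, a hypothetical helper (not part of the SDK) that applies the documented 256 MiB default and 64 MiB minimum:

```ruby
MIB = 1024 * 1024
MIN_BLOCK_SIZE     = 64 * MIB   # documented minimum
DEFAULT_BLOCK_SIZE = 256 * MIB  # documented default

# Returns a block size honoring the documented constraints:
# nil falls back to the default; anything below the minimum is raised to it.
def parquet_block_size(requested = nil)
  return DEFAULT_BLOCK_SIZE if requested.nil?
  [requested, MIN_BLOCK_SIZE].max
end
```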

#compression ⇒ String

The compression code to use over data blocks. The possible values are UNCOMPRESSED, SNAPPY, and GZIP, with the default being SNAPPY. Use SNAPPY for higher decompression speed. Use GZIP if the compression ratio is more important than speed.

Possible values:

  • UNCOMPRESSED
  • GZIP
  • SNAPPY

Returns:

  • (String)

    The compression code to use over data blocks.
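The guidance above maps directly to a choice of codec. A hypothetical selector (illustration only, not an SDK method) encoding that recommendation:

```ruby
# Maps a tuning goal to the codec recommended in this section:
# SNAPPY for decompression speed (the default), GZIP when compression
# ratio matters more than speed.
def parquet_compression(priority)
  case priority
  when :speed then "SNAPPY"
  when :ratio then "GZIP"
  when :none  then "UNCOMPRESSED"
  else raise ArgumentError, "unknown priority: #{priority.inspect}"
  end
end
```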

#enable_dictionary_compression ⇒ Boolean

Indicates whether to enable dictionary compression.

Returns:

  • (Boolean)

    Indicates whether to enable dictionary compression.

#max_padding_bytes ⇒ Integer

The maximum amount of padding to apply. This is useful if you intend to copy the data from Amazon S3 to HDFS before querying. The default is 0.

Returns:

  • (Integer)

    The maximum amount of padding to apply.

#page_size_bytes ⇒ Integer

The Parquet page size. Column chunks are divided into pages. A page is conceptually an indivisible unit (in terms of compression and encoding). The minimum value is 64 KiB and the default is 1 MiB.

Returns:

  • (Integer)

    The Parquet page size.

#writer_version ⇒ String

Indicates the version of row format to output. The possible values are V1 and V2. The default is V1.

Possible values:

  • V1
  • V2

Returns:

  • (String)

    Indicates the version of row format to output.