Class KafkaStreamingSourceOptions
- All Implemented Interfaces:
Serializable, SdkPojo, ToCopyableBuilder<KafkaStreamingSourceOptions.Builder, KafkaStreamingSourceOptions>

Additional options for streaming.
-
Nested Class Summary
-
Method Summary
Modifier and Type / Method / Description:
- final String addRecordTimestamp() : When this option is set to 'true', the data output will contain an additional column named "__src_timestamp" that indicates the time when the corresponding record was received by the topic.
- final String assign() : The specific TopicPartitions to consume.
- final String bootstrapServers() : A list of bootstrap server URLs, for example, as b-1.vpc-test-2.o4q88o.c6.kafka.us-east-1.amazonaws.com:9094.
- static KafkaStreamingSourceOptions.Builder builder()
- final String classification() : An optional classification.
- final String connectionName() : The name of the connection.
- final String delimiter() : Specifies the delimiter character.
- final String emitConsumerLagMetrics() : When this option is set to 'true', for each batch it emits to CloudWatch a metric for the duration between when the topic received the oldest record and when the record arrives in Glue.
- final String endingOffsets() : The end point when a batch query is ended.
- final boolean equals(Object obj)
- final boolean equalsBySdkFields(Object obj) : Indicates whether some other object is "equal to" this one by SDK fields.
- final <T> Optional<T> getValueForField(String fieldName, Class<T> clazz)
- final int hashCode()
- final Boolean includeHeaders() : Whether to include the Kafka headers.
- final Long maxOffsetsPerTrigger() : The rate limit on the maximum number of offsets that are processed per trigger interval.
- final Integer minPartitions() : The desired minimum number of partitions to read from Kafka.
- final Integer numRetries() : The number of times to retry before failing to fetch Kafka offsets.
- final Long pollTimeoutMs() : The timeout in milliseconds to poll data from Kafka in Spark job executors.
- final Long retryIntervalMs() : The time in milliseconds to wait before retrying to fetch Kafka offsets.
- final String securityProtocol() : The protocol used to communicate with brokers.
- static Class<? extends KafkaStreamingSourceOptions.Builder> serializableBuilderClass()
- final String startingOffsets() : The starting position in the Kafka topic to read data from.
- final Instant startingTimestamp() : The timestamp of the record in the Kafka topic to start reading data from.
- final String subscribePattern() : A Java regex string that identifies the topic list to subscribe to.
- final KafkaStreamingSourceOptions.Builder toBuilder() : Take this object and create a builder that contains all of the current property values of this object.
- final String topicName() : The topic name as specified in Apache Kafka.
- final String toString() : Returns a string representation of this object.

Methods inherited from interface software.amazon.awssdk.utils.builder.ToCopyableBuilder:
copy
-
Method Details
-
bootstrapServers
A list of bootstrap server URLs, for example, as b-1.vpc-test-2.o4q88o.c6.kafka.us-east-1.amazonaws.com:9094. This option must be specified in the API call or defined in the table metadata in the Data Catalog.
- Returns:
- A list of bootstrap server URLs, for example, as b-1.vpc-test-2.o4q88o.c6.kafka.us-east-1.amazonaws.com:9094. This option must be specified in the API call or defined in the table metadata in the Data Catalog.
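The options on this class are set through the standard SDK builder. A minimal sketch (the broker URL, connection name, and topic below are illustrative placeholders, not real endpoints; this assumes the AWS SDK for Java v2 Glue module is on the classpath):

```java
import software.amazon.awssdk.services.glue.model.KafkaStreamingSourceOptions;

public class BuildOptionsExample {
    public static void main(String[] args) {
        // All values are placeholders for illustration only.
        KafkaStreamingSourceOptions options = KafkaStreamingSourceOptions.builder()
                .bootstrapServers("b-1.example.c6.kafka.us-east-1.amazonaws.com:9094")
                .securityProtocol("SSL")
                .connectionName("my-kafka-connection")
                .topicName("events")
                .build();
        System.out.println(options.bootstrapServers());
    }
}
```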
-
securityProtocol
The protocol used to communicate with brokers. The possible values are "SSL" or "PLAINTEXT".
- Returns:
- The protocol used to communicate with brokers. The possible values are "SSL" or "PLAINTEXT".
-
connectionName
The name of the connection.
- Returns:
- The name of the connection.
-
topicName
The topic name as specified in Apache Kafka. You must specify at least one of "topicName", "assign" or "subscribePattern".
- Returns:
- The topic name as specified in Apache Kafka. You must specify at least one of "topicName", "assign" or "subscribePattern".
-
assign
The specific TopicPartitions to consume. You must specify at least one of "topicName", "assign" or "subscribePattern".
- Returns:
- The specific TopicPartitions to consume. You must specify at least one of "topicName", "assign" or "subscribePattern".
-
subscribePattern
A Java regex string that identifies the topic list to subscribe to. You must specify at least one of "topicName", "assign" or "subscribePattern".
- Returns:
- A Java regex string that identifies the topic list to subscribe to. You must specify at least one of "topicName", "assign" or "subscribePattern".
-
classification
An optional classification.
- Returns:
- An optional classification.
-
delimiter
Specifies the delimiter character.
- Returns:
- Specifies the delimiter character.
-
startingOffsets
The starting position in the Kafka topic to read data from. The possible values are "earliest" or "latest". The default value is "latest".
- Returns:
- The starting position in the Kafka topic to read data from. The possible values are "earliest" or "latest". The default value is "latest".
-
endingOffsets
The end point when a batch query is ended. Possible values are either "latest" or a JSON string that specifies an ending offset for each TopicPartition.
- Returns:
- The end point when a batch query is ended. Possible values are either "latest" or a JSON string that specifies an ending offset for each TopicPartition.
-
pollTimeoutMs
The timeout in milliseconds to poll data from Kafka in Spark job executors. The default value is 512.
- Returns:
- The timeout in milliseconds to poll data from Kafka in Spark job executors. The default value is 512.
-
numRetries
The number of times to retry before failing to fetch Kafka offsets. The default value is 3.
- Returns:
- The number of times to retry before failing to fetch Kafka offsets. The default value is 3.
-
retryIntervalMs
The time in milliseconds to wait before retrying to fetch Kafka offsets. The default value is 10.
- Returns:
- The time in milliseconds to wait before retrying to fetch Kafka offsets. The default value is 10.
-
maxOffsetsPerTrigger
The rate limit on the maximum number of offsets that are processed per trigger interval. The specified total number of offsets is proportionally split across topicPartitions of different volumes. The default value is null, which means that the consumer reads all offsets until the known latest offset.
- Returns:
- The rate limit on the maximum number of offsets that are processed per trigger interval. The specified total number of offsets is proportionally split across topicPartitions of different volumes. The default value is null, which means that the consumer reads all offsets until the known latest offset.
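The proportional split described above can be pictured with plain arithmetic. This is a deliberately simplified model of the behavior, not Glue's actual implementation, and the partition names and backlog sizes are made up:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ProportionalSplit {
    // Splits a total offset budget across partitions in proportion to each
    // partition's backlog (simplified model; integer division truncates).
    static Map<String, Long> split(Map<String, Long> backlog, long maxOffsets) {
        long total = backlog.values().stream().mapToLong(Long::longValue).sum();
        Map<String, Long> allocation = new LinkedHashMap<>();
        for (Map.Entry<String, Long> e : backlog.entrySet()) {
            allocation.put(e.getKey(), maxOffsets * e.getValue() / total);
        }
        return allocation;
    }

    public static void main(String[] args) {
        Map<String, Long> backlog = new LinkedHashMap<>();
        backlog.put("events-0", 9000L); // high-volume partition
        backlog.put("events-1", 1000L); // low-volume partition
        System.out.println(split(backlog, 100L)); // {events-0=90, events-1=10}
    }
}
```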
-
minPartitions
The desired minimum number of partitions to read from Kafka. The default value is null, which means that the number of spark partitions is equal to the number of Kafka partitions.
- Returns:
- The desired minimum number of partitions to read from Kafka. The default value is null, which means that the number of spark partitions is equal to the number of Kafka partitions.
-
includeHeaders
Whether to include the Kafka headers. When the option is set to "true", the data output will contain an additional column named "glue_streaming_kafka_headers" with type Array[Struct(key: String, value: String)]. The default value is "false". This option is available in Glue version 3.0 or later only.
- Returns:
- Whether to include the Kafka headers. When the option is set to "true", the data output will contain an additional column named "glue_streaming_kafka_headers" with type Array[Struct(key: String, value: String)]. The default value is "false". This option is available in Glue version 3.0 or later only.
-
addRecordTimestamp
When this option is set to 'true', the data output will contain an additional column named "__src_timestamp" that indicates the time when the corresponding record was received by the topic. The default value is 'false'. This option is supported in Glue version 4.0 or later.
- Returns:
- When this option is set to 'true', the data output will contain an additional column named "__src_timestamp" that indicates the time when the corresponding record was received by the topic. The default value is 'false'. This option is supported in Glue version 4.0 or later.
-
emitConsumerLagMetrics
When this option is set to 'true', for each batch it emits to CloudWatch a metric for the duration between when the topic received the oldest record and when that record arrives in Glue. The metric's name is "glue.driver.streaming.maxConsumerLagInMs". The default value is 'false'. This option is supported in Glue version 4.0 or later.
- Returns:
- When this option is set to 'true', for each batch it emits to CloudWatch a metric for the duration between when the topic received the oldest record and when that record arrives in Glue. The metric's name is "glue.driver.streaming.maxConsumerLagInMs". The default value is 'false'. This option is supported in Glue version 4.0 or later.
-
startingTimestamp
The timestamp of the record in the Kafka topic to start reading data from. The possible values are a timestamp string in UTC format of the pattern yyyy-mm-ddTHH:MM:SSZ (where Z represents a UTC timezone offset with a +/-. For example: "2023-04-04T08:00:00+08:00"). Only one of StartingTimestamp or StartingOffsets must be set.
- Returns:
- The timestamp of the record in the Kafka topic to start reading data from. The possible values are a timestamp string in UTC format of the pattern yyyy-mm-ddTHH:MM:SSZ (where Z represents a UTC timezone offset with a +/-. For example: "2023-04-04T08:00:00+08:00"). Only one of StartingTimestamp or StartingOffsets must be set.
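A timestamp string in the documented example form, including the +/- zone offset, can be produced with java.time. A minimal sketch reproducing the example value from the docs:

```java
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

public class StartingTimestampExample {
    public static void main(String[] args) {
        // Build the documented example timestamp: 2023-04-04T08:00:00+08:00.
        OffsetDateTime ts = OffsetDateTime.of(2023, 4, 4, 8, 0, 0, 0,
                ZoneOffset.ofHours(8));
        // "xxx" renders the offset as +08:00 rather than +0800.
        String formatted = ts.format(
                DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ssxxx"));
        System.out.println(formatted); // 2023-04-04T08:00:00+08:00
    }
}
```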
-
toBuilder
Description copied from interface: ToCopyableBuilder
Take this object and create a builder that contains all of the current property values of this object.
- Specified by:
toBuilder in interface ToCopyableBuilder<KafkaStreamingSourceOptions.Builder, KafkaStreamingSourceOptions>
- Returns:
- a builder for type T
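A common copy-modify sketch: start from an existing options object and change only one field via toBuilder(). Assumes the AWS SDK for Java v2 Glue module is on the classpath; the topic and offsets values are placeholders:

```java
import software.amazon.awssdk.services.glue.model.KafkaStreamingSourceOptions;

public class ToBuilderExample {
    public static void main(String[] args) {
        KafkaStreamingSourceOptions base = KafkaStreamingSourceOptions.builder()
                .topicName("events")
                .startingOffsets("latest")
                .build();

        // toBuilder() copies all current property values, so only the
        // fields changed afterward differ in the new instance.
        KafkaStreamingSourceOptions replay = base.toBuilder()
                .startingOffsets("earliest")
                .build();

        System.out.println(replay.topicName());       // events
        System.out.println(replay.startingOffsets()); // earliest
    }
}
```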
-
builder
-
serializableBuilderClass
-
hashCode
public final int hashCode() -
equals
-
equalsBySdkFields
Description copied from interface: SdkPojo
Indicates whether some other object is "equal to" this one by SDK fields. An SDK field is a modeled, non-inherited field in an SdkPojo class, and is generated based on a service model. If an SdkPojo class does not have any inherited fields, equalsBySdkFields and equals are essentially the same.
- Specified by:
equalsBySdkFields in interface SdkPojo
- Parameters:
obj - the object to be compared with
- Returns:
- true if the other object is equal to this object by SDK fields, false otherwise.
-
toString
Returns a string representation of this object. This is useful for testing and debugging. Sensitive data will be redacted from this string using a placeholder value. -
getValueForField
-
sdkFields
-