Amazon Kinesis Data Analytics for Apache Flink 1.15.2 release
Kinesis Data Analytics supports the following new features in Apache 1.15.2
Feature | Description | Apache FLIP reference |
---|---|---|
Async Sink | An Amazon contributed framework for building async destinations that allows developers to build custom Amazon connectors with
less than half the previous effort. For more information, see The Generic Asynchronous Base Sink |
FLIP-171: Async Sink |
Kinesis Data Firehose Sink | Amazon has contributed a new Amazon Kinesis Firehose Sink using the Async framework. | Amazon Kinesis Data Firehose Sink |
Stop with Savepoint | Stop with Savepoint ensures a clean stop operation, most importantly supporting exactly-once semantics for customers that rely on them. | FLIP-34: Terminate/Suspend Job with Savepoint |
Scala Decoupling | Users can now leverage the Java API from any Scala version, including Scala 3. Customers will need to bundle the Scala standard library of their choice in their Scala applications. | FLIP-28: Long-term goal of making flink-table Scala-free |
Unified Connector Metrics | Flink has defined standard metricsnumRestarts in parallel with fullRestarts for Availability Metrics. |
FLIP-33: Standardize Connector Metrics |
Checkpointing finished tasks | This feature is enabled by default in Flink 1.15 and makes it possible to continue performing checkpoints even if parts of the job graph have finished processing all data, which might happen if it contains bounded (batch) sources. | FLIP-147: Support Checkpoints After Tasks Finished |
Changes in Amazon Kinesis Data Analytics with Apache Flink 1.15
Kinesis connectors
Kinesis Data Analytics for Apache Flink version 1.15 will automatically prevent applications from starting or updating if they are using unsupported Kinesis Connector versions (Bundled into application JARs). When upgrading to Kinesis Data Analytics for Apache Flink version 1.15, ensure that you are using the most recent Kinesis Connector.
This is for any version 1.15.2 or newer. All other versions will not be supported by Kinesis Data Analytics for Apache Flink as they may cause consistency issues or failures with the
Stop with Savepoint
feature preventing clean stop/updateoperations.
EFO connector
When upgrading to Kinesis Data Analytics for Apache Flink version 1.15, ensure that you are using the most recent
EFO Connector, that is any version 1.15.3 or newer. For more information as to why, see
FLINK-29324
Scala Decoupling
Starting with Flink 1.15.2, you will need to bundle the Scala standard library of your choice in your Scala applications.
Kinesis Data Firehose Sink
When upgrading to Kinesis Data Analytics for Apache Flink version 1.15, ensure that you are using the most recent Amazon Kinesis Data Firehose Sink
Kafka Connectors
When upgrading to Amazon Kinesis Data Analytics for Apache Flink version 1.15, ensure that you are using the most recent Kafka connector APIs.
Apache Flink has deprecated FlinkKafkaConsumer
Components
Component | Version |
---|---|
Java | 11 (recommended) |
Scala | 2.12 |
Kinesis Data Analytics Flink Runtime (aws-kinesisanalytics-runtime) | 1.2.0 |
Amazon Kinesis Connector (flink-connector-kinesis) |
1.15.2 |
Apache Beam (Beam applications only) |
2.33.0, with Jackson version 2.12.2 |
Known issues
Async Sink Performance
There is a known degradation in the 1.15 AsyncSink performance under high load scenarios compared to the legacy sink, specifically with high number of shards (64 or more). Other influential factors are larger payload sizes and higher parallelism apps.
Kinesis Data Analytics Studio
Studio utilizes Apache Zeppelin notebooks to provide a single-interface development experience for developing, debugging code, and running Apache Flink stream processing applications. An upgrade is required to Zeppelin’s Flink Interpreter to enable support of Flink 1.15. This work is scheduled with the Zeppelin community and we will update these notes when it is complete. You can continue to use Studio with Kinesis Data Analytics for Apache Flink 1.13.