Tutorial: Analyze Real-Time Stock Data Using Managed Service for Apache Flink for Flink Applications - Amazon Kinesis Data Streams
Services or capabilities described in Amazon Web Services documentation might vary by Region. To see the differences applicable to the China Regions, see Getting Started with Amazon Web Services in China (PDF).

The scenario for this tutorial involves ingesting stock trades into a data stream and writing a simple Amazon Managed Service for Apache Flink application that performs calculations on the stream. You will learn how to send a stream of records to Kinesis Data Streams and implement an application that consumes and processes the records in near-real time.

With Managed Service for Apache Flink for Flink Applications, you can use Java or Scala to process and analyze streaming data. The service enables you to author and run Java or Scala code against streaming sources to perform time-series analytics, feed real-time dashboards, and create real-time metrics.

You can build Flink applications in Managed Service for Apache Flink using open-source libraries based on Apache Flink. Apache Flink is a popular framework and engine for processing data streams.

Important

After you create two data streams and an application, your account incurs nominal charges for Kinesis Data Streams and Managed Service for Apache Flink usage because they are not eligible for the Amazon Free Tier. When you are finished with this application, delete your Amazon resources to stop incurring charges.

The code does not access actual stock market data, but instead simulates the stream of stock trades. It does so by using a random stock trade generator. If you have access to a real-time stream of stock trades, you might be interested in deriving useful, timely statistics from that stream. For example, you might want to perform a sliding window analysis where you determine the most popular stock purchased in the last 5 minutes. Or you might want a notification whenever there is a sell order that is too large (that is, it has too many shares). You can extend the code in this series to provide such functionality.
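
To make these ideas concrete, the following is a minimal plain-Java sketch, not the tutorial's actual code. It shows a random trade generator, a sliding window that tracks the most frequently purchased stock, and a check for oversized sell orders. The class name StockTradeSketch, the Trade fields, the placeholder tickers, and the share limits are all assumptions made for illustration. The sketch also assumes that trades arrive in timestamp order, and it does not use the Kinesis or Apache Flink APIs.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

public class StockTradeSketch {

    /** Hypothetical trade record; the field names are illustrative only. */
    static final class Trade {
        final String ticker;
        final boolean isBuy;
        final long shares;
        final long timestampMillis;

        Trade(String ticker, boolean isBuy, long shares, long timestampMillis) {
            this.ticker = ticker;
            this.isBuy = isBuy;
            this.shares = shares;
            this.timestampMillis = timestampMillis;
        }
    }

    // Placeholder ticker symbols, not real market data.
    static final String[] TICKERS = {"AAA", "BBB", "CCC"};

    /** Simulates one trade: random ticker, random side, 1 to 10,000 shares. */
    static Trade randomTrade(Random rng, long nowMillis) {
        String ticker = TICKERS[rng.nextInt(TICKERS.length)];
        return new Trade(ticker, rng.nextBoolean(), 1 + rng.nextInt(10_000), nowMillis);
    }

    /** Counts buy orders per ticker over a sliding time window. */
    static final class PopularityWindow {
        private final long windowMillis;
        private final Deque<Trade> buys = new ArrayDeque<>();
        private final Map<String, Integer> counts = new HashMap<>();

        PopularityWindow(long windowMillis) {
            this.windowMillis = windowMillis;
        }

        /** Adds a trade; its timestamp also advances the window. */
        void add(Trade t) {
            long cutoff = t.timestampMillis - windowMillis;
            // Drop buys that have fallen out of the window.
            while (!buys.isEmpty() && buys.peekFirst().timestampMillis <= cutoff) {
                Trade old = buys.removeFirst();
                // Returning null removes the entry once its count reaches zero.
                counts.computeIfPresent(old.ticker, (k, v) -> v == 1 ? null : v - 1);
            }
            if (t.isBuy) {
                buys.addLast(t);
                counts.merge(t.ticker, 1, Integer::sum);
            }
        }

        /** Returns the most purchased ticker, or null if the window is empty. */
        String mostPopular() {
            String best = null;
            int bestCount = 0;
            for (Map.Entry<String, Integer> e : counts.entrySet()) {
                if (e.getValue() > bestCount) {
                    best = e.getKey();
                    bestCount = e.getValue();
                }
            }
            return best; // ties are broken arbitrarily
        }
    }

    /** Flags a sell order whose share count exceeds the given limit. */
    static boolean isLargeSell(Trade t, long maxShares) {
        return !t.isBuy && t.shares > maxShares;
    }

    public static void main(String[] args) {
        Random rng = new Random(42); // fixed seed so that runs are repeatable
        PopularityWindow window = new PopularityWindow(5 * 60 * 1000L);
        long now = 0L;
        for (int i = 0; i < 1_000; i++) {
            now += 1_000L; // simulate one trade per second
            Trade t = randomTrade(rng, now);
            window.add(t);
            if (isLargeSell(t, 9_900)) {
                System.out.println("Large sell: " + t.ticker + " x " + t.shares);
            }
        }
        System.out.println("Most popular in the last 5 minutes: " + window.mostPopular());
    }
}
```

In a Flink application, the windowing logic would typically be expressed with the framework's own window operators rather than a hand-maintained queue. The sketch only illustrates the calculation itself.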

The examples shown use the US West (Oregon) Region, but they work on any of the Amazon Regions that support Managed Service for Apache Flink.

Prerequisites for Completing the Exercises

To complete the steps in this guide, you must have the following:

  • Java Development Kit (JDK) version 8. Set the JAVA_HOME environment variable to point to your JDK install location.

  • A development environment (recommended), such as Eclipse Java Neon or IntelliJ IDEA, to develop and compile your application.

  • Git client. Install it if you haven't already.

  • Apache Maven Compiler Plugin. Maven must be on your system path (PATH). To test your Apache Maven installation, enter the following:

    $ mvn -version

To get started, go to Step 1: Set Up an Amazon Account and Create an Administrator User.