Interactive analysis of streaming data - Amazon Kinesis Data Analytics
Services or capabilities described in Amazon Web Services documentation might vary by Region. To see the differences applicable to the China Regions, see Getting Started with Amazon Web Services in China (PDF).

Interactive analysis of streaming data

You use a serverless notebook powered by Apache Zeppelin to interact with your streaming data. Your notebook can have multiple notes, and each note can have one or more paragraphs where you can write your code.

The following example SQL query shows how to retrieve data from a data source:

%flink.ssql(type=update) select * from stock;

For more examples of Flink Streaming SQL queries, see Examples and tutorials following, and Queries in the Apache Flink documentation.

You can use Flink SQL queries in the Studio notebook to query streaming data. You may also use Python (Table API) and Scala (Table and Datastream APIs) to write programs to query your streaming data interactively. You can view the results of your queries or programs, update them in seconds, and re-run them to view updated results.

Flink interpreters

You specify which language Kinesis Data Analytics uses to run your application by using an interpreter. You can use the following interpreters with Kinesis Data Analytics:

Name Class Description
%flink FlinkInterpreter Creates ExecutionEnvironment/StreamExecutionEnvironment/BatchTableEnvironment/StreamTableEnvironment and provides a Scala environment
%flink.pyflink PyFlinkInterpreter Provides a python environment
%flink.ipyflink IPyFlinkInterpreter Provides an ipython environment
%flink.ssql FlinkStreamSqlInterpreter Provides a stream sql environment
%flink.bsql FlinkBatchSqlInterpreter Provides a batch sql environment

For more information about Flink interpreters, see Flink interpreter for Apache Zeppelin.

If you are using %flink.pyflink or %flink.ipyflink as your interpreters, you will need to use the ZeppelinContext to visualize the results within the notebook.

For more PyFlink specific examples, see Query your data streams interactively using Kinesis Data Analytics Studio and Python.

Apache Flink table environment variables

Apache Zeppelin provides access to table environment resources using environment variables.

You access Scala table environment resources with the following variables:

Variable Resource
senvStreamExecutionEnvironment
benvExecutionEnvironment
stenvStreamTableEnvironment for blink planner
btenvBatchTableEnvironment for blink planner
stenv_2StreamTableEnvironment for flink planner
btenv_2BatchTableEnvironment for flink planner

You access Python table environment resources with the following variables:

Variable Resource
s_envStreamExecutionEnvironment
b_envExecutionEnvironment
st_envStreamTableEnvironment for blink planner
bt_envBatchTableEnvironment for blink planner
st_env_2StreamTableEnvironment for flink planner
bt_env_2BatchTableEnvironment for flink planner

For more information about using table environments, see Create a TableEnvironment in the Apache Flink documentation.