使用CloudFormation使用 Kinesis Data Analytics - Amazon Kinesis Data Analytics
Amazon Web Services 文档中描述的 Amazon Web Services 服务或功能可能因区域而异。要查看适用于中国区域的差异,请参阅中国的 Amazon Web Services 服务入门

本文属于机器翻译版本。若本译文内容与英语原文存在差异,则一律以英文原文为准。

使用CloudFormation使用 Kinesis Data Analytics

以下练习演示如何启动通过以下方式创建的 Flink 应用程序:Amazon CloudFormation在同一堆栈中使用 Lambda 函数。

开始前的准备工作

在开始本练习之前,请按照以下步骤操作使用创建 Flink 应用程序Amazon CloudFormation在AWS::KinesisAnalytics** 应用程序.

编写 Lambda 函数

要在创建或更新后启动 Flink 应用程序,我们使用 kinesisanalyticsv2启动申请API。该呼叫将由Amazon CloudFormationFlink 应用程序创建之后的事件。我们将在本练习后面讨论如何设置堆栈以触发 Lambda 函数,但首先我们将重点放在 Lambda 函数声明及其代码上。我们使用Python3.8本例中的运行时间。

StartApplicationLambda: Type: AWS::Lambda::Function DependsOn: StartApplicationLambdaRole Properties: Description: Starts an application when invoked. Runtime: python3.8 Role: !GetAtt StartApplicationLambdaRole.Arn Handler: index.lambda_handler Timeout: 30 Code: ZipFile: | import logging import cfnresponse import boto3 logger = logging.getLogger() logger.setLevel(logging.INFO) def lambda_handler(event, context): logger.info('Incoming CFN event {}'.format(event)) try: application_name = event['ResourceProperties']['ApplicationName'] # filter out events other than Create or Update, # you can also omit Update in order to start an application on Create only. if event['RequestType'] not in ["Create", "Update"]: logger.info('No-op for Application {} because CFN RequestType {} is filtered'.format(application_name, event['RequestType'])) cfnresponse.send(event, context, cfnresponse.SUCCESS, {}) return # use kinesisanalyticsv2 API to start an application. client_kda = boto3.client('kinesisanalyticsv2', region_name=event['ResourceProperties']['Region']) # get application status. describe_response = client_kda.describe_application(ApplicationName=application_name) application_status = describe_response['ApplicationDetail']['ApplicationStatus'] # an application can be started from 'READY' status only. if application_status != 'READY': logger.info('No-op for Application {} because ApplicationStatus {} is filtered'.format(application_name, application_status)) cfnresponse.send(event, context, cfnresponse.SUCCESS, {}) return # create RunConfiguration. run_configuration = { 'ApplicationRestoreConfiguration': { 'ApplicationRestoreType': 'RESTORE_FROM_LATEST_SNAPSHOT', } } logger.info('RunConfiguration for Application {}: {}'.format(application_name, run_configuration)) # this call doesn't wait for an application to transfer to 'RUNNING' state. client_kda.start_application(ApplicationName=application_name, RunConfiguration=run_configuration) logger.info('Started Application: {}'.format(application_name)) cfnresponse.send(event, context, cfnresponse.SUCCESS, {}) except Exception as err: logger.error(err) cfnresponse.send(event,context, cfnresponse.FAILED, {"Data": str(err)})

在上述代码中,Lambda 将处理传入Amazon CloudFormation事件,除此之外的所有内容CreateUpdate,获取应用程序状态并在状态为时启动它READY. 要获取应用程序状态,您需要创建 Lambda 角色,如下所示:

创建 Lambda 角色

您可以为 Lambda 创建一个角色,以便成功与应用程序 “交谈” 并编写日志。此角色将使用默认托管策略,但是您可能希望使用自定义策略缩小其范围。

StartApplicationLambdaRole: Type: AWS::IAM::Role DependsOn: TestFlinkApplication Properties: Description: A role for lambda to use while interacting with an application. AssumeRolePolicyDocument: Version: '2012-10-17' Statement: - Effect: Allow Principal: Service: - lambda.amazonaws.com Action: - sts:AssumeRole ManagedPolicyArns: - arn:aws:iam::aws:policy/AmazonKinesisAnalyticsFullAccess - arn:aws:iam::aws:policy/CloudWatchLogsFullAccess Path: /

请注意,Lambda 资源将在同一堆栈中创建 Flink 应用程序之后创建,因为它们依赖于它。

调用 Lambda 函数

现在剩下的就是调用 Lambda 函数。这可以使用自定义资源.

StartApplicationLambdaInvoke: Description: Invokes StartApplicationLambda to start an application. Type: AWS::CloudFormation::CustomResource DependsOn: StartApplicationLambda Version: "1.0" Properties: ServiceToken: !GetAtt StartApplicationLambda.Arn Region: !Ref AWS::Region ApplicationName: !Ref TestFlinkApplication

这就是使用 Lambda 启动 Flink 应用程序所需的全部。现在,您已准备好创建自己的堆栈,或者使用下面的完整示例来了解所有这些步骤在实践中是如何运作的。

以下示例是上述步骤的稍微扩展版本,其中包括一个额外的RunConfiguration通过调整模板参数. 这是一个可供你尝试的工作堆栈。请务必阅读随附的说明:

Stack.yaml

Description: 'KinesisAnalyticsV2 CloudFormation Test Application' Parameters: ApplicationRestoreType: Description: ApplicationRestoreConfiguration option, can be SKIP_RESTORE_FROM_SNAPSHOT, RESTORE_FROM_LATEST_SNAPSHOT or RESTORE_FROM_CUSTOM_SNAPSHOT. Type: String Default: SKIP_RESTORE_FROM_SNAPSHOT AllowedValues: [ SKIP_RESTORE_FROM_SNAPSHOT, RESTORE_FROM_LATEST_SNAPSHOT, RESTORE_FROM_CUSTOM_SNAPSHOT ] SnapshotName: Description: ApplicationRestoreConfiguration option, name of a snapshot to restore to, used with RESTORE_FROM_CUSTOM_SNAPSHOT ApplicationRestoreType. Type: String Default: '' AllowNonRestoredState: Description: FlinkRunConfiguration option, can be true or false. Default: true Type: String AllowedValues: [ true, false ] CodeContentBucketArn: Description: ARN of a bucket with application code. Type: String CodeContentFileKey: Description: A jar filename with an application code inside a bucket. Type: String Conditions: IsSnapshotNameEmpty: !Equals [ !Ref SnapshotName, '' ] Resources: TestServiceExecutionRole: Type: AWS::IAM::Role Properties: AssumeRolePolicyDocument: Version: '2012-10-17' Statement: - Effect: Allow Principal: Service: - kinesisanalytics.amazonaws.com Action: sts:AssumeRole ManagedPolicyArns: - arn:aws:iam::aws:policy/AmazonKinesisFullAccess - arn:aws:iam::aws:policy/AmazonS3FullAccess Path: / InputKinesisStream: Type: AWS::Kinesis::Stream Properties: ShardCount: 1 OutputKinesisStream: Type: AWS::Kinesis::Stream Properties: ShardCount: 1 TestFlinkApplication: Type: 'AWS::KinesisAnalyticsV2::Application' Properties: ApplicationName: 'CFNTestFlinkApplication' ApplicationDescription: 'Test Flink Application' RuntimeEnvironment: 'FLINK-1_13' ServiceExecutionRole: !GetAtt TestServiceExecutionRole.Arn ApplicationConfiguration: EnvironmentProperties: PropertyGroups: - PropertyGroupId: 'KinesisStreams' PropertyMap: INPUT_STREAM_NAME: !Ref InputKinesisStream OUTPUT_STREAM_NAME: !Ref OutputKinesisStream AWS_REGION: !Ref AWS::Region FlinkApplicationConfiguration: CheckpointConfiguration: ConfigurationType: 'CUSTOM' CheckpointingEnabled: True CheckpointInterval: 1500 MinPauseBetweenCheckpoints: 500 MonitoringConfiguration: ConfigurationType: 'CUSTOM' MetricsLevel: 'APPLICATION' LogLevel: 'INFO' ParallelismConfiguration: ConfigurationType: 'CUSTOM' Parallelism: 1 ParallelismPerKPU: 1 AutoScalingEnabled: True ApplicationSnapshotConfiguration: SnapshotsEnabled: True ApplicationCodeConfiguration: CodeContent: S3ContentLocation: BucketARN: !Ref CodeContentBucketArn FileKey: !Ref CodeContentFileKey CodeContentType: 'ZIPFILE' StartApplicationLambdaRole: Type: AWS::IAM::Role DependsOn: TestFlinkApplication Properties: Description: A role for lambda to use while interacting with an application. AssumeRolePolicyDocument: Version: '2012-10-17' Statement: - Effect: Allow Principal: Service: - lambda.amazonaws.com Action: - sts:AssumeRole ManagedPolicyArns: - arn:aws:iam::aws:policy/AmazonKinesisAnalyticsFullAccess - arn:aws:iam::aws:policy/CloudWatchLogsFullAccess Path: / StartApplicationLambda: Type: AWS::Lambda::Function DependsOn: StartApplicationLambdaRole Properties: Description: Starts an application when invoked. Runtime: python3.8 Role: !GetAtt StartApplicationLambdaRole.Arn Handler: index.lambda_handler Timeout: 30 Code: ZipFile: | import logging import cfnresponse import boto3 logger = logging.getLogger() logger.setLevel(logging.INFO) def lambda_handler(event, context): logger.info('Incoming CFN event {}'.format(event)) try: application_name = event['ResourceProperties']['ApplicationName'] # filter out events other than Create or Update, # you can also omit Update in order to start an application on Create only. if event['RequestType'] not in ["Create", "Update"]: logger.info('No-op for Application {} because CFN RequestType {} is filtered'.format(application_name, event['RequestType'])) cfnresponse.send(event, context, cfnresponse.SUCCESS, {}) return # use kinesisanalyticsv2 API to start an application. client_kda = boto3.client('kinesisanalyticsv2', region_name=event['ResourceProperties']['Region']) # get application status. describe_response = client_kda.describe_application(ApplicationName=application_name) application_status = describe_response['ApplicationDetail']['ApplicationStatus'] # an application can be started from 'READY' status only. if application_status != 'READY': logger.info('No-op for Application {} because ApplicationStatus {} is filtered'.format(application_name, application_status)) cfnresponse.send(event, context, cfnresponse.SUCCESS, {}) return # create RunConfiguration from passed parameters. run_configuration = { 'FlinkRunConfiguration': { 'AllowNonRestoredState': event['ResourceProperties']['AllowNonRestoredState'] == 'true' }, 'ApplicationRestoreConfiguration': { 'ApplicationRestoreType': event['ResourceProperties']['ApplicationRestoreType'], } } # add SnapshotName to RunConfiguration if specified. if event['ResourceProperties']['SnapshotName'] != '': run_configuration['ApplicationRestoreConfiguration']['SnapshotName'] = event['ResourceProperties']['SnapshotName'] logger.info('RunConfiguration for Application {}: {}'.format(application_name, run_configuration)) # this call doesn't wait for an application to transfer to 'RUNNING' state. client_kda.start_application(ApplicationName=application_name, RunConfiguration=run_configuration) logger.info('Started Application: {}'.format(application_name)) cfnresponse.send(event, context, cfnresponse.SUCCESS, {}) except Exception as err: logger.error(err) cfnresponse.send(event,context, cfnresponse.FAILED, {"Data": str(err)}) StartApplicationLambdaInvoke: Description: Invokes StartApplicationLambda to start an application. Type: AWS::CloudFormation::CustomResource DependsOn: StartApplicationLambda Version: "1.0" Properties: ServiceToken: !GetAtt StartApplicationLambda.Arn Region: !Ref AWS::Region ApplicationName: !Ref TestFlinkApplication ApplicationRestoreType: !Ref ApplicationRestoreType SnapshotName: !Ref SnapshotName AllowNonRestoredState: !Ref AllowNonRestoredState

同样,您可能希望调整 Lambda 的角色以及应用程序本身的角色。

在创建上面的堆栈之前,别忘记指定你的参数。

参数 .json

[ { "ParameterKey": "CodeContentBucketArn", "ParameterValue": "YOUR_BUCKET_ARN" }, { "ParameterKey": "CodeContentFileKey", "ParameterValue": "YOUR_JAR" }, { "ParameterKey": "ApplicationRestoreType", "ParameterValue": "SKIP_RESTORE_FROM_SNAPSHOT" }, { "ParameterKey": "AllowNonRestoredState", "ParameterValue": "true" } ]

ReplaceYOUR_BUCKET_ARNYOUR_JAR符合你的具体要求。你可以按照此操作指南创建 Amazon S3 存储桶和应用程序 jar。

现在创建堆栈(用你选择的区域替换 YOUR_REGION,例如 us-east-1):

aws cloudformation create-stack --region YOUR_REGION --template-body "file://stack.yaml" --parameters "file://parameters.json" --stack-name "TestKDAStack" --capabilities CAPABILITY_NAMED_IAM

现在,您可转至https://console.aws.amazon.com/cloudformation然后查看进度。创建之后,你应该在中看到你的 Flink 应用程序Starting状态。开始可能需要几分钟的时间Running.

有关更多信息,请参阅下列内容: