AWS::DataPipeline::Pipeline - AWS CloudFormation

AWS::DataPipeline::Pipeline

The AWS::DataPipeline::Pipeline resource specifies a data pipeline that you can use to automate the movement and transformation of data. In each pipeline, you define pipeline objects, such as activities, schedules, data nodes, and resources. For information about the pipeline objects and components that you can use, see Pipeline Object Reference in the AWS Data Pipeline Developer Guide.

The AWS::DataPipeline::Pipeline resource adds tasks, schedules, and preconditions to the specified pipeline. You can use PutPipelineDefinition to populate a new pipeline.

PutPipelineDefinition also validates the configuration as it adds it to the pipeline. Changes to the pipeline are saved unless one of the following validation errors exists in the pipeline:

  • An object is missing a name or identifier field.

  • A string or reference field is empty.

  • The number of objects in the pipeline exceeds the maximum allowed number of objects.

  • The pipeline is in a FINISHED state.

Pipeline object definitions are passed to the PutPipelineDefinition action and returned by the GetPipelineDefinition action.
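Within PipelineObjects, each object is identified by its Id, and one object points to another through a RefValue field rather than by nesting. The following sketch uses hypothetical object names (with a ShellCommandActivity standing in for any activity type) to show a Schedule object and an activity that references it:

# Sketch only: hypothetical names, not part of the example later on this page.
PipelineObjects:
  - Id: "DailySchedule"
    Name: "DailySchedule"
    Fields:
      - Key: "type"
        StringValue: "Schedule"            # literal values go in StringValue
      - Key: "period"
        StringValue: "1 Day"
      - Key: "startAt"
        StringValue: "FIRST_ACTIVATION_DATE_TIME"
  - Id: "MyActivity"
    Name: "MyActivity"
    Fields:
      - Key: "type"
        StringValue: "ShellCommandActivity"
      - Key: "command"
        StringValue: "echo hello"
      - Key: "workerGroup"
        StringValue: "myWorkerGroup"       # hypothetical worker group
      - Key: "schedule"
        RefValue: "DailySchedule"          # references the object above by Id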

Syntax

To declare this entity in your AWS CloudFormation template, use the following syntax:

JSON

{ "Type" : "AWS::DataPipeline::Pipeline", "Properties" : { "Activate" : Boolean, "Description" : String, "Name" : String, "ParameterObjects" : [ ParameterObject, ... ], "ParameterValues" : [ ParameterValue, ... ], "PipelineObjects" : [ PipelineObject, ... ], "PipelineTags" : [ PipelineTag, ... ] } }

YAML

Type: AWS::DataPipeline::Pipeline
Properties:
  Activate: Boolean
  Description: String
  Name: String
  ParameterObjects:
    - ParameterObject
  ParameterValues:
    - ParameterValue
  PipelineObjects:
    - PipelineObject
  PipelineTags:
    - PipelineTag

Properties

Activate

Indicates whether to validate and start the pipeline or stop an active pipeline. By default, the value is set to true.

Required: No

Type: Boolean

Update requires: No interruption
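Because setting Activate to false also stops an active pipeline, one way to stage a deployment is to drive the property from a stack parameter. A minimal sketch, where the ShouldActivate parameter and MyPipeline logical ID are hypothetical:

Parameters:
  ShouldActivate:
    Type: String
    AllowedValues: ["true", "false"]
    Default: "false"                 # create the pipeline without starting it

Resources:
  MyPipeline:
    Type: AWS::DataPipeline::Pipeline
    Properties:
      Name: "my-pipeline"
      Activate: !Ref ShouldActivate  # toggled per stack deployment
      ParameterObjects: []           # required property; left empty in this sketch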

Description

A description of the pipeline.

Required: No

Type: String

Minimum: 0

Maximum: 1024

Pattern: [\u0020-\uD7FF\uE000-\uFFFD\uD800\uDC00-\uDBFF\uDFFF\r\n\t]*

Update requires: Replacement

Name

The name of the pipeline.

Required: Yes

Type: String

Minimum: 1

Maximum: 1024

Pattern: [\u0020-\uD7FF\uE000-\uFFFD\uD800\uDC00-\uDBFF\uDFFF\n\t]*

Update requires: Replacement

ParameterObjects

The parameter objects used with the pipeline.

Required: Yes

Type: List of ParameterObject

Update requires: No interruption

ParameterValues

The parameter values used with the pipeline.

Required: No

Type: List of ParameterValue

Update requires: No interruption
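Together, the two properties separate declaration from binding: a parameter declared in ParameterObjects can be referenced from any pipeline object field with the #{...} expression syntax, and ParameterValues supplies the concrete value. A minimal sketch, where the myS3OutputPath variable and OutputNode object are hypothetical:

# Sketch only: hypothetical parameter and object names.
ParameterObjects:
  - Id: "myS3OutputPath"                 # declares the variable
    Attributes:
      - Key: "description"
        StringValue: "S3 path for output"
      - Key: "type"
        StringValue: "AWS::S3::ObjectKey"
ParameterValues:
  - Id: "myS3OutputPath"                 # binds a concrete value to it
    StringValue: "s3://my-bucket/output"
PipelineObjects:
  - Id: "OutputNode"
    Name: "OutputNode"
    Fields:
      - Key: "type"
        StringValue: "S3DataNode"
      - Key: "directoryPath"
        StringValue: "#{myS3OutputPath}" # expands to the bound value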

PipelineObjects

The objects that define the pipeline. These objects overwrite the existing pipeline definition. Not all objects, fields, and values can be updated. For information about restrictions, see Editing Your Pipeline in the AWS Data Pipeline Developer Guide.

Required: No

Type: List of PipelineObject

Update requires: No interruption

PipelineTags

A list of arbitrary tags (key-value pairs) to associate with the pipeline, which you can use to control permissions. For more information, see Controlling Access to Pipelines and Resources in the AWS Data Pipeline Developer Guide.

Required: No

Type: List of PipelineTag

Update requires: No interruption
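Each PipelineTag is a plain Key/Value pair. For example (illustrative values):

PipelineTags:
  - Key: "environment"
    Value: "production"
  - Key: "team"
    Value: "data-eng"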

Return values

Ref

When you pass the logical ID of this resource to the intrinsic Ref function, Ref returns the pipeline ID.

For more information about using the Ref function, see Ref.
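For example, a stack output can expose the ID of the example pipeline declared below:

Outputs:
  PipelineId:
    Description: "ID of the data pipeline"
    Value: !Ref DynamoDBInputS3OutputHive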

Examples

The following data pipeline backs up data in an Amazon DynamoDB table to an Amazon S3 bucket. The pipeline uses the HiveCopyActivity activity to copy the data, and runs once a day. The role for the pipeline and the pipeline resource are declared elsewhere in the same template.

JSON

"DynamoDBInputS3OutputHive": { "Type": "AWS::DataPipeline::Pipeline", "Properties": { "Name": "DynamoDBInputS3OutputHive", "Description": "Pipeline to backup DynamoDB data to S3", "Activate": "true", "ParameterObjects": [ { "Id": "myDDBReadThroughputRatio", "Attributes": [ { "Key": "description", "StringValue": "DynamoDB read throughput ratio" }, { "Key": "type", "StringValue": "Double" }, { "Key": "default", "StringValue": "0.2" } ] }, { "Id": "myOutputS3Loc", "Attributes": [ { "Key": "description", "StringValue": "S3 output bucket" }, { "Key": "type", "StringValue": "AWS::S3::ObjectKey" }, { "Key": "default", "StringValue": { "Fn::Join" : [ "", [ "s3://", { "Ref": "S3OutputLoc" } ] ] } } ] }, { "Id": "myDDBTableName", "Attributes": [ { "Key": "description", "StringValue": "DynamoDB Table Name " }, { "Key": "type", "StringValue": "String" } ] } ], "ParameterValues": [ { "Id": "myDDBTableName", "StringValue": { "Ref": "TableName" } } ], "PipelineObjects": [ { "Id": "S3BackupLocation", "Name": "Copy data to this S3 location", "Fields": [ { "Key": "type", "StringValue": "S3DataNode" }, { "Key": "dataFormat", "RefValue": "DDBExportFormat" }, { "Key": "directoryPath", "StringValue": "#{myOutputS3Loc}/#{format(@scheduledStartTime, 'YYYY-MM-dd-HH-mm-ss')}" } ] }, { "Id": "DDBSourceTable", "Name": "DDBSourceTable", "Fields": [ { "Key": "tableName", "StringValue": "#{myDDBTableName}" }, { "Key": "type", "StringValue": "DynamoDBDataNode" }, { "Key": "dataFormat", "RefValue": "DDBExportFormat" }, { "Key": "readThroughputPercent", "StringValue": "#{myDDBReadThroughputRatio}" } ] }, { "Id": "DDBExportFormat", "Name": "DDBExportFormat", "Fields": [ { "Key": "type", "StringValue": "DynamoDBExportDataFormat" } ] }, { "Id": "TableBackupActivity", "Name": "TableBackupActivity", "Fields": [ { "Key": "resizeClusterBeforeRunning", "StringValue": "true" }, { "Key": "type", "StringValue": "HiveCopyActivity" }, { "Key": "input", "RefValue": "DDBSourceTable" }, { "Key": "runsOn", "RefValue": "EmrClusterForBackup" }, { "Key": "output", "RefValue": "S3BackupLocation" } ] }, { "Id": "DefaultSchedule", "Name": "RunOnce", "Fields": [ { "Key": "occurrences", "StringValue": "1" }, { "Key": "startAt", "StringValue": "FIRST_ACTIVATION_DATE_TIME" }, { "Key": "type", "StringValue": "Schedule" }, { "Key": "period", "StringValue": "1 Day" } ] }, { "Id": "Default", "Name": "Default", "Fields": [ { "Key": "type", "StringValue": "Default" }, { "Key": "scheduleType", "StringValue": "cron" }, { "Key": "failureAndRerunMode", "StringValue": "CASCADE" }, { "Key": "role", "StringValue": "DataPipelineDefaultRole" }, { "Key": "resourceRole", "StringValue": "DataPipelineDefaultResourceRole" }, { "Key": "schedule", "RefValue": "DefaultSchedule" } ] }, { "Id": "EmrClusterForBackup", "Name": "EmrClusterForBackup", "Fields": [ { "Key": "terminateAfter", "StringValue": "2 Hours" }, { "Key": "amiVersion", "StringValue": "3.3.2" }, { "Key": "masterInstanceType", "StringValue": "m1.medium" }, { "Key": "coreInstanceType", "StringValue": "m1.medium" }, { "Key": "coreInstanceCount", "StringValue": "1" }, { "Key": "type", "StringValue": "EmrCluster" } ] } ] } }

YAML

DynamoDBInputS3OutputHive:
  Type: AWS::DataPipeline::Pipeline
  Properties:
    Name: "DynamoDBInputS3OutputHive"
    Description: "Pipeline to backup DynamoDB data to S3"
    Activate: true
    ParameterObjects:
      - Id: "myDDBReadThroughputRatio"
        Attributes:
          - Key: "description"
            StringValue: "DynamoDB read throughput ratio"
          - Key: "type"
            StringValue: "Double"
          - Key: "default"
            StringValue: "0.2"
      - Id: "myOutputS3Loc"
        Attributes:
          - Key: "description"
            StringValue: "S3 output bucket"
          - Key: "type"
            StringValue: "AWS::S3::ObjectKey"
          - Key: "default"
            StringValue:
              Fn::Join:
                - ""
                - - "s3://"
                  - Ref: "S3OutputLoc"
      - Id: "myDDBTableName"
        Attributes:
          - Key: "description"
            StringValue: "DynamoDB Table Name"
          - Key: "type"
            StringValue: "String"
    ParameterValues:
      - Id: "myDDBTableName"
        StringValue:
          Ref: "TableName"
    PipelineObjects:
      - Id: "S3BackupLocation"
        Name: "Copy data to this S3 location"
        Fields:
          - Key: "type"
            StringValue: "S3DataNode"
          - Key: "dataFormat"
            RefValue: "DDBExportFormat"
          - Key: "directoryPath"
            StringValue: "#{myOutputS3Loc}/#{format(@scheduledStartTime, 'YYYY-MM-dd-HH-mm-ss')}"
      - Id: "DDBSourceTable"
        Name: "DDBSourceTable"
        Fields:
          - Key: "tableName"
            StringValue: "#{myDDBTableName}"
          - Key: "type"
            StringValue: "DynamoDBDataNode"
          - Key: "dataFormat"
            RefValue: "DDBExportFormat"
          - Key: "readThroughputPercent"
            StringValue: "#{myDDBReadThroughputRatio}"
      - Id: "DDBExportFormat"
        Name: "DDBExportFormat"
        Fields:
          - Key: "type"
            StringValue: "DynamoDBExportDataFormat"
      - Id: "TableBackupActivity"
        Name: "TableBackupActivity"
        Fields:
          - Key: "resizeClusterBeforeRunning"
            StringValue: "true"
          - Key: "type"
            StringValue: "HiveCopyActivity"
          - Key: "input"
            RefValue: "DDBSourceTable"
          - Key: "runsOn"
            RefValue: "EmrClusterForBackup"
          - Key: "output"
            RefValue: "S3BackupLocation"
      - Id: "DefaultSchedule"
        Name: "RunOnce"
        Fields:
          - Key: "occurrences"
            StringValue: "1"
          - Key: "startAt"
            StringValue: "FIRST_ACTIVATION_DATE_TIME"
          - Key: "type"
            StringValue: "Schedule"
          - Key: "period"
            StringValue: "1 Day"
      - Id: "Default"
        Name: "Default"
        Fields:
          - Key: "type"
            StringValue: "Default"
          - Key: "scheduleType"
            StringValue: "cron"
          - Key: "failureAndRerunMode"
            StringValue: "CASCADE"
          - Key: "role"
            StringValue: "DataPipelineDefaultRole"
          - Key: "resourceRole"
            StringValue: "DataPipelineDefaultResourceRole"
          - Key: "schedule"
            RefValue: "DefaultSchedule"
      - Id: "EmrClusterForBackup"
        Name: "EmrClusterForBackup"
        Fields:
          - Key: "terminateAfter"
            StringValue: "2 Hours"
          - Key: "amiVersion"
            StringValue: "3.3.2"
          - Key: "masterInstanceType"
            StringValue: "m1.medium"
          - Key: "coreInstanceType"
            StringValue: "m1.medium"
          - Key: "coreInstanceCount"
            StringValue: "1"
          - Key: "type"
            StringValue: "EmrCluster"

See also