本文属于机器翻译版本。若本译文内容与英语原文存在差异,则一律以英文原文为准。
使用 CreateDataset API 通过 Java 和Amazon CLI
创建数据集。数据集通过应用以下方法存储从数据存储中检索到的数据queryAction
(一个 SQL 查询)或containerAction
(执行容器化应用程序)。此操作创建数据集的框架。可以通过调用手动填充数据集CreateDatasetContent
或者根据 a 自动生成trigger
您指定。有关更多信息,请参阅 。CreateDataset和CreateDatasetContent
.
主题
示例 1 — 创建 SQL 数据集 (java)
CreateDatasetRequest request = new CreateDatasetRequest(); request.setDatasetName(dataSetName); DatasetAction action = new DatasetAction(); //Create Action action.setActionName("SQLAction1"); action.setQueryAction(new SqlQueryDatasetAction().withSqlQuery("select * from DataStoreName")); // Add Action to Actions List List<DatasetAction> actions = new ArrayList<DatasetAction>(); actions.add(action); //Create Trigger DatasetTrigger trigger = new DatasetTrigger(); trigger.setSchedule(new Schedule().withExpression("cron(0 12 * * ? *)")); //Add Trigger to Triggers List List<DatasetTrigger> triggers = new ArrayList<DatasetTrigger>(); triggers.add(trigger); // Add Triggers and Actions to CreateDatasetRequest object request.setActions(actions); request.setTriggers(triggers); // Add RetentionPeriod to CreateDatasetRequest object request.setRetentionPeriod(new RetentionPeriod().withNumberOfDays(10)); final CreateDatasetResult result = iot.createDataset(request);
成功时的输出:
{DatasetName: <datatsetName>, DatasetArn: <datatsetARN>, RetentionPeriod: {unlimited: true} or {numberOfDays: 10, unlimited: false}}
示例 2 — 使用增量窗口创建 SQL 数据集 (java)
CreateDatasetRequest request = new CreateDatasetRequest(); request.setDatasetName(dataSetName); DatasetAction action = new DatasetAction(); //Create Filter for DeltaTime QueryFilter deltaTimeFilter = new QueryFilter(); deltaTimeFilter.withDeltaTime( new DeltaTime() .withOffsetSeconds(-1 * EstimatedDataDelayInSeconds) .withTimeExpression("from_unixtime(timestamp)")); //Create Action action.setActionName("SQLActionWithDeltaTime"); action.setQueryAction(new SqlQueryDatasetAction() .withSqlQuery("SELECT * from DataStoreName") .withFilters(deltaTimeFilter)); // Add Action to Actions List List<DatasetAction> actions = new ArrayList<DatasetAction>(); actions.add(action); //Create Trigger DatasetTrigger trigger = new DatasetTrigger(); trigger.setSchedule(new Schedule().withExpression("cron(0 12 * * ? *)")); //Add Trigger to Triggers List List<DatasetTrigger> triggers = new ArrayList<DatasetTrigger>(); triggers.add(trigger); // Add Triggers and Actions to CreateDatasetRequest object request.setActions(actions); request.setTriggers(triggers); // Add RetentionPeriod to CreateDatasetRequest object request.setRetentionPeriod(new RetentionPeriod().withNumberOfDays(10)); final CreateDatasetResult result = iot.createDataset(request);
成功时的输出:
{DatasetName: <datatsetName>, DatasetArn: <datatsetARN>, RetentionPeriod: {unlimited: true} or {numberOfDays: 10, unlimited: false}}
示例 3 — 使用自己的调度触发器 (java) 创建容器数据集
CreateDatasetRequest request = new CreateDatasetRequest(); request.setDatasetName(dataSetName); DatasetAction action = new DatasetAction(); //Create Action action.setActionName("ContainerActionDataset"); action.setContainerAction(new ContainerDatasetAction() .withImage(ImageURI) .withExecutionRoleArn(ExecutionRoleArn) .withResourceConfiguration( new ResourceConfiguration() .withComputeType(new ComputeType().withAcu(1)) .withVolumeSizeInGB(1)) .withVariables(new Variable() .withName("VariableName") .withStringValue("VariableValue")); // Add Action to Actions List List<DatasetAction> actions = new ArrayList<DatasetAction>(); actions.add(action); //Create Trigger DatasetTrigger trigger = new DatasetTrigger(); trigger.setSchedule(new Schedule().withExpression("cron(0 12 * * ? *)")); //Add Trigger to Triggers List List<DatasetTrigger> triggers = new ArrayList<DatasetTrigger>(); triggers.add(trigger); // Add Triggers and Actions to CreateDatasetRequest object request.setActions(actions); request.setTriggers(triggers); // Add RetentionPeriod to CreateDatasetRequest object request.setRetentionPeriod(new RetentionPeriod().withNumberOfDays(10)); final CreateDatasetResult result = iot.createDataset(request);
成功时的输出:
{DatasetName: <datatsetName>, DatasetArn: <datatsetARN>, RetentionPeriod: {unlimited: true} or {numberOfDays: 10, unlimited: false}}
示例 4 — 使用 SQL 数据集作为触发器创建容器数据集 (java)
CreateDatasetRequest request = new CreateDatasetRequest(); request.setDatasetName(dataSetName); DatasetAction action = new DatasetAction(); //Create Action action.setActionName("ContainerActionDataset"); action.setContainerAction(new ContainerDatasetAction() .withImage(ImageURI) .withExecutionRoleArn(ExecutionRoleArn) .withResourceConfiguration( new ResourceConfiguration() .withComputeType(new ComputeType().withAcu(1)) .withVolumeSizeInGB(1)) .withVariables(new Variable() .withName("VariableName") .withStringValue("VariableValue")); // Add Action to Actions List List<DatasetAction> actions = new ArrayList<DatasetAction>(); actions.add(action); //Create Trigger DatasetTrigger trigger = new DatasetTrigger() .withDataset(new TriggeringDataset() .withName(TriggeringSQLDataSetName)); //Add Trigger to Triggers List List<DatasetTrigger> triggers = new ArrayList<DatasetTrigger>(); triggers.add(trigger); // Add Triggers and Actions to CreateDatasetRequest object request.setActions(actions); request.setTriggers(triggers); final CreateDatasetResult result = iot.createDataset(request);
成功时的输出:
{DatasetName: <datatsetName>, DatasetArn: <datatsetARN>}
示例 5 — 创建 SQL 数据集 (CLI)
aws iotanalytics --endpoint <EndPoint> --region <Region> create-dataset --dataset-name="<dataSetName>" --actions="[{\"actionName\":\"<ActionName>\", \"queryAction\":{\"sqlQuery\":\"<SQLQuery>\"}}]" --retentionPeriod numberOfDays=10
成功时的输出:
{ "datasetName": "<datasetName>", "datasetArn": "<datatsetARN>", "retentionPeriod": {unlimited: true} or {numberOfDays: 10, unlimited: false} }
示例 6 — 使用增量窗口 (CLI) 创建 SQL 数据集
增量窗口是一系列用户定义的、非重叠的连续时间间隔。增量窗口使您可以使用上次分析后到达数据存储的新数据创建数据集内容,并对这些数据执行分析。您可以通过设置来创建增量窗口deltaTime
在里面filters
a的一部分queryAction
数据集 (CreateDataset)。通常,你需要通过设置时间间隔触发器来自动创建数据集内容 (triggers:schedule:expression
)。基本上,这使您能够筛选在特定时间段内到达的消息,因此来自先前时间窗口的消息中包含的数据不会被计算两次。
在此示例中,我们创建了一个新的数据集,该数据集仅使用自上次以来到达的数据,每 15 分钟自动创建新的数据集内容。我们指定 3 分钟(180 秒)的 deltaTime
偏差,以允许到达指定数据存储的消息延迟 3 分钟。因此,如果数据集内容是在上午 10:30 创建的,则使用的数据(包含在数据集内容中)将是时间戳在上午 10:12 到上午 10:27(即上午 10:30-15 分钟-3 分钟到上午 10:30-3 分钟)之间的数据。
aws iotanalytics --endpoint <EndPoint> --region <Region> create-dataset --cli-input-json file://delta-window.json
文件在哪里delta-window.json
包含以下内容。
{ "datasetName": "delta_window_example", "actions": [ { "actionName": "delta_window_action", "queryAction": { "sqlQuery": "SELECT temperature, humidity, timestamp FROM my_datastore", "filters": [ { "deltaTime": { "offsetSeconds": -180, "timeExpression": "from_unixtime(timestamp)" } } ] } } ], "triggers": [ { "schedule": { "expression": "cron(0/15 * * * ? *)" } } ] }
成功时的输出:
{ "datasetName": "<datasetName>", "datasetArn": "<datatsetARN>", }