Creating a dataset - Amazon IoT Analytics
Services or capabilities described in Amazon Web Services documentation might vary by Region. To see the differences applicable to the China Regions, see Getting Started with Amazon Web Services in China (PDF).

Creating a dataset

You retrieve data from a data store by creating a SQL dataset or a container dataset. Amazon IoT Analytics can query the data to answer analytical questions. Although a data store is not a database, you use SQL expressions to query the data and produce results that are stored in a dataset.

Querying data

To query the data, you create a dataset. A dataset contains the SQL that you use to query the data store along with an optional schedule that repeats the query at a day and time you choose. You create the optional schedules using expressions similar to Amazon CloudWatch schedule expressions.

Run the following command to create a dataset.

aws iotanalytics create-dataset --cli-input-json file://mydataset.json

Where the mydataset.json file contains the following content.

{ "datasetName": "mydataset", "actions": [ { "actionName":"myaction", "queryAction": { "sqlQuery": "select * from mydatastore" } } ] }

Run the following command to create the dataset content by executing the query.

aws iotanalytics create-dataset-content --dataset-name mydataset

Wait a few minutes for the dataset content to be created before you continue.

Accessing the queried data

The result of the query is your dataset content, stored as a file, in CSV format. The file is made available to you through Amazon S3. The following example shows how you can check that your results are ready and download the file.

Run the following get-dataset-content command.

aws iotanalytics get-dataset-content --dataset-name mydataset

If your dataset contains any data, then the output from get-dataset-content, has "state": "SUCCEEDED" in the status field, like this the following example.

{ "timestamp": 1508189965.746, "entries": [ { "entryName": "someEntry", "dataURI": "https://aws-iot-analytics-datasets-f7253800-859a-472c-aa33-e23998b31261.s3.amazonaws.com/results/f881f855-c873-49ce-abd9-b50e9611b71f.csv?X-Amz-" } ], "status": { "state": "SUCCEEDED", "reason": "A useful comment." } }

dataURI is a signed URL to the output results. It is valid for a short period of time (a few hours). Depending on your workflow, you might want to always call get-dataset-content before you access the content because calling this command generates a new signed URL.