Using a custom container for analysis

This section includes information about how to build a Docker container using a Jupyter notebook. There is a security risk if you re-use notebooks built by third parties: included containers can execute arbitrary code with your user permissions. In addition, the HTML generated by the notebook can be displayed in the Amazon IoT Analytics console, providing a potential attack vector on the computer displaying the HTML. Make sure you trust the author of any third-party notebook before using it.

You can create your own custom container and run it with the Amazon IoT Analytics service. To do so, you set up a Docker image, upload it to Amazon ECR, and then set up a dataset to run a container action. This section gives an example of the process using Octave.

This tutorial assumes that you have:

  • Octave installed on your local computer

  • A Docker account set up on your local computer

  • An Amazon account with Amazon ECR and Amazon IoT Analytics access

Step 1: Set up a Docker image

There are three main files you need for this tutorial. Their names and contents are here:

  • Dockerfile – The initial setup for Docker's containerization process.

    FROM ubuntu:16.04

    # Get required set of software
    RUN apt-get update
    RUN apt-get install -y software-properties-common
    RUN apt-get install -y octave
    RUN apt-get install -y python3-pip

    # Get boto3 for S3 and other libraries
    RUN pip3 install --upgrade pip
    RUN pip3 install boto3
    RUN pip3 install urllib3

    # Move scripts over
    ADD moment moment
    ADD run-octave.py run-octave.py

    # Start python script
    ENTRYPOINT ["python3", "run-octave.py"]
  • run-octave.py – Parses JSON from Amazon IoT Analytics, runs the Octave script, and uploads artifacts to Amazon S3. (A sketch of the params file that this script reads follows this list.)

    import boto3
    import json
    import os
    import sys
    from urllib.parse import urlparse

    # Parse the JSON from IoT Analytics
    with open('/opt/ml/input/data/iotanalytics/params') as params_file:
        params = json.load(params_file)

    variables = params['Variables']
    order = variables['order']
    input_s3_bucket = variables['inputDataS3BucketName']
    input_s3_key = variables['inputDataS3Key']
    output_s3_uri = variables['octaveResultS3URI']

    local_input_filename = "input.txt"
    local_output_filename = "output.mat"

    # Pull input data from S3...
    s3 = boto3.resource('s3')
    s3.Bucket(input_s3_bucket).download_file(input_s3_key, local_input_filename)

    # Run Octave Script
    os.system("octave moment {} {} {}".format(local_input_filename, local_output_filename, order))

    # Upload the artifacts to S3
    output_s3_url = urlparse(output_s3_uri)
    output_s3_bucket = output_s3_url.netloc
    output_s3_key = output_s3_url.path[1:]
    s3.Object(output_s3_bucket, output_s3_key).put(Body=open(local_output_filename, 'rb'), ACL='bucket-owner-full-control')
  • moment – A simple Octave script that calculates the moment of the data in an input file for a specified order and saves the result to an output file.

    #!/usr/bin/octave -qf
    arg_list = argv ();

    input_filename = arg_list{1};
    output_filename = arg_list{2};
    order = str2num(arg_list{3});

    [D,delimiterOut]=importdata(input_filename)
    M = moment(D, order)
    save(output_filename,'M')
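
The run-octave.py script above reads its variables from a params file that Amazon IoT Analytics mounts into the container at /opt/ml/input/data/iotanalytics/params. If you want to exercise the script outside the service, the following is a minimal sketch that writes a params file containing the same four variables; the bucket, key, and output URI values are placeholders, and any keys beyond Variables are omitted as an assumption.

    import json
    import os

    # Hypothetical local test harness: recreate the params file that Amazon IoT
    # Analytics mounts at /opt/ml/input/data/iotanalytics/params. Creating /opt/ml
    # locally may require elevated permissions; adjust the path if needed.
    params_dir = "/opt/ml/input/data/iotanalytics"
    os.makedirs(params_dir, exist_ok=True)

    # Placeholder values; substitute your own bucket, key, and output location.
    params = {
        "Variables": {
            "order": "3",
            "inputDataS3BucketName": "octave-sample-data-your-aws-account-id",
            "inputDataS3Key": "input.txt",
            "octaveResultS3URI": "s3://your-output-bucket/output.mat"
        }
    }

    with open(os.path.join(params_dir, "params"), "w") as params_file:
        json.dump(params, params_file)
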
  1. Download the contents of each file. Create a new directory, place all of the files in it, and then cd to that directory.

  2. Run the following command.

    docker build -t octave-moment .
  3. You should see a new image in your Docker repository. Verify it by running the following command.

    docker image ls | grep octave-moment

Step 2: Upload the Docker image to an Amazon ECR repository

  1. Create a repository in Amazon ECR.

    aws ecr create-repository --repository-name octave-moment
  2. Get the login to your Docker environment.

    aws ecr get-login
  3. Copy the output and run it. The output should look something like the following.

    docker login -u AWS -p password -e none https://your-aws-account-id.dkr.ecr.region.amazonaws.com
  4. Tag the image you created with the Amazon ECR repository tag.

    docker tag your-image-id your-aws-account-id.dkr.ecr.region.amazonaws.com/octave-moment
  5. Push the image to Amazon ECR.

    docker push your-aws-account-id.dkr.ecr.region.amazonaws.com/octave-moment
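
The aws ecr get-login command used above is available only in version 1 of the AWS CLI. If you prefer to script this step, the following boto3 sketch creates the repository and retrieves equivalent Docker login credentials; it assumes your default credentials and Region are already configured, and error handling is kept minimal.

    import base64
    import boto3

    ecr = boto3.client('ecr')

    # Create the repository; ignore the error if it already exists.
    try:
        ecr.create_repository(repositoryName='octave-moment')
    except ecr.exceptions.RepositoryAlreadyExistsException:
        pass

    # The authorization token is base64-encoded "AWS:<password>".
    auth = ecr.get_authorization_token()['authorizationData'][0]
    username, password = base64.b64decode(auth['authorizationToken']).decode().split(':')
    registry = auth['proxyEndpoint']

    # Print a docker login command equivalent to the one shown above.
    print('docker login -u {} -p {} {}'.format(username, password, registry))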

Step 3: Upload your sample data to an Amazon S3 bucket

  1. Download the following to a file named input.txt.

    0.857549 -0.987565 -0.467288 -0.252233 -2.298007
    0.030077 -1.243324 -0.692745 0.563276 0.772901
    -0.508862 -0.404303 -1.363477 -1.812281 -0.296744
    -0.203897 0.746533 0.048276 0.075284 0.125395
    0.829358 1.246402 -1.310275 -2.737117 0.024629
    1.206120 0.895101 1.075549 1.897416 1.383577
  2. Create an Amazon S3 bucket called octave-sample-data-your-aws-account-id.

  3. Upload the file input.txt to the Amazon S3 bucket you just created. You should now have a bucket named octave-sample-data-your-aws-account-id that contains the input.txt file.
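
If you prefer to script this step, the following boto3 sketch creates the bucket and uploads input.txt. Bucket names must be globally unique, and outside the us-east-1 Region create_bucket also needs a LocationConstraint, so the Region handling here is an assumption about your setup.

    import boto3

    # Replace your-aws-account-id as in the steps above.
    bucket_name = 'octave-sample-data-your-aws-account-id'
    region = boto3.session.Session().region_name

    s3 = boto3.client('s3', region_name=region)
    if region == 'us-east-1':
        s3.create_bucket(Bucket=bucket_name)
    else:
        s3.create_bucket(Bucket=bucket_name,
                         CreateBucketConfiguration={'LocationConstraint': region})

    # Upload the sample data file from step 1.
    s3.upload_file('input.txt', bucket_name, 'input.txt')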

Step 4: Create a container execution role

  1. Copy the following to a file named role1.json. Replace your-aws-account-id with your Amazon account ID and aws-region with the Amazon region of your Amazon resources.

    Note

    This example includes a global condition context key to protect against the confused deputy security problem. For more information, see Cross-service confused deputy prevention.

    { "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Principal": { "Service": [ "sagemaker.amazonaws.com", "iotanalytics.amazonaws.com" ] }, "Action": "sts:AssumeRole", "Condition": { "StringEquals": { "aws:SourceAccount": "your-aws-account-id" }, "ArnLike": { "aws:SourceArn": "arn:aws:iotanalytics:aws-region:your-aws-account-id:dataset/DOC-EXAMPLE-DATASET" } } ] }
  2. Create a role that gives access permissions to SageMaker and Amazon IoT Analytics, using the file role1.json that you just created.

    aws iam create-role --role-name container-execution-role --assume-role-policy-document file://role1.json
  3. Download the following to a file named policy1.json and replace your-account-id with your account ID (see the second ARN under Statement:Resource).

    { "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "s3:GetBucketLocation", "s3:PutObject", "s3:GetObject", "s3:PutObjectAcl" ], "Resource": [ "arn:aws:s3:::*-dataset-*/*", "arn:aws:s3:::octave-sample-data-your-account-id/*" }, { "Effect": "Allow", "Action": [ "iotanalytics:*" ], "Resource": "*" }, { "Effect": "Allow", "Action": [ "ecr:GetAuthorizationToken", "ecr:GetDownloadUrlForLayer", "ecr:BatchGetImage", "ecr:BatchCheckLayerAvailability", "logs:CreateLogGroup", "logs:CreateLogStream", "logs:DescribeLogStreams", "logs:GetLogEvents", "logs:PutLogEvents" ], "Resource": "*" }, { "Effect": "Allow", "Action": [ "s3:GetBucketLocation", "s3:ListBucket", "s3:ListAllMyBuckets" ], "Resource" : "*" } ] }
  4. Create an IAM policy, using the policy1.json file you just downloaded.

    aws iam create-policy --policy-name ContainerExecutionPolicy --policy-document file://policy1.json
  5. Attach the policy to the role.

    aws iam attach-role-policy --role-name container-execution-role --policy-arn arn:aws:iam::your-account-id:policy/ContainerExecutionPolicy
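
The same role and policy can also be created with boto3 instead of the AWS CLI. The following sketch assumes role1.json and policy1.json are in the current directory and that you have already replaced the account ID and Region placeholders in both files.

    import boto3

    iam = boto3.client('iam')

    # Create the execution role from the trust policy in role1.json.
    with open('role1.json') as f:
        iam.create_role(RoleName='container-execution-role',
                        AssumeRolePolicyDocument=f.read())

    # Create the permissions policy from policy1.json and attach it to the role.
    with open('policy1.json') as f:
        policy = iam.create_policy(PolicyName='ContainerExecutionPolicy',
                                   PolicyDocument=f.read())

    iam.attach_role_policy(RoleName='container-execution-role',
                           PolicyArn=policy['Policy']['Arn'])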

Step 5: Create a dataset with a container action

  1. Download the following to a file named cli-input.json and replace all instances of your-account-id and region with the appropriate values.

    { "datasetName": "octave_dataset", "actions": [ { "actionName": "octave", "containerAction": { "image": "your-account-id.dkr.ecr.region.amazonaws.com/octave-moment", "executionRoleArn": "arn:aws:iam::your-account-id:role/container-execution-role", "resourceConfiguration": { "computeType": "ACU_1", "volumeSizeInGB": 1 }, "variables": [ { "name": "octaveResultS3URI", "outputFileUriValue": { "fileName": "output.mat" } }, { "name": "inputDataS3BucketName", "stringValue": "octave-sample-data-your-account-id" }, { "name": "inputDataS3Key", "stringValue": "input.txt" }, { "name": "order", "stringValue": "3" } ] } } ] }
  2. Create a dataset using the file cli-input.json you just downloaded and edited.

    aws iotanalytics create-dataset --cli-input-json file://cli-input.json
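
The dataset can also be created from Python. The following boto3 sketch loads cli-input.json and passes its contents to create_dataset; the request shape matches the JSON above, and default credentials and Region are assumed.

    import json
    import boto3

    iota = boto3.client('iotanalytics')

    # cli-input.json already has the shape that create_dataset expects.
    with open('cli-input.json') as f:
        request = json.load(f)

    iota.create_dataset(datasetName=request['datasetName'],
                        actions=request['actions'])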

Step 6: Invoke dataset content generation

  1. Run the following command.

    aws iotanalytics create-dataset-content --dataset-name octave_dataset
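
The equivalent boto3 call returns the version ID of the content being generated, which can be useful when you check the result in the next step; a minimal sketch, assuming the dataset created above:

    import boto3

    iota = boto3.client('iotanalytics')

    # Start dataset content generation and keep the returned version ID.
    response = iota.create_dataset_content(datasetName='octave_dataset')
    print(response['versionId'])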

Step 7: Get dataset content

  1. Run the following command.

    aws iotanalytics get-dataset-content --dataset-name octave_dataset --version-id \$LATEST
  2. You might need to wait several minutes until the DatasetContentState is SUCCEEDED.
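
Step 8 assumes that output.mat is available on your local computer. The get-dataset-content response includes a presigned dataURI for each entry, so you can poll until the state is SUCCEEDED and then download the file. The following sketch assumes default credentials and the dataset created above.

    import time
    import urllib.request
    import boto3

    iota = boto3.client('iotanalytics')

    # Poll until the latest dataset content is no longer being created.
    while True:
        content = iota.get_dataset_content(datasetName='octave_dataset',
                                           versionId='$LATEST')
        state = content['status']['state']
        if state != 'CREATING':
            break
        time.sleep(30)

    # On success, each entry carries a presigned URI for the produced file.
    if state == 'SUCCEEDED':
        urllib.request.urlretrieve(content['entries'][0]['dataURI'], 'output.mat')
    else:
        print('Content generation failed: {}'.format(content['status'].get('reason')))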

Step 8: Print the output on Octave

  1. Use the Octave shell to print the output from the container by running the following command.

    bash> octave
    octave> load output.mat
    octave> disp(M)
    -0.016393  -0.098061   0.380311  -0.564377  -1.318744