Prerequisites for developing blueprints
To develop blueprints, you should be familiar with using Amazon Glue and writing scripts for Apache Spark ETL jobs or Python shell jobs. In addition, you must complete the following setup tasks.
-
Download four Amazon Python libraries to use in your blueprint layout scripts.
-
Set up the Amazon SDKs.
-
Set up the Amazon CLI.
Download the Python libraries
Download the following libraries from GitHub, and install them into your project:
-
https://github.com/awslabs/aws-glue-blueprint-libs/tree/master/awsglue/blueprint/base_resource.py
-
https://github.com/awslabs/aws-glue-blueprint-libs/tree/master/awsglue/blueprint/workflow.py
-
https://github.com/awslabs/aws-glue-blueprint-libs/tree/master/awsglue/blueprint/crawler.py
-
https://github.com/awslabs/aws-glue-blueprint-libs/tree/master/awsglue/blueprint/job.py
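As a rough sketch, the four files listed above could be fetched with the Python standard library. The snippet assumes that GitHub serves raw file contents from raw.githubusercontent.com under the same owner, repository, branch, and path. The helper names to_raw_url and download_all and the target directory are illustrative only and are not part of the blueprint libraries.

```python
import os
import urllib.request

# The four library pages listed above.
BASE = "https://github.com/awslabs/aws-glue-blueprint-libs/tree/master/awsglue/blueprint/"
LIBRARY_URLS = [BASE + name for name in
                ("base_resource.py", "workflow.py", "crawler.py", "job.py")]


def to_raw_url(page_url):
    """Map a github.com tree/blob page URL to a raw-content URL (assumed layout)."""
    prefix = "https://github.com/"
    if not page_url.startswith(prefix):
        raise ValueError("not a github.com URL: " + page_url)
    # owner / repo / "tree" or "blob" / branch and file path
    owner, repo, kind, rest = page_url[len(prefix):].split("/", 3)
    if kind not in ("tree", "blob"):
        raise ValueError("unexpected URL layout: " + page_url)
    return "https://raw.githubusercontent.com/{}/{}/{}".format(owner, repo, rest)


def download_all(target_dir):
    """Download every library file into target_dir and return the local paths."""
    os.makedirs(target_dir, exist_ok=True)
    saved = []
    for page_url in LIBRARY_URLS:
        dest = os.path.join(target_dir, os.path.basename(page_url))
        urllib.request.urlretrieve(to_raw_url(page_url), dest)  # network access
        saved.append(dest)
    return saved
```

Calling download_all with a directory inside your project (for example, a hypothetical awsglue/blueprint folder) would place the four files there. Nothing is downloaded until the function is called.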
Set up the Amazon Java SDK
For the Amazon Java SDK, you must add a jar file that includes the API for blueprints.
-
If you haven't already done so, set up the Amazon SDK for Java.
-
For Java 1.x, follow the instructions in Set up the Amazon SDK for Java in the Amazon SDK for Java Developer Guide.
-
For Java 2.x, follow the instructions in Setting up the Amazon SDK for Java 2.x in the Amazon SDK for Java 2.x Developer Guide.
-
Download the client jar file that has access to the APIs for blueprints.
-
For Java 1.x: s3://awsglue-custom-blueprints-preview-artifacts/awsglue-java-sdk-preview/AWSGlueJavaClient-1.11.x.jar
-
For Java 2.x: s3://awsglue-custom-blueprints-preview-artifacts/awsglue-java-sdk-v2-preview/AwsJavaSdk-Glue-2.0.jar
-
Add the client jar to the front of the Java classpath to override the Amazon Glue client provided by the Amazon Java SDK.
export CLASSPATH=<path-to-preview-client-jar>:$CLASSPATH
-
(Optional) Test the SDK with the following Java application. The application should output an empty list. Replace accessKey and secretKey with your credentials, and replace us-east-1 with your Region.
import com.amazonaws.auth.AWSCredentials;
import com.amazonaws.auth.AWSCredentialsProvider;
import com.amazonaws.auth.AWSStaticCredentialsProvider;
import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.services.glue.AWSGlue;
import com.amazonaws.services.glue.AWSGlueClientBuilder;
import com.amazonaws.services.glue.model.ListBlueprintsRequest;

public class App {
    public static void main(String[] args) {
        AWSCredentials credentials = new BasicAWSCredentials("accessKey", "secretKey");
        AWSCredentialsProvider provider = new AWSStaticCredentialsProvider(credentials);
        AWSGlue glue = AWSGlueClientBuilder.standard()
                .withCredentials(provider)
                .withRegion("us-east-1")
                .build();
        ListBlueprintsRequest request = new ListBlueprintsRequest().withMaxResults(2);
        System.out.println(glue.listBlueprints(request));
    }
}
Set up the Amazon Python SDK
The following steps assume that you have Python version 2.7 or later, or version 3.6 or later installed on your computer.
-
Download the following boto3 wheel file. If prompted to open or save, save the file. s3://awsglue-custom-blueprints-preview-artifacts/aws-python-sdk-preview/boto3-1.17.31-py2.py3-none-any.whl
-
Download the following botocore wheel file: s3://awsglue-custom-blueprints-preview-artifacts/aws-python-sdk-preview/botocore-1.20.31-py2.py3-none-any.whl
-
Check your Python version.
python --version
-
Depending on your Python version, enter the following commands (for Linux):
-
For Python 2.7 or later:
python3 -m pip install --user virtualenv
python3 -m virtualenv env
source env/bin/activate
-
For Python 3.6 or later:
python3 -m venv python-sdk-test
source python-sdk-test/bin/activate
-
Install the botocore wheel file.
python3 -m pip install <download-directory>/botocore-1.20.31-py2.py3-none-any.whl
-
Install the boto3 wheel file.
python3 -m pip install <download-directory>/boto3-1.17.31-py2.py3-none-any.whl
-
Configure your credentials and default Region in the ~/.aws/credentials and ~/.aws/config files. For more information, see Configuring the Amazon CLI in the Amazon Command Line Interface User Guide.
-
(Optional) Test your setup. The following commands should return an empty list. Replace us-east-1 with your Region.
$ python
>>> import boto3
>>> glue = boto3.client('glue', 'us-east-1')
>>> glue.list_blueprints()
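The interactive check above can also be written as a small script. The following is a minimal sketch, assuming the preview boto3 and botocore wheels are installed in the active environment and that the response carries the blueprint names under a Blueprints key. The function names supports_blueprints and check_setup are illustrative and not part of boto3.

```python
def supports_blueprints(client):
    """Return True if the client object exposes a callable list_blueprints."""
    return callable(getattr(client, "list_blueprints", None))


def check_setup(region):
    """Create a Glue client in the given Region and return the blueprint names."""
    # Imported lazily so that the helper above has no third-party dependency.
    import boto3

    glue = boto3.client("glue", region_name=region)
    if not supports_blueprints(glue):
        # An SDK build without the blueprint APIs would not have this method.
        raise RuntimeError("the installed boto3 does not expose list_blueprints")
    # Assumption: the names are returned under the "Blueprints" key.
    return glue.list_blueprints().get("Blueprints", [])
```

With credentials configured, check_setup("us-east-1") would be expected to return an empty list in an account that has no blueprints yet. The RuntimeError path is meant to catch the case where a different boto3 build is picked up instead of the preview wheel.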
Set up the preview Amazon CLI
-
If you haven't already done so, install or update the Amazon Command Line Interface (Amazon CLI) on your computer. The easiest way to do this is with pip, the Python package installer:
pip install awscli --upgrade --user
For complete installation instructions, see Installing the Amazon Command Line Interface.
-
Download the Amazon CLI wheel file from: s3://awsglue-custom-blueprints-preview-artifacts/awscli-preview-build/awscli-1.19.31-py2.py3-none-any.whl
-
Install the Amazon CLI wheel file.
python3 -m pip install awscli-1.19.31-py2.py3-none-any.whl
-
Run the aws configure command. Configure your Amazon credentials (including access key and secret key) and Amazon Region. For more information, see Configuring the Amazon CLI.
-
Test the Amazon CLI. The following command should return an empty list. Replace us-east-1 with your Region.
aws glue list-blueprints --region us-east-1
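If you want to run the same CLI check from a script, one possible approach is to call the aws executable through subprocess and parse its JSON output. This is a sketch under a few assumptions: the preview CLI is on your PATH, Python 3.7 or later is used (for capture_output), and the command prints a JSON object with a Blueprints key. The helper names are hypothetical.

```python
import json
import subprocess


def build_command(region):
    """Build the argument list for the CLI test command shown above."""
    return ["aws", "glue", "list-blueprints",
            "--region", region, "--output", "json"]


def parse_blueprints(cli_output):
    """Extract the blueprint names from the CLI's JSON output (assumed shape)."""
    return json.loads(cli_output).get("Blueprints", [])


def list_blueprints(region):
    """Run the CLI command and return the parsed list of blueprint names."""
    result = subprocess.run(build_command(region), check=True,
                            capture_output=True, text=True)
    return parse_blueprints(result.stdout)
```

Passing the arguments as a list avoids shell quoting issues, and check=True raises an exception if the CLI exits with an error, for example when credentials are missing.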