Step 8: Use a blueprint to create a workflow
To read the CloudTrail logs, understand their structure, and create the appropriate tables in the Data Catalog, we need to set up a workflow that consists of Amazon Glue crawlers, jobs, and triggers. Lake Formation blueprints simplify this process.
The workflow generates the jobs, crawlers, and triggers that discover and ingest data into your data lake. You create a workflow based on one of the predefined Lake Formation blueprints.
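Once the workflow from the steps below exists, one way to see which jobs, crawlers, and triggers it generated is to read its graph from Glue. The following is a minimal sketch, assuming boto3 is installed, credentials are configured, and the workflow name matches the one chosen later in this step. The helper names `summarize_workflow_nodes` and `fetch_workflow` are hypothetical and not part of any blueprint.

```python
from collections import Counter


def summarize_workflow_nodes(workflow_response):
    """Count the nodes of a Glue workflow graph by type.

    Expects the dict shape returned by Glue GetWorkflow when the graph
    is included: Workflow -> Graph -> Nodes, each node carrying a "Type"
    such as CRAWLER, JOB, or TRIGGER.
    """
    nodes = (
        workflow_response.get("Workflow", {})
        .get("Graph", {})
        .get("Nodes", [])
    )
    return dict(Counter(node["Type"] for node in nodes))


def fetch_workflow(name):
    """Fetch a workflow and its graph (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the summary helper works without boto3

    glue = boto3.client("glue")
    return glue.get_workflow(Name=name, IncludeGraph=True)
```

For example, `summarize_workflow_nodes(fetch_workflow("lakeformationcloudtrailtest"))` would return a dictionary mapping each node type to how many nodes of that type the workflow contains.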
-
In the Lake Formation console, in the navigation pane, choose Blueprints, and then choose Use blueprint.
-
On the Use a blueprint page, under Blueprint type, choose Amazon CloudTrail.
-
Under Import source, choose a CloudTrail source and start date.
-
Under Import target, specify these parameters:
Target database lakeformation_cloudtrail
Target storage location s3://<yourName>-datalake-cloudtrail
Data format Parquet
-
For import frequency, choose Run on demand.
-
Under Import options, specify these parameters:
Workflow name lakeformationcloudtrailtest
IAM role LakeFormationWorkflowRole
Table prefix cloudtrailtest
Note
The table prefix must be lowercase.
-
Choose Create, and wait for the console to report that the workflow was successfully created.
Tip
Did you get the following error message?
User: arn:aws:iam::<account-id>:user/<datalake_administrator_user> is not authorized to perform: iam:PassRole on resource:arn:aws:iam::<account-id>:role/LakeFormationWorkflowRole...
If so, check that you replaced <account-id> in the inline policy for the data lake administrator user with a valid Amazon account number.
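The check described in this tip can also be done in a few lines of code before attaching the policy. The following is a minimal sketch, assuming the inline policy is a standard IAM policy document whose `Resource` entries are role ARNs. That shape is an assumption, since the policy itself is not shown in this step, and `find_bad_role_arns` is a hypothetical helper name.

```python
import re

# Captures the account field of an IAM role ARN, e.g.
# arn:aws:iam::<account>:role/<name>. Other partitions such as aws-cn also match.
ROLE_ARN_PATTERN = re.compile(r"^arn:aws[a-z-]*:iam::([^:]*):role/")


def find_bad_role_arns(policy):
    """Return role ARNs in the policy whose account field is not 12 digits.

    An unreplaced placeholder such as <account-id> is reported here.
    """
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # IAM allows a single statement object
        statements = [statements]

    bad = []
    for statement in statements:
        resources = statement.get("Resource", [])
        if isinstance(resources, str):  # IAM allows a single resource string
            resources = [resources]
        for arn in resources:
            match = ROLE_ARN_PATTERN.match(arn)
            if match and not re.fullmatch(r"\d{12}", match.group(1)):
                bad.append(arn)
    return bad
```

An empty list means every role ARN in the document carries a 12-digit account number. It does not confirm that the number is the right account.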