笔记本执行 CLI 命令示例 - Amazon EMR
Amazon Web Services 文档中描述的 Amazon Web Services 服务或功能可能因区域而异。要查看适用于中国区域的差异,请参阅 中国的 Amazon Web Services 服务入门 (PDF)

笔记本执行 CLI 命令示例

注意

EMR Notebooks 在新控制台中作为 Amazon EMR Studio Workspaces 提供。您仍然可以在旧控制台中使用现有笔记本,但无法在其中创建新笔记本。新控制台中的创建 Workspace 按钮将取代此功能。要访问或创建 Workspaces,EMR Notebooks 用户需要额外的 IAM 角色权限。有关更多信息,请参阅 Amazon EMR Notebooks are Amazon EMR Studio Workspaces in new console(Amazon EMR Notebooks 在新控制台中为 Amazon EMR Studio Workspaces)和 What's new in the console?(控制台中有哪些新功能?)

以下示例使用 EMR Notebooks 控制台中的笔记本演示。要查找笔记本,请使用相对于主目录的文件路径。在此示例中,您可以运行两个笔记本文件:demo_pyspark.ipynbmy_folder/python3.ipynb

文件 demo_pyspark.ipynb 的相对路径是 demo_pyspark.ipynb,如下所示。

python3.ipynb 的相对路径是 my_folder/python3.ipynb,如下所示。

有关 Amazon EMR API NotebookExecution 操作的更多信息,请参阅 Amazon EMR API actions

运行笔记本

您可以使用 Amazon CLI 通过 start-notebook-execution 操作来运行笔记本,示例如下。

例 – 使用(运行在 Amazon EC2 上的)Amazon EMR 集群在 EMR Studio Workspace 中执行 EMR 笔记本。
aws emr --region us-east-1 \ start-notebook-execution \ --editor-id e-ABCDEFG123456 \ --notebook-params '{"input_param":"my-value", "good_superhero":["superman", "batman"]}' \ --relative-path test.ipynb \ --notebook-execution-name my-execution \ --execution-engine '{"Id" : "j-1234ABCD123"}' \ --service-role EMR_Notebooks_DefaultRole { "NotebookExecutionId": "ex-ABCDEFGHIJ1234ABCD" }
例 – 使用 EMR Notebooks 集群在 EMR Studio Workspace 中执行 EMR 笔记本。
aws emr start-notebook-execution \ --region us-east-1 \ --service-role EMR_Notebooks_DefaultRole \ --environment-variables '{"KERNEL_EXTRA_SPARK_OPTS": "--conf spark.executor.instances=1", "KERNEL_LAUNCH_TIMEOUT": "350"}' \ --output-notebook-format HTML \ --execution-engine Id=arn:aws:emr-containers:us-west-2:account-id:/virtualclusters/ABCDEFG/endpoints/ABCDEF,Type=EMR_ON_EKS,ExecutionRoleArn=arn:aws:iam::account-id:role/execution-role \ --editor-id e-ABCDEFG \ --relative-path EMRonEKS-spark_python.ipynb
例 – 执行 EMR 笔记本来指定其 Amazon S3 位置
aws emr start-notebook-execution \ --region us-east-1 \ --notebook-execution-name my-execution-on-emr-on-eks-cluster \ --service-role EMR_Notebooks_DefaultRole \ --environment-variables '{"KERNEL_EXTRA_SPARK_OPTS": "--conf spark.executor.instances=1", "KERNEL_LAUNCH_TIMEOUT": "350"}' \ --output-notebook-format HTML \ --execution-engine Id=arn:aws:emr-containers:us-west-2:account-id:/virtualclusters/ABCDEF/endpoints/ABCDEF,Type=EMR_ON_EKS,ExecutionRoleArn=arn:aws:iam::account-id:role/execution-role \ --notebook-s3-location '{"Bucket": "your-s3-bucket","Key": "s3-prefix-to-notebook-location/EMRonEKS-spark_python.ipynb"}' \ --output-notebook-s3-location '{"Bucket": "your-s3-bucket","Key": "s3-prefix-for-storing-output-notebook"}'

笔记本输出

下面是示例笔记本的输出内容。单元格 3 显示新注入的参数值。

描述笔记本

您可以使用 describe-notebook-execution 操作以访问关于特定笔记本执行的信息。

aws emr --region us-east-1 \ describe-notebook-execution --notebook-execution-id ex-IZWZZVR9DKQ9WQ7VZWXJZR29UGHTE { "NotebookExecution": { "NotebookExecutionId": "ex-IZWZZVR9DKQ9WQ7VZWXJZR29UGHTE", "EditorId": "e-BKTM2DIHXBEDRU44ANWRKIU8N", "ExecutionEngine": { "Id": "j-2QMOV6JAX1TS2", "Type": "EMR", "MasterInstanceSecurityGroupId": "sg-05ce12e58cd4f715e" }, "NotebookExecutionName": "my-execution", "NotebookParams": "{\"input_param\":\"my-value\", \"good_superhero\":[\"superman\", \"batman\"]}", "Status": "FINISHED", "StartTime": 1593490857.009, "Arn": "arn:aws:elasticmapreduce:us-east-1:123456789012:notebook-execution/ex-IZWZZVR9DKQ9WQ7VZWXJZR29UGHTE", "LastStateChangeReason": "Execution is finished for cluster j-2QMOV6JAX1TS2.", "NotebookInstanceSecurityGroupId": "sg-0683b0a39966d4a6a", "Tags": [] } }

停止运行笔记本

如果您想要停止笔记本正在运行的执行,您可以借助 stop-notebook-execution 命令。

# stop a running execution aws emr --region us-east-1 \ stop-notebook-execution --notebook-execution-id ex-IZWZX78UVPAATC8LHJR129B1RBN4T # describe it aws emr --region us-east-1 \ describe-notebook-execution --notebook-execution-id ex-IZWZX78UVPAATC8LHJR129B1RBN4T { "NotebookExecution": { "NotebookExecutionId": "ex-IZWZX78UVPAATC8LHJR129B1RBN4T", "EditorId": "e-BKTM2DIHXBEDRU44ANWRKIU8N", "ExecutionEngine": { "Id": "j-2QMOV6JAX1TS2", "Type": "EMR" }, "NotebookExecutionName": "my-execution", "NotebookParams": "{\"input_param\":\"my-value\", \"good_superhero\":[\"superman\", \"batman\"]}", "Status": "STOPPED", "StartTime": 1593490876.241, "Arn": "arn:aws:elasticmapreduce:us-east-1:123456789012:editor-execution/ex-IZWZX78UVPAATC8LHJR129B1RBN4T", "LastStateChangeReason": "Execution is stopped for cluster j-2QMOV6JAX1TS2. Internal error", "Tags": [] } }

按开启时间列出笔记本的执行情况

您可以将 --from 参数输入至 list-notebook-executions,按开启时间列出笔记本的执行情况。

# filter by start time aws emr --region us-east-1 \ list-notebook-executions --from 1593400000.000 { "NotebookExecutions": [ { "NotebookExecutionId": "ex-IZWZX78UVPAATC8LHJR129B1RBN4T", "EditorId": "e-BKTM2DIHXBEDRU44ANWRKIU8N", "NotebookExecutionName": "my-execution", "Status": "STOPPED", "StartTime": 1593490876.241 }, { "NotebookExecutionId": "ex-IZWZZVR9DKQ9WQ7VZWXJZR29UGHTE", "EditorId": "e-BKTM2DIHXBEDRU44ANWRKIU8N", "NotebookExecutionName": "my-execution", "Status": "RUNNING", "StartTime": 1593490857.009 }, { "NotebookExecutionId": "ex-IZWZYRS0M14L5V95WZ9OQ399SKMNW", "EditorId": "e-BKTM2DIHXBEDRU44ANWRKIU8N", "NotebookExecutionName": "my-execution", "Status": "STOPPED", "StartTime": 1593490292.995 }, { "NotebookExecutionId": "ex-IZX009ZK83IVY5E33VH8MDMELVK8K", "EditorId": "e-BKTM2DIHXBEDRU44ANWRKIU8N", "NotebookExecutionName": "my-execution", "Status": "FINISHED", "StartTime": 1593489834.765 }, { "NotebookExecutionId": "ex-IZWZXOZF88JWDF9J09GJ91R57VI0N", "EditorId": "e-BKTM2DIHXBEDRU44ANWRKIU8N", "NotebookExecutionName": "my-execution", "Status": "FAILED", "StartTime": 1593488934.688 } ] }

按开启时间和状态列出笔记本的执行情况

list-notebook-executions 命令也可以采用 --status 参数来筛选结果。

# filter by start time and status aws emr --region us-east-1 \ list-notebook-executions --from 1593400000.000 --status FINISHED { "NotebookExecutions": [ { "NotebookExecutionId": "ex-IZWZZVR9DKQ9WQ7VZWXJZR29UGHTE", "EditorId": "e-BKTM2DIHXBEDRU44ANWRKIU8N", "NotebookExecutionName": "my-execution", "Status": "FINISHED", "StartTime": 1593490857.009 }, { "NotebookExecutionId": "ex-IZX009ZK83IVY5E33VH8MDMELVK8K", "EditorId": "e-BKTM2DIHXBEDRU44ANWRKIU8N", "NotebookExecutionName": "my-execution", "Status": "FINISHED", "StartTime": 1593489834.765 } ] }