View Persistent Application User Interfaces - Amazon EMR

View Persistent Application User Interfaces

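Starting with Amazon EMR version 5.25.0, you can connect to persistent Spark History Server application details hosted off-cluster by using the cluster Summary page or the Application user interfaces tab in the console. The Tez UI and YARN timeline server persistent application interfaces are available starting with Amazon EMR version 5.30.1. One-click link access to persistent application history provides the following benefits: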

  • You can quickly analyze and troubleshoot active jobs and job history without setting up a web proxy through an SSH connection.

  • You can access application history and relevant log files for active and terminated clusters. The logs are available for 30 days after the application ends.

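On the Application user interfaces tab or the cluster Summary page in the console for an Amazon EMR 5.30.1 or 6.x cluster, choose the YARN timeline server, Tez UI, or Spark history server link.

The application UI opens in a new browser tab. For more information, see Monitoring and Instrumentation.

You can view YARN container logs through the links on the Spark history server, YARN timeline server, and Tez UI.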

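Note

To access YARN container logs from the Spark history server, YARN timeline server, and Tez UI, you must enable logging to Amazon S3 for your cluster. If logging is not enabled, the links to YARN container logs will not work.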
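
One way to confirm that a cluster has Amazon S3 logging enabled before relying on these links is to inspect the `LogUri` field returned by `aws emr describe-cluster`. The following is a minimal sketch, not an official procedure: the function name `has_s3_logging` is hypothetical, and it assumes the AWS CLI is installed and configured with permission to call DescribeCluster.

```shell
# Hypothetical helper: prints the log URI and succeeds if the cluster
# has an Amazon S3 log location configured; fails otherwise.
# Assumes the AWS CLI is installed and credentials are configured.
has_s3_logging() {
  cluster_id="$1"
  # With --output text, a missing LogUri is printed as "None".
  log_uri=$(aws emr describe-cluster \
    --cluster-id "$cluster_id" \
    --query 'Cluster.LogUri' \
    --output text) || return 2
  case "$log_uri" in
    # EMR may report the location with either scheme.
    s3://*|s3n://*) echo "$log_uri" ;;
    *) return 1 ;;
  esac
}
```

For example, `has_s3_logging j-XXXXXXXXXXXXX || echo "logging is not enabled"` (with a placeholder cluster ID) reports whether the container log links can be expected to work.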

Logs Collection

To enable one-click access to persistent application user interfaces, Amazon EMR collects two types of logs:

  • Application event logs are collected into an EMR system bucket. The event logs are encrypted at rest using Server-Side Encryption with Amazon S3 Managed Keys (SSE-S3). If you use a private subnet for your cluster, make sure to include “arn:aws:s3:::prod.MyRegion.appinfo.src/*” in the resource list of the Amazon S3 policy for the private subnet. For more information, see Minimum Amazon S3 Policy for Private Subnet.

  • YARN container logs are collected into an Amazon S3 bucket that you own. You must enable logging for your cluster to access YARN container logs. For more information, see Configure Cluster Logging and Debugging.
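
For the private subnet case in the first bullet, the resource ARN can be derived from the cluster's Region. The sketch below assumes that `MyRegion` in the ARN is a placeholder for the Region code (for example, `us-west-2`); that reading is an assumption, and the function name `appinfo_bucket_arn` is hypothetical.

```shell
# Hypothetical helper: builds the EMR system bucket resource ARN for a Region.
# Assumption: "MyRegion" in the documented ARN stands for the Region code.
appinfo_bucket_arn() {
  region="$1"
  # Refuse to build an ARN without a Region code.
  [ -n "$region" ] || return 1
  printf 'arn:aws:s3:::prod.%s.appinfo.src/*\n' "$region"
}
```

The printed value is what you would add to the resource list of the Amazon S3 policy for the private subnet.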

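If you need to disable this feature for privacy reasons, you can stop the daemon by using a bootstrap script when you create the cluster, as the following example demonstrates.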

aws emr create-cluster --name "Stop Application UI Support" --release-label emr-5.30.1 \
--applications Name=Hadoop Name=Spark --ec2-attributes KeyName=keyname \
--instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m3.xlarge InstanceGroupType=CORE,InstanceCount=1,InstanceType=m3.xlarge InstanceGroupType=TASK,InstanceCount=1,InstanceType=m3.xlarge \
--use-default-roles \
--bootstrap-actions Path=s3://elasticmapreduce/bootstrap-actions/run-if,Args=["instance.isMaster=true","echo Stop Application UI | sudo tee /etc/apppusher/run-apppusher"]

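After you run this bootstrap script, Amazon EMR will not collect any Spark History Server or YARN timeline server event logs into the EMR system bucket. No application history information will be available on the Application user interfaces tab, and you will lose access to all application user interfaces from the console.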

Considerations and Limitations

One-click access to persistent application user interfaces currently has the following limitations:

  • There is a delay of at least two minutes before the application details show up on the Spark History Server UI.

  • This feature works only when the event log directory for the application is in HDFS. By default, Amazon EMR stores event logs in a directory of HDFS. If you change the default directory to a different file system, such as Amazon S3, this feature will not work.

  • This feature is currently not available for EMR clusters with multiple master nodes or for EMR clusters integrated with AWS Lake Formation.

  • To enable one-click access to persistent application user interfaces, you must have permission to the DescribeCluster action for EMR. If you deny an IAM principal's permission to this action, it takes approximately five minutes for the permission change to propagate.

  • If you reconfigure applications in a running cluster, the application history will not be available through the application UI.

  • For each AWS account, the number of active application UIs cannot exceed 50.

  • You can access application UIs from the console in the US East (N. Virginia and Ohio), US West (N. California and Oregon), Canada (Central), EU (Frankfurt, Ireland, London, and Paris), Asia Pacific (Mumbai, Osaka, Seoul, Singapore, Sydney, and Tokyo), and South America (São Paulo) Regions.
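
Two of the limitations above lend themselves to a quick preflight check: the Region list and the requirement that the event log directory is in HDFS. The sketch below is illustrative only. The function names are hypothetical, the Region codes are the standard codes for the Regions named in the last bullet, and the list reflects only what this page states.

```shell
# Hypothetical helper: succeeds if the Region code matches one of the
# Regions listed on this page for console access to application UIs.
region_supported() {
  case "$1" in
    us-east-1|us-east-2|us-west-1|us-west-2|ca-central-1) return 0 ;;
    eu-central-1|eu-west-1|eu-west-2|eu-west-3) return 0 ;;
    ap-south-1|ap-northeast-1|ap-northeast-2|ap-northeast-3) return 0 ;;
    ap-southeast-1|ap-southeast-2|sa-east-1) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: succeeds only for an explicit hdfs:// URI.
# Paths without a scheme resolve against the cluster's default file
# system and are not handled by this simple check.
event_log_dir_in_hdfs() {
  case "$1" in
    hdfs://*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For instance, `region_supported eu-west-3 && event_log_dir_in_hdfs "hdfs:///var/log/spark/apps"` succeeds, whereas an event log directory such as `s3://my-bucket/spark-events` (a made-up bucket name) fails the second check.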