Amazon EMR
Step 1: Gather data about the issue

The first step in troubleshooting a cluster is to gather information about what went wrong and about the cluster's current status and configuration. You will use this information in the following steps to confirm or rule out possible causes of the issue.


Start with a clear definition of the problem. Some questions to ask yourself:

  • What did I expect to happen? What happened instead?

  • When did this problem first occur? How often has it happened since?

  • Has anything changed in how I configure or run my cluster?


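The following cluster details are helpful in tracking down issues. For more information on how to gather this information, see View cluster status and details.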

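  • The identifier of the cluster. (Also called a job flow identifier.)

  • The region and Availability Zone the cluster was launched into.

  • The state of the cluster, including details of the last state change.

  • The type and number of EC2 instances specified for the master, core, and task nodes.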
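
The details listed above can be collected into a single record for a troubleshooting note. The following is a minimal sketch, not an official procedure. It assumes the boto3 EMR client and its `describe_cluster` and `list_instance_groups` calls, and it assumes the cluster uses instance groups rather than instance fleets. The function names `summarize_cluster` and `fetch_and_summarize` are hypothetical helpers introduced here for illustration.

```python
from typing import Any, Dict, List


def summarize_cluster(cluster: Dict[str, Any],
                      instance_groups: List[Dict[str, Any]],
                      region: str) -> Dict[str, Any]:
    """Extract the troubleshooting details from EMR API response fragments.

    `cluster` is assumed to be the "Cluster" dict from describe_cluster, and
    `instance_groups` the "InstanceGroups" list from list_instance_groups.
    """
    status = cluster.get("Status", {})
    reason = status.get("StateChangeReason", {})
    ec2 = cluster.get("Ec2InstanceAttributes", {})

    # Group instance types and counts by node role (MASTER, CORE, TASK).
    nodes: Dict[str, List[Dict[str, Any]]] = {}
    for group in instance_groups:
        nodes.setdefault(group["InstanceGroupType"], []).append({
            "instance_type": group["InstanceType"],
            "requested": group.get("RequestedInstanceCount"),
            "running": group.get("RunningInstanceCount"),
        })

    return {
        "cluster_id": cluster.get("Id"),
        "region": region,
        "availability_zone": ec2.get("Ec2AvailabilityZone"),
        "state": status.get("State"),
        "last_state_change_code": reason.get("Code"),
        "last_state_change_message": reason.get("Message"),
        "nodes": nodes,
    }


def fetch_and_summarize(cluster_id: str, region: str) -> Dict[str, Any]:
    """Hypothetical wrapper: fetch the details with boto3, then summarize.

    Requires boto3 and valid AWS credentials; pagination of instance
    groups is ignored in this sketch.
    """
    import boto3  # imported lazily so summarize_cluster works without it

    emr = boto3.client("emr", region_name=region)
    cluster = emr.describe_cluster(ClusterId=cluster_id)["Cluster"]
    groups = emr.list_instance_groups(ClusterId=cluster_id)["InstanceGroups"]
    return summarize_cluster(cluster, groups, region)
```

Calling `fetch_and_summarize` with your own cluster identifier and region returns a dictionary that can be pasted into a support case or troubleshooting log. Keeping `summarize_cluster` separate from the API calls is a design choice in this sketch, so the extraction logic can be checked against saved responses without AWS access.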