Troubleshoot Amazon EC2 Auto Scaling

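Amazon EC2 Auto Scaling provides specific, descriptive error messages to help you troubleshoot issues. You can find these error messages in the descriptions of scaling activities.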

Retrieve an error message from scaling activities

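To retrieve an error message from the description of a scaling activity, use the describe-scaling-activities command. Scaling activity records go back six weeks. Scaling activities are ordered by start time, with the most recent listed first.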

Note

In the Amazon EC2 Auto Scaling console, scaling activities are also shown in the activity history on the Activity tab of the Auto Scaling group.

To see the scaling activities for a specific Auto Scaling group, use the following command.

aws autoscaling describe-scaling-activities --auto-scaling-group-name my-asg

In the following example response, StatusCode contains the current status of the activity, and StatusMessage contains the error message.

{
    "Activities": [
        {
            "ActivityId": "3b05dbf6-037c-b92f-133f-38275269dc0f",
            "AutoScalingGroupName": "my-asg",
            "Description": "Launching a new EC2 instance: i-003a5b3ffe1e9358e. Status Reason: Instance failed to complete user's Lifecycle Action: Lifecycle Action with token e85eb647-4fe0-4909-b341-a6c42d8aba1f was abandoned: Lifecycle Action Completed with ABANDON Result",
            "Cause": "At 2021-01-11T00:35:52Z a user request created an AutoScalingGroup changing the desired capacity from 0 to 1. At 2021-01-11T00:35:53Z an instance was started in response to a difference between desired and actual capacity, increasing the capacity from 0 to 1.",
            "StartTime": "2021-01-11T00:35:55.542Z",
            "EndTime": "2021-01-11T01:06:31Z",
            "StatusCode": "Cancelled",
            "StatusMessage": "Instance failed to complete user's Lifecycle Action: Lifecycle Action with token e85eb647-4fe0-4909-b341-a6c42d8aba1f was abandoned: Lifecycle Action Completed with ABANDON Result",
            "Progress": 100,
            "Details": "{\"Subnet ID\":\"subnet-5ea0c127\",\"Availability Zone\":\"us-west-2b\"...}",
            "AutoScalingGroupARN": "arn:aws:autoscaling:us-west-2:123456789012:autoScalingGroup:283179a2-f3ce-423d-93f6-66bb518232f7:autoScalingGroupName/my-asg"
        },
        ...
    ]
}

For a description of the fields in the output, see Activity in the Amazon EC2 Auto Scaling API Reference.
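As a rough illustration of how StatusCode and StatusMessage can be used together, the following Python sketch reads the JSON printed by describe-scaling-activities and lists the activities that did not succeed. It uses only the standard library. The function name unsuccessful_activities, the set of status codes treated as unsuccessful, and the shortened sample data are assumptions made for illustration.

```python
import json

# Status codes treated as "did not succeed" in this sketch (an assumption;
# adjust the set to the codes you care about).
UNSUCCESSFUL = {"Failed", "Cancelled"}


def unsuccessful_activities(response_text):
    """Return (StartTime, StatusCode, StatusMessage) for unsuccessful activities.

    response_text is the JSON document printed by
    `aws autoscaling describe-scaling-activities`.
    """
    response = json.loads(response_text)
    rows = []
    for activity in response.get("Activities", []):
        if activity.get("StatusCode") in UNSUCCESSFUL:
            rows.append((
                activity.get("StartTime"),
                activity.get("StatusCode"),
                # StatusMessage may be absent, so fall back to an empty string.
                activity.get("StatusMessage", ""),
            ))
    return rows


# Hypothetical, shortened sample in the same shape as the example response.
SAMPLE = json.dumps({
    "Activities": [
        {"StartTime": "2021-01-11T00:35:55.542Z", "StatusCode": "Cancelled",
         "StatusMessage": "Lifecycle Action Completed with ABANDON Result"},
        {"StartTime": "2021-01-10T22:10:03.120Z", "StatusCode": "Successful"},
    ]
})

if __name__ == "__main__":
    for start, code, message in unsuccessful_activities(SAMPLE):
        print(f"{start}  {code}: {message}")
```

To try it on real output, one option is to redirect the command's output to a file and pass the file's contents to unsuccessful_activities.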

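To view scaling activities for a deleted group

To view scaling activities after the Auto Scaling group has been deleted, add the --include-deleted-groups option to the describe-scaling-activities command, as follows.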

aws autoscaling describe-scaling-activities --auto-scaling-group-name my-asg --include-deleted-groups

The following is an example response that contains a scaling activity for a deleted group.

{
    "Activities": [
        {
            "ActivityId": "e1f5de0e-f93e-1417-34ac-092a76fba220",
            "AutoScalingGroupName": "my-asg",
            "Description": "Launching a new EC2 instance. Status Reason: Your Spot request price of 0.001 is lower than the minimum required Spot request fulfillment price of 0.0031. Launching EC2 instance failed.",
            "Cause": "At 2021-01-13T20:47:24Z a user request update of AutoScalingGroup constraints to min: 1, max: 5, desired: 3 changing the desired capacity from 0 to 3. At 2021-01-13T20:47:27Z an instance was started in response to a difference between desired and actual capacity, increasing the capacity from 0 to 3.",
            "StartTime": "2021-01-13T20:47:30.094Z",
            "EndTime": "2021-01-13T20:47:30Z",
            "StatusCode": "Failed",
            "StatusMessage": "Your Spot request price of 0.001 is lower than the minimum required Spot request fulfillment price of 0.0031. Launching EC2 instance failed.",
            "Progress": 100,
            "Details": "{\"Subnet ID\":\"subnet-5ea0c127\",\"Availability Zone\":\"us-west-2b\"...}",
            "AutoScalingGroupState": "Deleted",
            "AutoScalingGroupARN": "arn:aws:autoscaling:us-west-2:123456789012:autoScalingGroup:283179a2-f3ce-423d-93f6-66bb518232f7:autoScalingGroupName/my-asg"
        },
        ...
    ]
}
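In the example responses, Details is a JSON-encoded string rather than a nested object, so reading it takes a second decoding step. The following standard-library Python sketch decodes it and counts failed activities per Availability Zone. The helper name failures_by_zone and the sample values are hypothetical. The "Availability Zone" key is taken from the example response, and the sketch assumes it may be missing.

```python
import json
from collections import Counter


def failures_by_zone(activities):
    """Count activities whose StatusCode is "Failed", keyed by Availability Zone.

    activities is the list stored under "Activities" in the parsed response.
    """
    counts = Counter()
    for activity in activities:
        if activity.get("StatusCode") != "Failed":
            continue
        raw = activity.get("Details")
        try:
            # Details is a string that itself contains JSON.
            details = json.loads(raw) if raw else {}
        except json.JSONDecodeError:
            details = {}
        if not isinstance(details, dict):
            details = {}
        counts[details.get("Availability Zone", "unknown")] += 1
    return counts


# Hypothetical sample data; real activities carry many more fields.
SAMPLE_ACTIVITIES = [
    {"StatusCode": "Failed", "AutoScalingGroupState": "Deleted",
     "Details": json.dumps({"Subnet ID": "subnet-5ea0c127",
                            "Availability Zone": "us-west-2b"})},
    {"StatusCode": "Failed",
     "Details": json.dumps({"Availability Zone": "us-west-2b"})},
    {"StatusCode": "Successful",
     "Details": json.dumps({"Availability Zone": "us-west-2a"})},
    {"StatusCode": "Failed"},  # no Details field at all
]

if __name__ == "__main__":
    for zone, count in failures_by_zone(SAMPLE_ACTIVITIES).items():
        print(zone, count)
```

Grouping by zone is only one possible choice. The same decoding step applies to any other key that appears in Details.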

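Turn off scaling activities

If you need to investigate an issue without interference from scaling policies or scheduled actions, you have the following options: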

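  • Prevent all dynamic scaling policies and scheduled actions from changing the group's desired capacity by suspending the AlarmNotification and ScheduledActions processes. For more information, see Suspend and resume Amazon EC2 Auto Scaling processes.

  • Disable individual dynamic scaling policies so that they don't change the group's desired capacity in response to changes in load. For more information, see Disable a scaling policy for an Auto Scaling group.

  • Update an individual target tracking scaling policy so that it only scales out (adds capacity) by disabling the policy's scale-in portion. This approach prevents the group's desired capacity from shrinking but still allows it to grow when load increases. For more information, see Target tracking scaling policies for Amazon EC2 Auto Scaling.

  • Update your predictive scaling policy to forecast only mode. In forecast only mode, predictive scaling continues to generate forecasts but does not automatically increase capacity. For more information, see Create a predictive scaling policy.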
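The last two options each come down to one field in the policy's configuration: DisableScaleIn for a target tracking policy and Mode for a predictive scaling policy. As a sketch only, the Python below takes an existing configuration and returns a copy with that field changed. The output could then be saved to a file and supplied to aws autoscaling put-scaling-policy through --target-tracking-configuration or --predictive-scaling-configuration. The two sample configurations, including the metric types and the target value of 50.0, are assumptions for illustration. In practice you would start from your own policy's current settings.

```python
import json


def scale_out_only(target_tracking_config):
    """Return a copy of a target tracking configuration with scale-in disabled."""
    updated = dict(target_tracking_config)
    updated["DisableScaleIn"] = True
    return updated


def forecast_only(predictive_config):
    """Return a copy of a predictive scaling configuration in forecast only mode."""
    updated = dict(predictive_config)
    updated["Mode"] = "ForecastOnly"
    return updated


# Hypothetical existing configurations (assumed values, not from the source).
TARGET_TRACKING = {
    "TargetValue": 50.0,
    "PredefinedMetricSpecification": {
        "PredefinedMetricType": "ASGAverageCPUUtilization"
    },
    "DisableScaleIn": False,
}

PREDICTIVE = {
    "MetricSpecifications": [
        {
            "TargetValue": 50.0,
            "PredefinedMetricPairSpecification": {
                "PredefinedMetricType": "ASGCPUUtilization"
            },
        }
    ],
    "Mode": "ForecastAndScale",
}

if __name__ == "__main__":
    # Print the updated documents so they can be redirected to files.
    print(json.dumps(scale_out_only(TARGET_TRACKING), indent=2))
    print(json.dumps(forecast_only(PREDICTIVE), indent=2))
```

Returning a copy instead of editing in place keeps the original settings available, which makes it easier to restore them once the investigation is finished.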

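Additional troubleshooting resources

The following pages provide additional information for troubleshooting issues with Amazon EC2 Auto Scaling.

The following Amazon resources might also be helpful: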

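Troubleshooting often requires iterative query and discovery by an expert or by a community of helpers. If you continue to experience issues after trying the suggestions in this section, contact Amazon Web Services Support (in the Amazon Web Services Management Console, choose Support, Support Center) or ask a question on AWS re:Post using the Amazon EC2 Auto Scaling tag.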