AWSSupport-SetupIPMonitoringFromVPC
Description
AWSSupport-SetupIPMonitoringFromVPC creates an Amazon Elastic Compute Cloud (Amazon EC2)
instance in the specified subnet and monitors selected target IPs (IPv4 or IPv6) by
continuously running ping, MTR, traceroute and tracetcp tests. The results are
stored in Amazon CloudWatch Logs, and metric filters are applied to quickly visualize
latency and packet loss statistics in a CloudWatch dashboard.
Additional Information
The CloudWatch Logs data can be used for network troubleshooting and for analyzing patterns and trends. Additionally, you can configure CloudWatch alarms with Amazon SNS notifications for when packet loss or latency reaches a threshold. The data can also be used when opening a case with Amazon Web Services Support, to help isolate an issue quickly and reduce time to resolution when investigating a network issue.
Document type
Automation
Owner
Amazon
Platforms
Linux, macOS, Windows
Parameters
-
AutomationAssumeRole
Type: String
Description: (Optional) The Amazon Resource Name (ARN) of the Amazon Identity and Access Management (IAM) role that allows Systems Manager Automation to perform the actions on your behalf. If no role is specified, Systems Manager Automation uses the permissions of the user that starts this runbook.
-
CloudWatchLogGroupNamePrefix
Type: String
Default: /AWSSupport-SetupIPMonitoringFromVPC
Description: (Optional) Prefix used for each CloudWatch log group created for the test results.
-
CloudWatchLogGroupRetentionInDays
Type: String
Valid values: 1 | 3 | 5 | 7 | 14 | 30 | 60 | 90 | 120 | 150 | 180 | 365 | 400 | 545 | 731 | 1827 | 3653
Default: 7
Description: (Optional) Number of days you want to keep the network monitoring results for.
-
InstanceType
Type: String
Valid values: t2.micro | t2.small | t2.medium | t2.large | t3.micro | t3.small | t3.medium | t3.large | t4g.micro | t4g.small | t4g.medium | t4g.large
Default: t3.micro
Description: (Optional) The EC2 instance type for the test instance. Recommended size: t3.micro.
-
SubnetId
Type: String
Description: (Required) The subnet ID for the monitor instance. Be aware that if you specify a private subnet, you must make sure there is internet access so that the monitor instance can set up the test (that is, install the CloudWatch Logs agent and interact with Systems Manager and CloudWatch).
-
TargetIPs
Type: String
Description: (Required) Comma-separated list of IPv4 and/or IPv6 addresses to monitor. No spaces allowed. Maximum size is 255 characters. Be aware that if you provide an invalid IP, the automation fails and rolls back the test setup.
-
TestInstanceSecurityGroupId
Type: String
Description: (Optional) The security group ID for the test instance. If not specified, the automation creates one during the instance creation. Make sure the security group allows outbound access to the monitoring IPs.
-
TestInstanceProfileName
Type: String
Description: (Optional) The name of an existing IAM instance profile for the test instance. If not specified, the automation creates one during the instance creation. The role must have the following permissions:
logs:CreateLogStream, logs:DescribeLogGroups, logs:DescribeLogStreams, and logs:PutLogEvents, and the Amazon managed policy AmazonSSMManagedInstanceCore.
-
TestInterval
Type: String
Description: (Optional) The number of minutes between tests. The default value is 1 minute and the maximum is 10 minutes.
-
RetainDashboardAndLogsOnDeletion
Type: String
Description: (Optional) Specify False to delete the Amazon CloudWatch dashboard and logs when deleting the Amazon CloudFormation stack. The default value is True. By default, the dashboard and logs are retained and must be deleted manually when they are no longer needed.
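As a rough illustration of how the parameters above fit together, the following Python sketch assembles an input map in the shape that Systems Manager Automation expects (each value is a list of strings) and applies the constraints stated in the descriptions: no spaces and at most 255 characters in TargetIPs, and a TestInterval of at most 10 minutes. The helper name build_parameters, the example subnet ID, and the example addresses are hypothetical, and the lower bound of 1 minute for TestInterval is an assumption. The commented call at the end assumes the boto3 SSM client and configured credentials.

```python
def build_parameters(subnet_id, target_ips, test_interval=1,
                     retention_days=7, instance_type="t3.micro"):
    """Return a parameter map for AWSSupport-SetupIPMonitoringFromVPC.

    Hypothetical helper: it only checks the constraints stated in the
    parameter descriptions; the runbook performs its own validation.
    """
    if not target_ips:
        raise ValueError("TargetIPs is required")
    if any(" " in ip for ip in target_ips):
        raise ValueError("TargetIPs must not contain spaces")
    joined = ",".join(target_ips)
    if len(joined) > 255:
        raise ValueError("TargetIPs must be at most 255 characters")
    # Lower bound of 1 is an assumption; the documented maximum is 10.
    if not 1 <= int(test_interval) <= 10:
        raise ValueError("TestInterval must be between 1 and 10 minutes")
    # Automation parameters are passed as a map of name -> list of strings.
    return {
        "SubnetId": [subnet_id],
        "TargetIPs": [joined],
        "TestInterval": [str(test_interval)],
        "CloudWatchLogGroupRetentionInDays": [str(retention_days)],
        "InstanceType": [instance_type],
    }


params = build_parameters("subnet-0123456789abcdef0",
                          ["192.0.2.10", "2001:db8::1"])
print(params)

# To start the runbook (assumes boto3 is installed and credentials are set):
# import boto3
# boto3.client("ssm").start_automation_execution(
#     DocumentName="AWSSupport-SetupIPMonitoringFromVPC", Parameters=params)
```

Optional parameters that are left out of the map fall back to the defaults listed above.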
Required IAM permissions
The AutomationAssumeRole parameter requires the following actions to
use the runbook successfully.
Warning
It is recommended to pass the TestInstanceProfileName parameter or to
ensure security guardrails are in place to prevent misuse of mutable IAM
permissions.
It is recommended that the user who runs the automation have the AmazonSSMAutomationRole IAM managed policy attached. In addition, the user must have the following policy attached to their user account, group, or role:
If the TestInstanceProfileName parameter is provided, the following
IAM permissions are not required to execute the runbook:
-
iam:CreateRole
-
iam:CreateInstanceProfile
-
iam:DetachRolePolicy
-
iam:AttachRolePolicy
-
iam:AddRoleToInstanceProfile
-
iam:RemoveRoleFromInstanceProfile
-
iam:DeleteRole
-
iam:DeleteRolePolicy
-
iam:DeleteInstanceProfile
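If you create the instance profile yourself and pass it as TestInstanceProfileName, its role needs the CloudWatch Logs actions listed in that parameter's description, plus the AmazonSSMManagedInstanceCore managed policy. A minimal sketch of such a policy document, built in Python, might look like the following. The Resource value of "*" is an assumption made for brevity; in practice you may want to scope it to the log groups under your CloudWatchLogGroupNamePrefix.

```python
import json

# Actions taken from the TestInstanceProfileName parameter description.
LOGS_ACTIONS = [
    "logs:CreateLogStream",
    "logs:DescribeLogGroups",
    "logs:DescribeLogStreams",
    "logs:PutLogEvents",
]

instance_role_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": LOGS_ACTIONS,
            "Resource": "*",  # assumption: narrow this to your log groups
        }
    ],
}

# Serialize the policy so it can be pasted into the IAM console or a template.
policy_json = json.dumps(instance_role_policy, indent=2)
print(policy_json)
```

The AmazonSSMManagedInstanceCore managed policy is attached to the same role separately; it is not part of this document.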
Document Steps
-
aws:executeAwsApi - describe the provided subnet to get the VPC ID and IPv6 CIDR block association state.
-
aws:executeScript - validate the provided target IPs are syntactically correct IPv4 and/or IPv6 addresses, get the architecture of the selected instance type, and verify the subnet has an IPv6 pool association if any target IP is IPv6.
-
aws:createStack - create an Amazon CloudFormation stack that provisions the test Amazon EC2 instance, IAM instance profile (if not provided), security group (if not provided), CloudWatch log groups, and CloudWatch dashboard.
(Cleanup) If the step fails:
aws:executeScript - describe the Amazon CloudFormation stack events to identify the failure reason.
aws:deleteStack - delete the Amazon CloudFormation stack and all associated resources.
-
aws:waitForAwsResourceProperty - wait for the Amazon CloudFormation stack to complete creation.
(Cleanup) If the step fails:
aws:executeScript - describe the Amazon CloudFormation stack events to identify the failure reason.
aws:deleteStack - delete the Amazon CloudFormation stack and all associated resources.
-
aws:executeScript - describe the Amazon CloudFormation stack resources to get the test instance ID, security group ID, IAM role, instance profile, and dashboard name.
(Cleanup) If the step fails:
aws:executeScript - describe the Amazon CloudFormation stack events to identify the failure reason.
aws:deleteStack - delete the Amazon CloudFormation stack and all associated resources.
-
aws:waitForAwsResourceProperty - wait for the test instance to become a managed instance.
(Cleanup) If the step fails:
aws:deleteStack - delete the Amazon CloudFormation stack and all associated resources.
-
aws:runCommand - install the CloudWatch agent on the test instance.
(Cleanup) If the step fails:
aws:deleteStack - delete the Amazon CloudFormation stack and all associated resources.
-
aws:runCommand - define the network test scripts (MTR, ping, tracepath, and traceroute) for each of the provided IPs.
(Cleanup) If the step fails:
aws:deleteStack - delete the Amazon CloudFormation stack and all associated resources.
-
aws:runCommand - start the network tests and schedule subsequent executions using cronjobs that run every TestInterval minutes.
(Cleanup) If the step fails:
aws:deleteStack - delete the Amazon CloudFormation stack and all associated resources.
-
aws:runCommand - configure the CloudWatch agent to push test results from /home/ec2-user/logs/ to CloudWatch Logs.
(Cleanup) If the step fails:
aws:deleteStack - delete the Amazon CloudFormation stack and all associated resources.
-
aws:runCommand - configure log rotation for the test results in /home/ec2-user/logs/.
-
aws:executeScript - set the retention policy for all CloudWatch log groups created by the Amazon CloudFormation stack.
-
aws:executeScript - create CloudWatch log group metric filters for ping latency and ping packet loss.
(Cleanup) If the step fails:
aws:deleteStack - delete the Amazon CloudFormation stack and all associated resources.
-
aws:executeScript - update the CloudWatch dashboard to include widgets for ping latency and ping packet loss statistics.
(Cleanup) If the step fails:
aws:executeAwsApi - delete the CloudWatch dashboard, if it exists.
aws:deleteStack - delete the Amazon CloudFormation stack and all associated resources.
-
aws:branch - evaluate the SleepTime parameter. If set to 0, the automation ends without deleting the stack.
-
aws:sleep - wait for the specified SleepTime duration before deleting the Amazon CloudFormation stack.
-
aws:deleteStack - delete the Amazon CloudFormation stack. Based on the RetainDashboardAndLogsOnDeletion parameter, the CloudWatch dashboard and log groups are either retained or deleted.
(Cleanup) If the stack deletion fails:
aws:executeScript - describe the Amazon CloudFormation stack events to identify the deletion failure reason.
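The target validation performed early in these steps can be approximated locally before you start the automation, which avoids a failed run and rollback. The sketch below is not the runbook's own script. It uses Python's standard ipaddress module to check that each target is a syntactically valid IPv4 or IPv6 address, and it rejects IPv6 targets when the subnet has no IPv6 CIDR block association. The function name check_targets, the subnet_has_ipv6 flag, and the sample addresses are hypothetical.

```python
import ipaddress


def check_targets(target_ips, subnet_has_ipv6):
    """Validate a comma-separated TargetIPs string.

    Returns a tuple (ipv4_list, ipv6_list). Raises ValueError when an
    address is malformed or when IPv6 targets are given for a subnet
    without IPv6.
    """
    ipv4, ipv6 = [], []
    for raw in target_ips.split(","):
        # ip_address raises ValueError for anything that is not a valid
        # IPv4 or IPv6 address, including values with stray spaces.
        addr = ipaddress.ip_address(raw)
        if addr.version == 4:
            ipv4.append(str(addr))
        else:
            ipv6.append(str(addr))
    if ipv6 and not subnet_has_ipv6:
        raise ValueError("IPv6 targets require a subnet with an IPv6 CIDR block")
    return ipv4, ipv6


v4, v6 = check_targets("192.0.2.10,2001:db8::1", subnet_has_ipv6=True)
print(v4, v6)
```

In a real check, subnet_has_ipv6 would come from describing the subnet, as the first step of the runbook does.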
Outputs
updateCloudWatchDashboard.StackUrl - the URL of the Amazon CloudFormation stack.
updateCloudWatchDashboard.DashboardUrl - the URL of the CloudWatch dashboard.
updateCloudWatchDashboard.DashboardName - the name of the CloudWatch dashboard.
updateCloudWatchDashboard.LogGroups - the list of CloudWatch log groups created.
describeStackResources.HelperInstanceId - the test instance ID.
describeStackResources.StackName - the Amazon CloudFormation stack name.
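Once the automation has run, the outputs above can be read from the execution record. The sketch below assumes the output map has the shape returned by the Systems Manager GetAutomationExecution API, where each key is stepName.OutputName and each value is a list of strings. The helper name read_output is hypothetical, and the sample values are placeholders rather than real output.

```python
def read_output(outputs, key):
    """Return the first value for an output key, or None if it is absent."""
    values = outputs.get(key) or []
    return values[0] if values else None


# Placeholder data standing in for
# response["AutomationExecution"]["Outputs"] from get_automation_execution.
sample_outputs = {
    "updateCloudWatchDashboard.DashboardName": ["example-dashboard"],
    "describeStackResources.HelperInstanceId": ["i-0123456789abcdef0"],
}

dashboard = read_output(sample_outputs, "updateCloudWatchDashboard.DashboardName")
instance_id = read_output(sample_outputs, "describeStackResources.HelperInstanceId")
missing = read_output(sample_outputs, "describeStackResources.StackName")
print(dashboard, instance_id, missing)
```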