


# Automate application-consistent snapshots with Data Lifecycle Manager
<a name="automate-app-consistent-backups"></a>

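You can automate application-consistent snapshots with Amazon Data Lifecycle Manager by enabling pre and post scripts in your snapshot lifecycle policies that target instances.

Amazon Data Lifecycle Manager integrates with Amazon Systems Manager (Systems Manager) to support application-consistent snapshots. Amazon Data Lifecycle Manager uses Systems Manager (SSM) command documents that include pre and post scripts to automate the actions needed to complete application-consistent snapshots. Before Amazon Data Lifecycle Manager initiates snapshot creation, it runs the commands in the pre script to freeze and flush I/O. After Amazon Data Lifecycle Manager initiates snapshot creation, it runs the commands in the post script to thaw I/O.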

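Using Amazon Data Lifecycle Manager, you can automate application-consistent snapshots of the following:
+ Windows applications that use Volume Shadow Copy Service (VSS)
+ SAP HANA, using an Amazon managed SSM document. For more information, see [Amazon EBS snapshots for SAP HANA](https://docs.amazonaws.cn/sap/latest/sap-hana/ebs-sap-hana.html).
+ Self-managed databases, such as MySQL, PostgreSQL, or InterSystems IRIS, using SSM document templates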

**Topics**
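+ [Requirements for using pre and post scripts](#app-consistent-prereqs)
+ [Getting started with application-consistent snapshots](#app-consistent-get-started)
+ [Considerations for VSS Backups with Amazon Data Lifecycle Manager](#app-consistent-vss)
+ [Shared responsibility for application-consistent snapshots](#shared-responsibility)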

## Requirements for using pre and post scripts
<a name="app-consistent-prereqs"></a>

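The following table outlines the requirements for using pre and post scripts with Amazon Data Lifecycle Manager.


| Requirement | VSS Backup (application-consistent) | Custom SSM document (application-consistent) | Other use cases | 
| --- |--- |--- |--- |
| SSM Agent installed and running on target instances | ✓ | ✓ | ✓ | 
| VSS system requirements met on target instances | ✓ |  |  | 
| VSS-enabled instance profile associated with target instances | ✓ |  |  | 
| VSS components installed on target instances | ✓ |  |  | 
| SSM document prepared with pre and post script commands |  | ✓ | ✓ | 
| Amazon Data Lifecycle Manager IAM role prepared to run pre and post scripts | ✓ | ✓ | ✓ | 
| Snapshot policy created that targets instances and is configured for pre and post scripts | ✓ | ✓ | ✓ | 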

## Getting started with application-consistent snapshots
<a name="app-consistent-get-started"></a>

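This section explains the steps you need to follow to automate application-consistent snapshots using Amazon Data Lifecycle Manager.

### Step 1: Prepare target instances
<a name="prep-instances"></a>

You need to prepare your target instances for application-consistent snapshots using Amazon Data Lifecycle Manager. Do one of the following, depending on your use case.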

------
#### [ Prepare for VSS Backups ]

**To prepare your target instances for VSS backups**

1. Install SSM Agent on your target instances, if it is not already installed. If SSM Agent is already installed on your target instances, skip this step.

   For more information, see [Working with SSM Agent on EC2 instances for Windows Server](https://docs.amazonaws.cn/systems-manager/latest/userguide/ssm-agent-windows.html).

1. Ensure that SSM Agent is running. For more information, see [Checking SSM Agent status and starting the agent](https://docs.amazonaws.cn/systems-manager/latest/userguide/ssm-agent-status-and-restart.html).

1. Set up Systems Manager for Amazon EC2 instances. For more information, see [Setting up Systems Manager for Amazon EC2 instances](https://docs.amazonaws.cn/systems-manager/latest/userguide/systems-manager-setting-up-ec2.html) in the *Amazon Systems Manager User Guide*.

1. [Ensure that the system requirements for VSS backups are met](https://docs.amazonaws.cn/AWSEC2/latest/UserGuide/application-consistent-snapshots-prereqs.html).

1. [Attach a VSS-enabled instance profile to the target instances](https://docs.amazonaws.cn/AWSEC2/latest/UserGuide/vss-iam-reqs.html).

1. [Install the VSS components](https://docs.amazonaws.cn/AWSEC2/latest/UserGuide/application-consistent-snapshots-getting-started.html).

------
#### [ Prepare for SAP HANA backups ]

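**To prepare your target instances for SAP HANA backups**

1. Prepare the SAP HANA environment on your target instances.

   1. Set up your instance with SAP HANA. If you don't already have an existing SAP HANA environment, you can refer to [SAP HANA Environment Setup on Amazon](https://docs.amazonaws.cn/sap/latest/sap-hana/std-sap-hana-environment-setup.html).

   1. Log in to the SystemDB as a suitable administrator user.

   1. Create a database backup user to use with Amazon Data Lifecycle Manager.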

      ```
      CREATE USER username PASSWORD password NO FORCE_FIRST_PASSWORD_CHANGE;
      ```

      For example, the following command creates a user named `dlm_user` with the password `password`.

      ```
      CREATE USER dlm_user PASSWORD password NO FORCE_FIRST_PASSWORD_CHANGE;
      ```

   1. Assign the `BACKUP OPERATOR` role to the database backup user that you created in the previous step.

      ```
      GRANT BACKUP OPERATOR TO username
      ```

      For example, the following command assigns the role to the user named `dlm_user`.

      ```
      GRANT BACKUP OPERATOR TO dlm_user
      ```

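   1. Log in to the operating system as the administrator, for example `sidadm`.

   1. Create an `hdbuserstore` entry to store the connection information, so that the SAP HANA SSM document can connect to SAP HANA without users having to enter the information.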

      ```
      hdbuserstore set DLM_HANADB_SNAPSHOT_USER localhost:3<hana_instance_number>13 username password
      ```

      For example:

      ```
      hdbuserstore set DLM_HANADB_SNAPSHOT_USER localhost:30013 dlm_user password
      ```

   1. Test the connection.

      ```
      hdbsql -U DLM_HANADB_SNAPSHOT_USER "select * from dummy"
      ```

1. Install SSM Agent on your target instances, if it is not already installed. If SSM Agent is already installed on your target instances, skip this step.

   For more information, see [Manually installing SSM Agent on EC2 instances for Linux](https://docs.amazonaws.cn/systems-manager/latest/userguide/manually-install-ssm-agent-linux.html).

1. Ensure that SSM Agent is running. For more information, see [Checking SSM Agent status and starting the agent](https://docs.amazonaws.cn/systems-manager/latest/userguide/ssm-agent-status-and-restart.html).

1. Set up Systems Manager for Amazon EC2 instances. For more information, see [Setting up Systems Manager for Amazon EC2 instances](https://docs.amazonaws.cn/systems-manager/latest/userguide/systems-manager-setting-up-ec2.html) in the *Amazon Systems Manager User Guide*.
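
The port in the `hdbuserstore` step is built from the digit `3`, the two-digit SAP HANA instance number, and the suffix `13`, so instance number `00` gives `30013`. As a minimal sketch, the helper below assembles that port. The function name `hana_systemdb_port` is hypothetical and is not part of SAP HANA or Amazon tooling.

```shell
#!/bin/bash
# Hypothetical helper (illustration only): build the port used in the
# hdbuserstore example from a two-digit SAP HANA instance number.
# Pattern: 3<NN>13
hana_systemdb_port() {
    case "$1" in
        [0-9][0-9])
            echo "3${1}13"
            ;;
        *)
            # Anything other than exactly two digits is rejected.
            echo "ERROR: instance number must be exactly two digits, got '$1'" >&2
            return 1
            ;;
    esac
}

# Instance number 00 corresponds to the port shown in the example above.
hana_systemdb_port 00
```

You could then pass the result to the `hdbuserstore set` command in place of a hand-typed port, for example `localhost:$(hana_systemdb_port 00)`.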

------
#### [ Prepare for custom SSM documents ]

**To prepare your target instances for custom SSM documents**

1. Install SSM Agent on your target instances, if it is not already installed. If SSM Agent is already installed on your target instances, skip this step.
   + (Linux instances) [Manually installing SSM Agent on EC2 instances for Linux](https://docs.amazonaws.cn/systems-manager/latest/userguide/manually-install-ssm-agent-linux.html)
   + (Windows instances) [Manually installing SSM Agent on EC2 instances for Windows](https://docs.amazonaws.cn/systems-manager/latest/userguide/ssm-agent-windows.html)

1. Ensure that SSM Agent is running. For more information, see [Checking SSM Agent status and starting the agent](https://docs.amazonaws.cn/systems-manager/latest/userguide/ssm-agent-status-and-restart.html).

1. Set up Systems Manager for Amazon EC2 instances. For more information, see [Setting up Systems Manager for Amazon EC2 instances](https://docs.amazonaws.cn/systems-manager/latest/userguide/systems-manager-setting-up-ec2.html) in the *Amazon Systems Manager User Guide*.

------

### Step 2: Prepare SSM document
<a name="prep-ssm-doc"></a>

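**Note**  
This step is required only for custom SSM documents. It is not required for VSS Backup or SAP HANA. For VSS Backups and SAP HANA, Amazon Data Lifecycle Manager uses the Amazon managed SSM document.

If you are automating application-consistent snapshots for a self-managed database, such as MySQL, PostgreSQL, or InterSystems IRIS, you must create an SSM command document that includes a pre script to freeze and flush I/O before snapshot creation is initiated, and a post script to thaw I/O after snapshot creation is initiated.

If your MySQL, PostgreSQL, or InterSystems IRIS database uses a standard configuration, you can create an SSM command document using the sample SSM document content below. If your MySQL, PostgreSQL, or InterSystems IRIS database uses a non-standard configuration, you can use the sample content below as a starting point for your SSM command document and then customize it to meet your requirements. Alternatively, if you want to create a new SSM document from scratch, you can use the empty SSM document template below and add your pre and post commands in the appropriate document sections.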

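**Note the following:**  
+ It is your responsibility to ensure that the SSM document performs the correct and required actions for your database configuration.
+ Snapshots are guaranteed to be application-consistent only if the pre and post scripts in your SSM document can successfully freeze, flush, and thaw I/O.
+ The SSM document must include the required fields for `allowedValues`, including `pre-script`, `post-script`, and `dry-run`. Amazon Data Lifecycle Manager runs commands on your instance based on the contents of those sections. If your SSM document does not have those sections, Amazon Data Lifecycle Manager treats it as a failed execution.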
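
Because a missing `allowedValues` entry causes the execution to be treated as failed, it can help to check a document file before you register it. The following is a minimal sketch of such a pre-flight check. The function name `check_allowed_values` is hypothetical, and the check is a plain text match on YAML list items (as used in the sample documents), not a full YAML parse.

```shell
#!/bin/bash
# Hypothetical pre-flight check (illustration only): confirm that an SSM
# document file lists the three command values named above as YAML list
# items, for example "    - pre-script".
check_allowed_values() {
    doc_file="$1"
    missing=0
    for value in pre-script post-script dry-run; do
        # Match a line that contains only "- <value>" plus optional whitespace.
        if ! grep -Eq "^[[:space:]]*-[[:space:]]*${value}[[:space:]]*$" "$doc_file"; then
            echo "MISSING: ${value}"
            missing=1
        fi
    done
    # Return 0 when all three values were found, 1 otherwise.
    return "$missing"
}
```

For example, `check_allowed_values my-ssm-document.yaml && echo "allowedValues look complete"`, where `my-ssm-document.yaml` is a placeholder file name.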

------
#### [ MySQL sample document content ]

```
###===============================================================================###
# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.

# Permission is hereby granted, free of charge, to any person obtaining a copy of this
# software and associated documentation files (the "Software"), to deal in the Software
# without restriction, including without limitation the rights to use, copy, modify,
# merge, publish, distribute, sublicense, and/or sell copies of the Software, and to
# permit persons to whom the Software is furnished to do so.

# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
# INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
# PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
###===============================================================================###
schemaVersion: '2.2'
description: Amazon Data Lifecycle Manager Pre/Post script for MySQL databases
parameters:
  executionId:
    type: String
    default: None
    description: (Required) Specifies the unique identifier associated with a pre and/or post execution
    allowedPattern: ^(None|[a-fA-F0-9]{8}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{12})$
  command:
  # Data Lifecycle Manager will trigger the pre-script and post-script actions during policy execution. 
  # 'dry-run' option is intended for validating the document execution without triggering any commands
  # on the instance. The following allowedValues will allow Data Lifecycle Manager to successfully 
  # trigger pre and post script actions.
    type: String
    default: 'dry-run'
    description: (Required) Specifies whether pre-script and/or post-script should be executed.
    allowedValues:
    - pre-script
    - post-script
    - dry-run

mainSteps:
- action: aws:runShellScript
  description: Run MySQL Database freeze/thaw commands
  name: run_pre_post_scripts
  precondition:
    StringEquals:
    - platformType
    - Linux
  inputs:
    runCommand:
    - |
      #!/bin/bash

      ###===============================================================================###
      ### Error Codes
      ###===============================================================================###
      # The following Error codes will inform Data Lifecycle Manager of the type of error 
      # and help guide handling of the error. 
      # The Error code will also be emitted via AWS Eventbridge events in the 'cause' field.
      # 1 Pre-script failed during execution - 201
      # 2 Post-script failed during execution - 202
      # 3 Auto thaw occurred before post-script was initiated - 203
      # 4 Pre-script initiated while post-script was expected - 204
      # 5 Post-script initiated while pre-script was expected - 205
      # 6 Application not ready for pre or post-script initiation - 206

      ###=================================================================###
      ### Global variables
      ###=================================================================###
      START=$(date +%s)
      # For testing this script locally, replace the below with OPERATION=$1.
      OPERATION={{ command }}
      FS_ALREADY_FROZEN_ERROR='freeze failed: Device or resource busy'
      FS_ALREADY_THAWED_ERROR='unfreeze failed: Invalid argument'
      FS_BUSY_ERROR='mount point is busy'

      # Auto thaw is a fail safe mechanism to automatically unfreeze the application after the 
      # duration specified in the global variable below. Choose the duration based on your
      # database application's tolerance to freeze.
      export AUTO_THAW_DURATION_SECS="60"

      # Add all pre-script actions to be performed within the function below
      execute_pre_script() {
          echo "INFO: Start execution of pre-script"
          # Check if filesystem is already frozen. No error code indicates that filesystem 
          # is not currently frozen and that the pre-script can proceed with freezing the filesystem.
          check_fs_freeze
          # Execute the DB commands to flush the DB in preparation for snapshot
          snap_db
          # Freeze the filesystem. No error code indicates that filesystem was successfully frozen
          freeze_fs

          echo "INFO: Schedule Auto Thaw to execute in ${AUTO_THAW_DURATION_SECS} seconds."
          $(nohup bash -c execute_schedule_auto_thaw  >/dev/null 2>&1 &)
      }

      # Add all post-script actions to be performed within the function below
      execute_post_script() {
          echo "INFO: Start execution of post-script"
          # Unfreeze the filesystem. No error code indicates that filesystem was successfully unfrozen.
          unfreeze_fs
          thaw_db
      }

      # Execute Auto Thaw to automatically unfreeze the application after the duration configured 
      # in the AUTO_THAW_DURATION_SECS global variable.
      execute_schedule_auto_thaw() {
          sleep ${AUTO_THAW_DURATION_SECS}
          execute_post_script
      }

      # Disable Auto Thaw if it is still enabled
      execute_disable_auto_thaw() {
          echo "INFO: Attempting to disable auto thaw if enabled"
          auto_thaw_pgid=$(pgrep -f execute_schedule_auto_thaw | xargs -i ps -hp {} -o pgid)
          if [ -n "${auto_thaw_pgid}" ]; then
              echo "INFO: execute_schedule_auto_thaw process found with pgid ${auto_thaw_pgid}"
              sudo pkill -g ${auto_thaw_pgid}
              rc=$?
              if [ ${rc} != 0 ]; then
                  echo "ERROR: Unable to kill execute_schedule_auto_thaw process. retval=${rc}"
              else
                  echo "INFO: Auto Thaw  has been disabled"
              fi
          fi
      }

      # Iterate over all the mountpoints and check if filesystem is already in freeze state.
      # Return error code 204 if any of the mount points are already frozen.
      check_fs_freeze() {
          for target in $(lsblk -nlo MOUNTPOINTS)
          do
              # Freeze of the root and boot filesystems is dangerous and pre-script does not freeze these filesystems.
              # Hence, we will skip the root and boot mountpoints while checking if filesystem is in freeze state.
              if [ $target == '/' ]; then continue; fi
              if [[ "$target" == *"/boot"* ]]; then continue; fi

              error_message=$(sudo mount -o remount,noatime $target 2>&1)
              # Remount will be a no-op without a error message if the filesystem is unfrozen.
              # However, if filesystem is already frozen, remount will fail with busy error message.
              if [ $? -ne 0 ];then
                  # If the filesystem is already in frozen, return error code 204
                  if [[ "$error_message" == *"$FS_BUSY_ERROR"* ]];then
                      echo "ERROR: Filesystem ${target} already frozen. Return Error Code: 204"
                      exit 204
                  fi
                  # If the check filesystem freeze failed due to any reason other than the filesystem already frozen, return 201
                  echo "ERROR: Failed to check_fs_freeze on mountpoint $target due to error - $error_message"
                  exit 201
              fi
          done
      } 

      # Iterate over all the mountpoints and freeze the filesystem.
      freeze_fs() {
          for target in $(lsblk -nlo MOUNTPOINTS)
          do
              # Freeze of the root and boot filesystems is dangerous. Hence, skip filesystem freeze 
              # operations for root and boot mountpoints.
              if [ $target == '/' ]; then continue; fi
              if [[ "$target" == *"/boot"* ]]; then continue; fi
              echo "INFO: Freezing $target"
              error_message=$(sudo fsfreeze -f $target 2>&1)
              if [ $? -ne 0 ];then
                  # If the filesystem is already in frozen, return error code 204
                  if [[ "$error_message" == *"$FS_ALREADY_FROZEN_ERROR"* ]]; then
                      echo "ERROR: Filesystem ${target} already frozen. Return Error Code: 204"
                      sudo mysql -e 'UNLOCK TABLES;'
                      exit 204
                  fi
                  # If the filesystem freeze failed due to any reason other than the filesystem already frozen, return 201
                  echo "ERROR: Failed to freeze mountpoint $target due to error - $error_message"
                  thaw_db
                  exit 201
              fi
              echo "INFO: Freezing complete on $target"
          done
      }

      # Iterate over all the mountpoints and unfreeze the filesystem.
      unfreeze_fs() {
          for target in $(lsblk -nlo MOUNTPOINTS)
          do
              # Freeze of the root and boot filesystems is dangerous and pre-script does not freeze these filesystems.
              # Hence, will skip the root and boot mountpoints during unfreeze as well.
              if [ $target == '/' ]; then continue; fi
              if [[ "$target" == *"/boot"* ]]; then continue; fi
              echo "INFO: Thawing $target"
              error_message=$(sudo fsfreeze -u $target 2>&1)
              # Check if filesystem is already unfrozen (thawed). Return error code 205 if filesystem is already unfrozen.
              if [ $? -ne 0 ]; then
                  if [[ "$error_message" == *"$FS_ALREADY_THAWED_ERROR"* ]]; then
                      echo "ERROR: Filesystem ${target} is already in thaw state. Return Error Code: 205"
                      exit 205
                  fi
                  # If the filesystem unfreeze failed due to any reason other than the filesystem already unfrozen, return 202
                  echo "ERROR: Failed to unfreeze mountpoint $target due to error - $error_message"
                  exit 202
              fi
              echo "INFO: Thaw complete on $target"
          done    
      }

      snap_db() {
          # Run the flush command only when MySQL DB service is up and running
          sudo systemctl is-active --quiet mysqld.service
          if [ $? -eq 0 ]; then
              echo "INFO: Execute MySQL Flush and Lock command."
              sudo mysql -e 'FLUSH TABLES WITH READ LOCK;'
              # If the MySQL Flush and Lock command did not succeed, return error code 201 to indicate pre-script failure
              if [ $? -ne 0 ]; then
                  echo "ERROR: MySQL FLUSH TABLES WITH READ LOCK command failed."
                  exit 201
              fi
              sync
          else 
              echo "INFO: MySQL service is inactive. Skipping execution of MySQL Flush and Lock command."
          fi
      }

      thaw_db() {
          # Run the unlock command only when MySQL DB service is up and running
          sudo systemctl is-active --quiet mysqld.service
          if [ $? -eq 0 ]; then
              echo "INFO: Execute MySQL Unlock"
              sudo mysql -e 'UNLOCK TABLES;'
          else 
              echo "INFO: MySQL service is inactive. Skipping execution of MySQL Unlock command."
          fi
      }

      export -f execute_schedule_auto_thaw
      export -f execute_post_script
      export -f unfreeze_fs
      export -f thaw_db

      # Debug logging for parameters passed to the SSM document
      echo "INFO: ${OPERATION} starting at $(date) with executionId: {{ executionId }}"

      # Based on the command parameter value execute the function that supports 
      # pre-script/post-script operation
      case ${OPERATION} in
          pre-script)
              execute_pre_script
              ;;
          post-script)
              execute_post_script
              execute_disable_auto_thaw
              ;;
          dry-run)
              echo "INFO: dry-run option invoked - taking no action"
              ;;
          *)
              echo "ERROR: Invalid command parameter passed. Please use either pre-script, post-script, dry-run."
              exit 1 # return failure
              ;;
      esac

      END=$(date +%s)
      # Debug Log for profiling the script time
      echo "INFO: ${OPERATION} completed at $(date). Total runtime: $((${END} - ${START})) seconds."
```
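
The "Error Codes" comment in the sample document lists the exit codes that the script reports to Amazon Data Lifecycle Manager. When you test the script by hand, a small lookup can make those codes easier to read. The following is a minimal sketch. The function name `describe_exit_code` is hypothetical, and treating `0` as success is a conventional assumption rather than something the sample states.

```shell
#!/bin/bash
# Hypothetical helper (illustration only): translate the exit codes listed
# in the sample document's "Error Codes" comment into readable messages.
describe_exit_code() {
    case "$1" in
        0)   echo "Success" ;;  # assumption: conventional shell success code
        201) echo "Pre-script failed during execution" ;;
        202) echo "Post-script failed during execution" ;;
        203) echo "Auto thaw occurred before post-script was initiated" ;;
        204) echo "Pre-script initiated while post-script was expected" ;;
        205) echo "Post-script initiated while pre-script was expected" ;;
        206) echo "Application not ready for pre or post-script initiation" ;;
        *)   echo "Unrecognized exit code: $1" ;;
    esac
}

# Example: explain the code returned when a filesystem is already frozen.
describe_exit_code 204
```

After running the script locally, you could call `describe_exit_code $?` to print the meaning of the last exit status.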

------
#### [ PostgreSQL sample document content ]

```
###===============================================================================###
# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.

# Permission is hereby granted, free of charge, to any person obtaining a copy of this
# software and associated documentation files (the "Software"), to deal in the Software
# without restriction, including without limitation the rights to use, copy, modify,
# merge, publish, distribute, sublicense, and/or sell copies of the Software, and to
# permit persons to whom the Software is furnished to do so.

# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
# INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
# PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
###===============================================================================###
schemaVersion: '2.2'
description: Amazon Data Lifecycle Manager Pre/Post script for PostgreSQL databases
parameters:
  executionId:
    type: String
    default: None
    description: (Required) Specifies the unique identifier associated with a pre and/or post execution
    allowedPattern: ^(None|[a-fA-F0-9]{8}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{12})$
  command:
  # Data Lifecycle Manager will trigger the pre-script and post-script actions during policy execution. 
  # 'dry-run' option is intended for validating the document execution without triggering any commands
  # on the instance. The following allowedValues will allow Data Lifecycle Manager to successfully 
  # trigger pre and post script actions.
    type: String
    default: 'dry-run'
    description: (Required) Specifies whether pre-script and/or post-script should be executed.
    allowedValues:
    - pre-script
    - post-script
    - dry-run

mainSteps:
- action: aws:runShellScript
  description: Run PostgreSQL Database freeze/thaw commands
  name: run_pre_post_scripts
  precondition:
    StringEquals:
    - platformType
    - Linux
  inputs:
    runCommand:
    - |
      #!/bin/bash

      ###===============================================================================###
      ### Error Codes
      ###===============================================================================###
      # The following Error codes will inform Data Lifecycle Manager of the type of error 
      # and help guide handling of the error. 
      # The Error code will also be emitted via AWS Eventbridge events in the 'cause' field.
      # 1 Pre-script failed during execution - 201
      # 2 Post-script failed during execution - 202
      # 3 Auto thaw occurred before post-script was initiated - 203
      # 4 Pre-script initiated while post-script was expected - 204
      # 5 Post-script initiated while pre-script was expected - 205
      # 6 Application not ready for pre or post-script initiation - 206

      ###===============================================================================###
      ### Global variables
      ###===============================================================================###
      START=$(date +%s)
      OPERATION={{ command }}
      FS_ALREADY_FROZEN_ERROR='freeze failed: Device or resource busy'
      FS_ALREADY_THAWED_ERROR='unfreeze failed: Invalid argument'
      FS_BUSY_ERROR='mount point is busy'

      # Auto thaw is a fail safe mechanism to automatically unfreeze the application after the 
      # duration specified in the global variable below. Choose the duration based on your
      # database application's tolerance to freeze.
      export AUTO_THAW_DURATION_SECS="60"

      # Add all pre-script actions to be performed within the function below
      execute_pre_script() {
          echo "INFO: Start execution of pre-script"
          # Check if filesystem is already frozen. No error code indicates that filesystem 
          # is not currently frozen and that the pre-script can proceed with freezing the filesystem.
          check_fs_freeze
          # Execute the DB commands to flush the DB in preparation for snapshot
          snap_db
          # Freeze the filesystem. No error code indicates that filesystem was successfully frozen
          freeze_fs

          echo "INFO: Schedule Auto Thaw to execute in ${AUTO_THAW_DURATION_SECS} seconds."
          $(nohup bash -c execute_schedule_auto_thaw  >/dev/null 2>&1 &)
      }

      # Add all post-script actions to be performed within the function below
      execute_post_script() {
          echo "INFO: Start execution of post-script"
          # Unfreeze the filesystem. No error code indicates that filesystem was successfully unfrozen
          unfreeze_fs
      }

      # Execute Auto Thaw to automatically unfreeze the application after the duration configured 
      # in the AUTO_THAW_DURATION_SECS global variable.
      execute_schedule_auto_thaw() {
          sleep ${AUTO_THAW_DURATION_SECS}
          execute_post_script
      }

      # Disable Auto Thaw if it is still enabled
      execute_disable_auto_thaw() {
          echo "INFO: Attempting to disable auto thaw if enabled"
          auto_thaw_pgid=$(pgrep -f execute_schedule_auto_thaw | xargs -i ps -hp {} -o pgid)
          if [ -n "${auto_thaw_pgid}" ]; then
              echo "INFO: execute_schedule_auto_thaw process found with pgid ${auto_thaw_pgid}"
              sudo pkill -g ${auto_thaw_pgid}
              rc=$?
              if [ ${rc} != 0 ]; then
                  echo "ERROR: Unable to kill execute_schedule_auto_thaw process. retval=${rc}"
              else
                  echo "INFO: Auto Thaw  has been disabled"
              fi
          fi
      }

      # Iterate over all the mountpoints and check if filesystem is already in freeze state.
      # Return error code 204 if any of the mount points are already frozen.
      check_fs_freeze() {
          for target in $(lsblk -nlo MOUNTPOINTS)
          do
              # Freeze of the root and boot filesystems is dangerous and pre-script does not freeze these filesystems.
              # Hence, we will skip the root and boot mountpoints while checking if filesystem is in freeze state.
              if [ $target == '/' ]; then continue; fi
              if [[ "$target" == *"/boot"* ]]; then continue; fi

              error_message=$(sudo mount -o remount,noatime $target 2>&1)
              # Remount will be a no-op without a error message if the filesystem is unfrozen.
              # However, if filesystem is already frozen, remount will fail with busy error message.
              if [ $? -ne 0 ];then
                  # If the filesystem is already in frozen, return error code 204
                  if [[ "$error_message" == *"$FS_BUSY_ERROR"* ]];then
                      echo "ERROR: Filesystem ${target} already frozen. Return Error Code: 204"
                      exit 204
                  fi
                  # If the check filesystem freeze failed due to any reason other than the filesystem already frozen, return 201
                  echo "ERROR: Failed to check_fs_freeze on mountpoint $target due to error - $error_message"
                  exit 201
              fi
          done
      } 

      # Iterate over all the mountpoints and freeze the filesystem.
      freeze_fs() {
          for target in $(lsblk -nlo MOUNTPOINTS)
          do
              # Freeze of the root and boot filesystems is dangerous. Hence, skip filesystem freeze 
              # operations for root and boot mountpoints.
              if [ $target == '/' ]; then continue; fi
              if [[ "$target" == *"/boot"* ]]; then continue; fi
              echo "INFO: Freezing $target"
              error_message=$(sudo fsfreeze -f $target 2>&1)
              if [ $? -ne 0 ];then
                  # If the filesystem is already in frozen, return error code 204
                  if [[ "$error_message" == *"$FS_ALREADY_FROZEN_ERROR"* ]]; then
                      echo "ERROR: Filesystem ${target} already frozen. Return Error Code: 204"
                      exit 204
                  fi
                  # If the filesystem freeze failed due to any reason other than the filesystem already frozen, return 201
                  echo "ERROR: Failed to freeze mountpoint $target due to error - $error_message"
                  exit 201
              fi
              echo "INFO: Freezing complete on $target"
          done
      }

      # Iterate over all the mountpoints and unfreeze the filesystem.
      unfreeze_fs() {
          for target in $(lsblk -nlo MOUNTPOINTS)
          do
              # Freeze of the root and boot filesystems is dangerous and pre-script does not freeze these filesystems.
              # Hence, will skip the root and boot mountpoints during unfreeze as well.
              if [ $target == '/' ]; then continue; fi
              if [[ "$target" == *"/boot"* ]]; then continue; fi
              echo "INFO: Thawing $target"
              error_message=$(sudo fsfreeze -u $target 2>&1)
              # Check if filesystem is already unfrozen (thawed). Return error code 205 if filesystem is already unfrozen.
              if [ $? -ne 0 ]; then
                  if [[ "$error_message" == *"$FS_ALREADY_THAWED_ERROR"* ]]; then
                      echo "ERROR: Filesystem ${target} is already in thaw state. Return Error Code: 205"
                      exit 205
                  fi
                  # If the filesystem unfreeze failed due to any reason other than the filesystem already unfrozen, return 202
                  echo "ERROR: Failed to unfreeze mountpoint $target due to error - $error_message"
                  exit 202
              fi
              echo "INFO: Thaw complete on $target"
          done
      }

      snap_db() {
          # Run the flush command only when PostgreSQL DB service is up and running
          sudo systemctl is-active --quiet postgresql
          if [ $? -eq 0 ]; then
              echo "INFO: Execute Postgres CHECKPOINT"
              # PostgreSQL command to flush the transactions in memory to disk
              sudo -u postgres psql -c 'CHECKPOINT;'
              # If the PostgreSQL Command did not succeed, return error code 201 to indicate pre-script failure
              if [ $? -ne 0 ]; then
                  echo "ERROR: Postgres CHECKPOINT command failed."
                  exit 201
              fi
              sync
          else 
              echo "INFO: PostgreSQL service is inactive. Skipping execution of CHECKPOINT command."
          fi
      }

      export -f execute_schedule_auto_thaw
      export -f execute_post_script
      export -f unfreeze_fs

      # Debug logging for parameters passed to the SSM document
      echo "INFO: ${OPERATION} starting at $(date) with executionId: {{ executionId }}"

      # Based on the command parameter value execute the function that supports 
      # pre-script/post-script operation
      case ${OPERATION} in
          pre-script)
              execute_pre_script
              ;;
          post-script)
              execute_post_script
              execute_disable_auto_thaw
              ;;
          dry-run)
              echo "INFO: dry-run option invoked - taking no action"
              ;;
          *)
              echo "ERROR: Invalid command parameter passed. Please use either pre-script, post-script, dry-run."
              exit 1 # return failure
              ;;
      esac

      END=$(date +%s)
      # Debug Log for profiling the script time
      echo "INFO: ${OPERATION} completed at $(date). Total runtime: $((${END} - ${START})) seconds."
```

------
#### [ InterSystems IRIS sample document content ]

```
###===============================================================================###
# MIT License
# 
# Copyright (c) 2024 InterSystems
# 
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# 
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
# 
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
###===============================================================================###
schemaVersion: '2.2'
description: SSM Document Template for Amazon Data Lifecycle Manager Pre/Post script feature for InterSystems IRIS.
parameters:
  executionId:
    type: String
    default: None
    description: Specifies the unique identifier associated with a pre and/or post execution
    allowedPattern: ^(None|[a-fA-F0-9]{8}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{12})$
  command:
    type: String
    # Data Lifecycle Manager will trigger the pre-script and post-script actions. You can also use this SSM document with 'dry-run' for manual testing purposes.
    default: 'dry-run'
    description: (Required) Specifies whether pre-script and/or post-script should be executed.
    #The following allowedValues will allow Data Lifecycle Manager to successfully trigger pre and post script actions.
    allowedValues:
    - pre-script
    - post-script
    - dry-run

mainSteps:
- action: aws:runShellScript
  description: Run InterSystems IRIS Database freeze/thaw commands
  name: run_pre_post_scripts
  precondition:
    StringEquals:
    - platformType
    - Linux
  inputs:
    runCommand:
    - |
      #!/bin/bash
      ###===============================================================================###
      ### Global variables
      ###===============================================================================###
      DOCKER_NAME=iris
      LOGDIR=./
      EXIT_CODE=0
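      OPERATION={{ command }}
      # Capture the executionId parameter so that the debug log line below can print it
      EXECUTION_ID={{ executionId }}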
      START=$(date +%s)
      
      # Check if Docker is installed
      # By default if Docker is present, script assumes that InterSystems IRIS is running in Docker
      # Leave only the else block DOCKER_EXEC line, if you run InterSystems IRIS non-containerised (and Docker is present).
      # Script assumes irissys user has OS auth enabled, change the OS user or supply login/password depending on your configuration.
      if command -v docker &> /dev/null
      then
        DOCKER_EXEC="docker exec $DOCKER_NAME"
      else
        DOCKER_EXEC="sudo -i -u irissys"
      fi
      
                    
      # Add all pre-script actions to be performed within the function below
      execute_pre_script() {
        echo "INFO: Start execution of pre-script"
        
        # find all iris running instances
        iris_instances=$($DOCKER_EXEC iris qall 2>/dev/null | tail -n +3 | grep '^up' | cut -c5-  | awk '{print $1}')
        echo "`date`: Running iris instances $iris_instances"
      
        # Only for running instances
        for INST in $iris_instances; do
      
          echo "`date`: Attempting to freeze $INST"
      
          # Detailed instances specific log
          LOGFILE=$LOGDIR/$INST-pre_post.log
          
          #check Freeze status before starting
          $DOCKER_EXEC irissession $INST -U '%SYS' "##Class(Backup.General).IsWDSuspendedExt()"
          freeze_status=$?
          if [ $freeze_status -eq 5 ]; then
            echo "`date`:   ERROR: $INST IS already FROZEN"
            EXIT_CODE=204
          else
            echo "`date`:   $INST is not frozen"
            # Freeze
            # Docs: https://docs.intersystems.com/irislatest/csp/documatic/%25CSP.Documatic.cls?LIBRARY=%25SYS&CLASSNAME=Backup.General#ExternalFreeze
            $DOCKER_EXEC irissession $INST -U '%SYS' "##Class(Backup.General).ExternalFreeze(\"$LOGFILE\",,,,,,600,,,300)"
            status=$?
      
            case $status in
              5) echo "`date`:   $INST IS FROZEN"
                ;;
              3) echo "`date`:   $INST FREEZE FAILED"
                EXIT_CODE=201
                ;;
              *) echo "`date`:   ERROR: Unknown status code: $status"
                EXIT_CODE=201
                ;;
            esac
            echo "`date`:   Completed freeze of $INST"
          fi
        done
        echo "`date`: Pre freeze script finished"
      }
                    
      # Add all post-script actions to be performed within the function below
      execute_post_script() {
        echo "INFO: Start execution of post-script"
      
        # find all iris running instances
        iris_instances=$($DOCKER_EXEC iris qall 2>/dev/null | tail -n +3 | grep '^up' | cut -c5-  | awk '{print $1}')
        echo "`date`: Running iris instances $iris_instances"
      
        # Only for running instances
        for INST in $iris_instances; do
      
          echo "`date`: Attempting to thaw $INST"
      
          # Detailed instances specific log
          LOGFILE=$LOGDIR/$INST-pre_post.log
      
          #check Freeze status before starting
          $DOCKER_EXEC irissession $INST -U '%SYS' "##Class(Backup.General).IsWDSuspendedExt()"
          freeze_status=$?
          if [ $freeze_status -eq 5 ]; then
            echo "`date`:  $INST is in frozen state"
            # Thaw
            # Docs: https://docs.intersystems.com/irislatest/csp/documatic/%25CSP.Documatic.cls?LIBRARY=%25SYS&CLASSNAME=Backup.General#ExternalThaw
            $DOCKER_EXEC irissession $INST -U%SYS "##Class(Backup.General).ExternalThaw(\"$LOGFILE\")"
            status=$?
      
            case $status in
              5) echo "`date`:   $INST IS THAWED"
                  $DOCKER_EXEC irissession $INST -U%SYS "##Class(Backup.General).ExternalSetHistory(\"$LOGFILE\")"
                ;;
              3) echo "`date`:   $INST THAW FAILED"
                  EXIT_CODE=202
                ;;
              *) echo "`date`:   ERROR: Unknown status code: $status"
                  EXIT_CODE=202
                ;;
            esac
            echo "`date`:   Completed thaw of $INST"
          else
            echo "`date`:   ERROR: $INST IS already THAWED"
            EXIT_CODE=205
          fi
        done
        echo "`date`: Post thaw script finished"
      }
      
      # Debug logging for parameters passed to the SSM document
      echo "INFO: ${OPERATION} starting at $(date) with executionId: ${EXECUTION_ID}"
                    
      # Based on the command parameter value execute the function that supports 
      # pre-script/post-script operation
      case ${OPERATION} in
        pre-script)
          execute_pre_script
          ;;
        post-script)
          execute_post_script
          ;;
        dry-run)
          echo "INFO: dry-run option invoked - taking no action"
          ;;
        *)
          echo "ERROR: Invalid command parameter passed. Please use either pre-script, post-script, dry-run."
          # return failure
          EXIT_CODE=1
          ;;
      esac
                    
      END=$(date +%s)
      # Debug Log for profiling the script time
      echo "INFO: ${OPERATION} completed at $(date). Total runtime: $((${END} - ${START})) seconds."
      exit $EXIT_CODE
```

For more information, see the [GitHub repository](https://github.com/intersystems-community/aws/blob/master/README.md).

------
#### [ Empty document template ]

```
###===============================================================================###
# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.

# Permission is hereby granted, free of charge, to any person obtaining a copy of this
# software and associated documentation files (the "Software"), to deal in the Software
# without restriction, including without limitation the rights to use, copy, modify,
# merge, publish, distribute, sublicense, and/or sell copies of the Software, and to
# permit persons to whom the Software is furnished to do so.

# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
# INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
# PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
###===============================================================================###
schemaVersion: '2.2'
description: SSM Document Template for Amazon Data Lifecycle Manager Pre/Post script feature
parameters:
  executionId:
    type: String
    default: None
    description: (Required) Specifies the unique identifier associated with a pre and/or post execution
    allowedPattern: ^(None|[a-fA-F0-9]{8}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{12})$
  command:
  # Data Lifecycle Manager will trigger the pre-script and post-script actions during policy execution. 
  # 'dry-run' option is intended for validating the document execution without triggering any commands
  # on the instance. The following allowedValues will allow Data Lifecycle Manager to successfully 
  # trigger pre and post script actions.
    type: String
    default: 'dry-run'
    description: (Required) Specifies whether pre-script and/or post-script should be executed.
    allowedValues:
    - pre-script
    - post-script
    - dry-run

mainSteps:
- action: aws:runShellScript
  description: Run Database freeze/thaw commands
  name: run_pre_post_scripts
  precondition:
    StringEquals:
    - platformType
    - Linux
  inputs:
    runCommand:
    - |
      #!/bin/bash

      ###===============================================================================###
      ### Error Codes
      ###===============================================================================###
      # The following Error codes will inform Data Lifecycle Manager of the type of error 
      # and help guide handling of the error. 
      # The Error code will also be emitted via AWS Eventbridge events in the 'cause' field.
      # 1 Pre-script failed during execution - 201
      # 2 Post-script failed during execution - 202
      # 3 Auto thaw occurred before post-script was initiated - 203
      # 4 Pre-script initiated while post-script was expected - 204
      # 5 Post-script initiated while pre-script was expected - 205
      # 6 Application not ready for pre or post-script initiation - 206

      ###===============================================================================###
      ### Global variables
      ###===============================================================================###
      START=$(date +%s)
      # For testing this script locally, replace the below with OPERATION=$1.
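      OPERATION={{ command }}
      # Capture the executionId parameter so that the debug log line below can print it
      EXECUTION_ID={{ executionId }}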

      # Add all pre-script actions to be performed within the function below
      execute_pre_script() {
          echo "INFO: Start execution of pre-script"
      }

      # Add all post-script actions to be performed within the function below
      execute_post_script() {
          echo "INFO: Start execution of post-script"
      }

      # Debug logging for parameters passed to the SSM document
      echo "INFO: ${OPERATION} starting at $(date) with executionId: ${EXECUTION_ID}"

      # Based on the command parameter value execute the function that supports 
      # pre-script/post-script operation
      case ${OPERATION} in
          pre-script)
              execute_pre_script
              ;;
          post-script)
              execute_post_script
              ;;
          dry-run)
              echo "INFO: dry-run option invoked - taking no action"
              ;;
          *)
              echo "ERROR: Invalid command parameter passed. Please use either pre-script, post-script, dry-run."
              exit 1 # return failure
              ;;
      esac

      END=$(date +%s)
      # Debug Log for profiling the script time
      echo "INFO: ${OPERATION} completed at $(date). Total runtime: $((${END} - ${START})) seconds."
```
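
The template's comment about local testing can be taken one step further. The following is a minimal sketch, not part of the template, of a local harness that mirrors the template's `case` dispatch and maps failures to the error codes listed in the template (201 for a pre-script failure, 202 for a post-script failure). The `freeze_app`, `thaw_app`, and `run_operation` names are hypothetical placeholders; replace the function bodies with your application's own freeze and thaw commands.

```shell
# Hypothetical placeholders: replace with your application's freeze/thaw commands.
freeze_app() { return 0; }
thaw_app() { return 0; }

# Mirrors the template's dispatch, but returns a status instead of exiting,
# so that it can be sourced and exercised from an interactive shell.
run_operation() {
    local operation="$1"
    case "${operation}" in
        pre-script)
            # 201: pre-script failed during execution
            freeze_app || return 201
            ;;
        post-script)
            # 202: post-script failed during execution
            thaw_app || return 202
            ;;
        dry-run)
            echo "INFO: dry-run option invoked - taking no action"
            ;;
        *)
            echo "ERROR: Invalid command parameter passed."
            return 1
            ;;
    esac
    return 0
}
```

For example, `run_operation dry-run; echo $?` prints the dry-run message followed by `0`.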

------

After you have your SSM document content, use one of the following procedures to create the custom SSM document.

------
#### [ Console ]

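**To create the SSM command document**

1. Open the Amazon Systems Manager console at [https://console.aws.amazon.com//systems-manager/](https://console.amazonaws.cn//systems-manager/).

1. In the navigation pane, choose **Documents**, and then choose **Create document**, **Command or Session**.

1. For **Name**, enter a descriptive name for the document.

1. For **Target type**, select **/AWS::EC2::Instance**.

1. For **Document type**, select **Command**.

1. In the **Content** field, select **YAML**, and then paste the document content.

1. In the **Document tags** section, add a tag with a tag key of `DLMScriptsAccess` and a tag value of `true`.
**Important**  
The `DLMScriptsAccess:true` tag is required by the **AWSDataLifecycleManagerSSMFullAccess** Amazon managed policy that is used in *Step 3: Prepare the Amazon Data Lifecycle Manager IAM role*. The policy uses the `aws:ResourceTag` condition key to restrict access to SSM documents that have this tag.

1. Choose **Create document**.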

------
#### [ Amazon CLI ]

**To create the SSM command document**  
Use the [create-document](https://docs.amazonaws.cn/cli/latest/reference/ssm/create-document.html) command. For `--name`, specify a descriptive name for the document. For `--document-type`, specify `Command`. For `--content`, specify the path to the .yaml file that contains the SSM document content. For `--tags`, specify `"Key=DLMScriptsAccess,Value=true"`.

```
$ aws ssm create-document \
--content file://path/to/file/documentContent.yaml \
--name "document_name" \
--document-type "Command" \
--document-format YAML \
--tags "Key=DLMScriptsAccess,Value=true"
```
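
After you create the document, you might want to confirm that the tag was applied, because the managed policy used for pre and post scripts relies on it. The following sketch assumes the `list-tags-for-resource` command with the `Document` resource type. The `document_has_dlm_tag` function name is hypothetical.

```shell
# Returns 0 when the named SSM document is tagged DLMScriptsAccess=true,
# 1 when the tag is missing or has another value, and 2 when the CLI call fails.
document_has_dlm_tag() {
    local document_name="$1"
    local tag_value
    tag_value=$(aws ssm list-tags-for-resource \
        --resource-type "Document" \
        --resource-id "${document_name}" \
        --query "TagList[?Key=='DLMScriptsAccess'].Value" \
        --output text) || return 2
    [ "${tag_value}" = "true" ]
}
```

For example, `document_has_dlm_tag "document_name" && echo "tag present"` prints the message only when the tag is in place.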

------
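
The sample documents accept `dry-run` for manual testing. As a sketch, and assuming that your target instance is already managed by Systems Manager, you could run the new document once with `dry-run` by using the `send-command` command before you reference it in a policy. The `send_dry_run` function name and its arguments are hypothetical.

```shell
# Runs the SSM document once with command=dry-run and prints the command ID.
# Arguments: <document name> <instance ID>
send_dry_run() {
    local document_name="$1"
    local instance_id="$2"
    aws ssm send-command \
        --document-name "${document_name}" \
        --instance-ids "${instance_id}" \
        --parameters '{"command":["dry-run"]}' \
        --query "Command.CommandId" \
        --output text
}
```

You can then pass the returned command ID, together with the instance ID, to `aws ssm get-command-invocation` to review the script output.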

### Step 3: Prepare the Amazon Data Lifecycle Manager IAM role
<a name="prep-iam-role"></a>

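**Note**  
This step is needed if:
+ You create or update a pre/post script-enabled snapshot policy that uses a custom IAM role.
+ You use the command line to create or update a pre/post script-enabled snapshot policy that uses the default role.

If you use the console to create or update a pre/post script-enabled snapshot policy that uses the default role for managing snapshots (**AWSDataLifecycleManagerDefaultRole**), skip this step. In this case, we automatically attach the **AWSDataLifecycleManagerSSMFullAccess** policy to that role.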

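You must ensure that the IAM role that you use for the policy grants Amazon Data Lifecycle Manager permission to perform the SSM actions that are required to run pre and post scripts on the instances targeted by the policy.

Amazon Data Lifecycle Manager provides a managed policy (**AWSDataLifecycleManagerSSMFullAccess**) that includes the required permissions. You can attach this policy to your IAM role for managing snapshots to ensure that it includes the permissions.

**Important**  
When you use pre and post scripts, the AWSDataLifecycleManagerSSMFullAccess managed policy uses the `aws:ResourceTag` condition key to restrict access to specific SSM documents. To allow Amazon Data Lifecycle Manager to access the SSM documents, you must ensure that your SSM documents are tagged with `DLMScriptsAccess:true`.

Alternatively, you can manually create a custom policy or assign the required permissions directly to the IAM role that you use. You can use the same permissions that are defined in the AWSDataLifecycleManagerSSMFullAccess managed policy; however, the `aws:ResourceTag` condition key is optional. If you decide not to include that condition key, you do not need to tag your SSM documents with `DLMScriptsAccess:true`.

Use one of the following methods to add the **AWSDataLifecycleManagerSSMFullAccess** policy to your IAM role.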

------
#### [ Console ]

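**To attach the managed policy to your custom role**

1. Open the IAM console at [https://console.aws.amazon.com/iam/](https://console.amazonaws.cn/iam/).

1. In the navigation pane, choose **Roles**.

1. Search for and select your custom role for managing snapshots.

1. On the **Permissions** tab, choose **Add permissions**, **Attach policies**.

1. Search for and select the **AWSDataLifecycleManagerSSMFullAccess** managed policy, and then choose **Add permissions**.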

------
#### [ Amazon CLI ]

**To attach the managed policy to your custom role**  
Use the [attach-role-policy](https://docs.amazonaws.cn/cli/latest/reference/iam/attach-role-policy.html) command. For `--role-name`, specify the name of your custom role. For `--policy-arn`, specify `arn:aws:iam::aws:policy/AWSDataLifecycleManagerSSMFullAccess`.

```
$ aws iam attach-role-policy \
--policy-arn arn:aws:iam::aws:policy/AWSDataLifecycleManagerSSMFullAccess \
--role-name your_role_name
```
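
To confirm that the policy is attached, you could list the role's attached policies and look for the policy by name. The following sketch assumes the `list-attached-role-policies` command and matches on the policy name rather than the full ARN. The `role_has_dlm_ssm_policy` function name is hypothetical.

```shell
# Returns 0 when AWSDataLifecycleManagerSSMFullAccess is attached to the role,
# 1 when it is not attached, and 2 when the CLI call fails.
role_has_dlm_ssm_policy() {
    local role_name="$1"
    local policy_name="AWSDataLifecycleManagerSSMFullAccess"
    local attached
    attached=$(aws iam list-attached-role-policies \
        --role-name "${role_name}" \
        --query "AttachedPolicies[?PolicyName=='${policy_name}'].PolicyName" \
        --output text) || return 2
    [ "${attached}" = "${policy_name}" ]
}
```

For example, `role_has_dlm_ssm_policy your_role_name || echo "policy not attached"` prints the message when the policy is missing or the call fails.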

------

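### Step 4: Create the snapshot lifecycle policy
<a name="prep-policy"></a>

To automate application-consistent snapshots, you must create a snapshot lifecycle policy that targets instances, and configure pre and post scripts for that policy.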

------
#### [ Console ]

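**To create the snapshot lifecycle policy**

1. Open the Amazon EC2 console at [https://console.aws.amazon.com/ec2/](https://console.amazonaws.cn/ec2/).

1. In the navigation pane, choose **Elastic Block Store**, **Lifecycle Manager**, and then choose **Create lifecycle policy**.

1. On the **Select policy type** page, choose **EBS snapshot policy**, and then choose **Next**.

1. In the **Target resources** section, do the following:

   1. For **Target resource types**, choose `Instance`.

   1. For **Target resource tags**, specify the resource tags that identify the instances to back up. Only resources that have the specified tags are backed up.

1. For **IAM role**, either choose **AWSDataLifecycleManagerDefaultRole** (the default role for managing snapshots), or choose a custom role that you created and prepared for pre and post scripts.

1. Configure the schedules and additional options as needed. We recommend that you schedule snapshot creation for time periods that match your workload, such as during maintenance windows.

   For SAP HANA, we recommend that you enable fast snapshot restore.
**Note**  
If you enable a schedule for VSS Backups, you can't enable **Exclude specific data volumes** or **Copy tags from source**.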

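1. In the **Pre and post scripts** section, select **Enable pre and post scripts**, and then do the following, depending on your workload:
   + To create application-consistent snapshots of your Windows applications, select **VSS Backup**.
   + To create application-consistent snapshots of your SAP HANA workloads, select **SAP HANA**.
   + To create application-consistent snapshots of all other databases and workloads, including your self-managed MySQL, PostgreSQL, or InterSystems IRIS databases, by using a custom SSM document, select **Custom SSM document**.

     1. For **Automation option**, choose **Pre and post scripts**.

     1. For **SSM document**, select the SSM document that you prepared.

1. Depending on the option that you selected, configure the following additional options:
   + **Script timeout** – (*Custom SSM document only*) The timeout period after which Amazon Data Lifecycle Manager fails the script run attempt if it has not completed. The timeout period applies to the pre and post scripts individually. The minimum and default timeout period is 10 seconds. The maximum timeout period is 120 seconds.
   + **Retry failed scripts** – Select this option to retry scripts that do not complete within their timeout period. If the pre script fails, Amazon Data Lifecycle Manager retries the entire snapshot creation process, including running the pre and post scripts. If the post script fails, Amazon Data Lifecycle Manager retries the post script only; in this case, the pre script has already completed and the snapshot might have been created.
   + **Default to crash-consistent backup** – Select this option to default to crash-consistent snapshots if the pre script fails to run. This is the default snapshot creation behavior for Amazon Data Lifecycle Manager if pre and post scripts are not enabled. If you enabled retries, Amazon Data Lifecycle Manager defaults to crash-consistent snapshots only after all retry attempts have been exhausted. If the pre script fails and you do not default to crash-consistent snapshots, Amazon Data Lifecycle Manager does not create snapshots for the instance during that schedule run.
**Note**  
If you are creating snapshots for SAP HANA, you might want to disable this option. Crash-consistent snapshots of SAP HANA workloads can't be restored in the same manner.

1. Choose **Create default policy**.
**Note**  
If you get the `Role with name AWSDataLifecycleManagerDefaultRole already exists` error, see [Troubleshoot Amazon Data Lifecycle Manager issues](dlm-troubleshooting.md) for more information.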
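
The interaction between the retry and crash-consistent options can be easier to follow as a small decision function. The following is a simplified model of the behavior described in this procedure, not the service's implementation. It assumes that the post script succeeds whenever the pre script succeeds. The `resolve_snapshot_outcome` function name and its arguments are hypothetical.

```shell
# Arguments:
#   $1  number of consecutive pre script failures before a success
#   $2  configured retry count (0-3)
#   $3  "true" if defaulting to crash-consistent snapshots is enabled
# Prints the kind of snapshot that the schedule run would end with.
resolve_snapshot_outcome() {
    local failed_pre_attempts="$1"
    local max_retries="$2"
    local fallback="$3"
    # One initial attempt plus max_retries retries are available.
    if [ "${failed_pre_attempts}" -le "${max_retries}" ]; then
        echo "application-consistent"
    elif [ "${fallback}" = "true" ]; then
        # All attempts are exhausted; fall back only if the option is enabled.
        echo "crash-consistent"
    else
        echo "none"
    fi
}
```

For example, `resolve_snapshot_outcome 4 3 true` prints `crash-consistent`, because the initial attempt and all three retries failed and the fallback is enabled.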

------
#### [ Amazon CLI ]

**To create the snapshot lifecycle policy**  
Use the [create-lifecycle-policy](https://docs.amazonaws.cn/cli/latest/reference/dlm/create-lifecycle-policy.html) command, and include the `Scripts` parameters in `CreateRule`. For more information about the parameters, see the [*Amazon Data Lifecycle Manager API Reference*](https://docs.amazonaws.cn/dlm/latest/APIReference/API_Script.html).

```
$ aws dlm create-lifecycle-policy \
--description "policy_description" \
--state ENABLED \
--execution-role-arn iam_role_arn \
--policy-details file://policyDetails.json
```

Where `policyDetails.json` includes one of the following, depending on your use case:
+ **VSS Backup**

  ```
  {
      "PolicyType": "EBS_SNAPSHOT_MANAGEMENT",
      "ResourceTypes": [
          "INSTANCE"
      ],
      "TargetTags": [{
          "Key": "tag_key",
          "Value": "tag_value"
      }],
      "Schedules": [{
          "Name": "schedule_name",
          "CreateRule": {
              "CronExpression": "cron_for_creation_frequency", 
              "Scripts": [{ 
                  "ExecutionHandler":"AWS_VSS_BACKUP",
                  "ExecuteOperationOnScriptFailure":true|false,
                  "MaximumRetryCount":retries (0-3)
              }]
          },
          "RetainRule": {
              "Count": retention_count
          }
      }]
  }
  ```
+ **SAP HANA backups**

  ```
  {
      "PolicyType": "EBS_SNAPSHOT_MANAGEMENT",
      "ResourceTypes": [
          "INSTANCE"
      ],
      "TargetTags": [{
          "Key": "tag_key",
          "Value": "tag_value"
      }],
      "Schedules": [{
          "Name": "schedule_name",
          "CreateRule": {
              "CronExpression": "cron_for_creation_frequency", 
              "Scripts": [{ 
                  "Stages": ["PRE","POST"],
                  "ExecutionHandlerService":"AWS_SYSTEMS_MANAGER",
                  "ExecutionHandler":"AWSSystemsManagerSAP-CreateDLMSnapshotForSAPHANA",
                  "ExecuteOperationOnScriptFailure":true|false,
                  "ExecutionTimeout":timeout_in_seconds (10-120), 
                  "MaximumRetryCount":retries (0-3)
              }]
          },
          "RetainRule": {
              "Count": retention_count
          }
      }]
  }
  ```
+ **Custom SSM document**

  ```
  {
      "PolicyType": "EBS_SNAPSHOT_MANAGEMENT",
      "ResourceTypes": [
          "INSTANCE"
      ],
      "TargetTags": [{
          "Key": "tag_key",
          "Value": "tag_value"
      }],
      "Schedules": [{
          "Name": "schedule_name",
          "CreateRule": {
              "CronExpression": "cron_for_creation_frequency", 
              "Scripts": [{ 
                  "Stages": ["PRE","POST"],
                  "ExecutionHandlerService":"AWS_SYSTEMS_MANAGER",
                  "ExecutionHandler":"ssm_document_name|arn",
                  "ExecuteOperationOnScriptFailure":true|false,
                  "ExecutionTimeout":timeout_in_seconds (10-120), 
                  "MaximumRetryCount":retries (0-3)
              }]
          },
          "RetainRule": {
              "Count": retention_count
          }
      }]
  }
  ```
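
Because the `Scripts` values have documented ranges (10-120 seconds for `ExecutionTimeout` and 0-3 for `MaximumRetryCount`), it can help to check them before you write `policyDetails.json`. The following sketch validates the values and prints the `Scripts` array for a custom SSM document. The `is_uint` and `build_scripts_block` function names are hypothetical, and the output still has to be placed inside the `CreateRule` of a complete policy.

```shell
# Succeeds only for non-empty strings made of decimal digits.
is_uint() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Arguments: <SSM document name or ARN> <timeout seconds> <retries> <true|false>
build_scripts_block() {
    local document="$1" timeout="$2" retries="$3" fallback="$4"
    is_uint "${timeout}" && [ "${timeout}" -ge 10 ] && [ "${timeout}" -le 120 ] \
        || { echo "ERROR: ExecutionTimeout must be 10-120" >&2; return 1; }
    is_uint "${retries}" && [ "${retries}" -le 3 ] \
        || { echo "ERROR: MaximumRetryCount must be 0-3" >&2; return 1; }
    case "${fallback}" in
        true|false) ;;
        *) echo "ERROR: fallback must be true or false" >&2; return 1 ;;
    esac
    cat <<EOF
[{
    "Stages": ["PRE", "POST"],
    "ExecutionHandlerService": "AWS_SYSTEMS_MANAGER",
    "ExecutionHandler": "${document}",
    "ExecuteOperationOnScriptFailure": ${fallback},
    "ExecutionTimeout": ${timeout},
    "MaximumRetryCount": ${retries}
}]
EOF
}
```

For example, `build_scripts_block "document_name" 30 2 true` prints a `Scripts` array with a 30-second timeout, and `build_scripts_block "document_name" 5 2 true` returns a non-zero status because the timeout is below the minimum.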

------

## Considerations for VSS Backups with Amazon Data Lifecycle Manager
<a name="app-consistent-vss"></a>

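With Amazon Data Lifecycle Manager, you can back up and restore VSS (Volume Shadow Copy Service)-enabled Windows applications that run on Amazon EC2 instances. If the application has a VSS writer registered with Windows VSS, Amazon Data Lifecycle Manager creates a snapshot that is application-consistent for that application.

**Note**  
Amazon Data Lifecycle Manager currently supports application-consistent snapshots only for resources that run on Amazon EC2, specifically for backup scenarios in which application data can be restored by replacing an existing instance with a new instance created from the backup. Not all instance types or applications are supported for VSS Backups. For more information, see [Application-consistent Windows VSS snapshots](https://docs.amazonaws.cn/AWSEC2/latest/UserGuide/application-consistent-snapshots.html) in the *Amazon EC2 User Guide*.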

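**Unsupported instance types**  
The following Amazon EC2 instance types are not supported for VSS Backups. If your policy targets one of these instance types, Amazon Data Lifecycle Manager might still create VSS Backups, but the snapshots might not be tagged with the required system tags. Without these tags, the snapshots are not managed by Amazon Data Lifecycle Manager after creation. You might need to delete these snapshots manually.
+ T3: `t3.nano` | `t3.micro`
+ T3a: `t3a.nano` | `t3a.micro`
+ T2: `t2.nano` | `t2.micro`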
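
If you tag instances from a script, a small guard can keep the unsupported types out of a VSS Backup policy. The following sketch hardcodes the instance types from the preceding list. The `is_vss_unsupported_type` function name is hypothetical, and the list would need to be updated if the set of unsupported types changes.

```shell
# Returns 0 when the instance type is on the unsupported list above, 1 otherwise.
is_vss_unsupported_type() {
    case "$1" in
        t3.nano|t3.micro|t3a.nano|t3a.micro|t2.nano|t2.micro)
            return 0
            ;;
        *)
            return 1
            ;;
    esac
}
```

For example, `is_vss_unsupported_type t3.micro && echo "skip VSS Backup tagging"` prints the message, while the same check for `m5.large` prints nothing.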

## Shared responsibility for application-consistent snapshots
<a name="shared-responsibility"></a>

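**You must ensure that:**
+ The SSM Agent is installed, up-to-date, and running on your target instances.
+ Systems Manager has permissions to perform the required actions on the target instances.
+ Amazon Data Lifecycle Manager has permissions to perform the Systems Manager actions that are required to run pre and post scripts on the target instances.
+ For custom workloads, such as self-managed MySQL, PostgreSQL, or InterSystems IRIS databases, the SSM document that you use includes the correct and required actions for freezing, flushing, and thawing I/O for your database configuration.
+ Snapshot creation times align with your workload schedule. For example, try to schedule snapshot creation during scheduled maintenance windows.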

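**Amazon Data Lifecycle Manager ensures that:**
+ Snapshot creation is initiated within 60 minutes of the scheduled snapshot creation time.
+ Pre scripts run before snapshot creation is initiated.
+ Post scripts run after the pre script succeeds and snapshot creation has been initiated. Amazon Data Lifecycle Manager runs the post script only if the pre script succeeds. If the pre script fails, Amazon Data Lifecycle Manager does not run the post script.
+ Snapshots are tagged with the appropriate tags on creation.
+ CloudWatch metrics and events are emitted when scripts are initiated, and when they fail or succeed.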