Troubleshooting a custom key store - Amazon Key Management Service
Services or capabilities described in Amazon Web Services documentation might vary by Region. To see the differences applicable to the China Regions, see Getting Started with Amazon Web Services in China.

Troubleshooting a custom key store

Amazon CloudHSM key stores are designed to be available and resilient. However, there are some error conditions that you might have to repair to keep your Amazon CloudHSM key store operational.

How to fix unavailable KMS keys

The key state of Amazon KMS keys in an Amazon CloudHSM key store is typically Enabled. Like all KMS keys, the key state changes when you disable the KMS keys in an Amazon CloudHSM key store or schedule them for deletion. However, unlike other KMS keys, the KMS keys in a custom key store can also have a key state of Unavailable.

A key state of Unavailable indicates that the KMS key is in a custom key store that was intentionally disconnected and attempts to reconnect it, if any, failed. While a KMS key is unavailable, you can view and manage the KMS key, but you cannot use it for cryptographic operations.

To find the key state of a KMS key, on the Customer managed keys page, view the Status field of the KMS key. Or, use the DescribeKey operation and view the KeyState element in the response. For details, see Viewing keys.
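
As a rough sketch, the DescribeKey check could be scripted with the AWS CLI as follows. The function names are hypothetical and are not part of Amazon KMS. The sketch assumes the CLI is installed and configured with permission to call DescribeKey.

```shell
# Print the key state of a KMS key (for example Enabled, Disabled,
# PendingDeletion, or Unavailable). "$1" is a key ID or key ARN.
get_key_state() {
    aws kms describe-key --key-id "$1" \
        --query 'KeyMetadata.KeyState' --output text
}

# Succeed only when the given key state allows cryptographic operations.
key_is_usable() {
    [ "$1" = "Enabled" ]
}

# Report whether a key can be used, based on its current key state.
report_key() {
    rk_state=$(get_key_state "$1") || return 1
    if key_is_usable "$rk_state"; then
        echo "$1 is usable (key state: $rk_state)"
    else
        echo "$1 is not usable (key state: $rk_state)"
    fi
}
```

For example, `report_key 1234abcd-12ab-34cd-56ef-1234567890ab` prints one line that includes the key state returned by DescribeKey.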

The KMS keys in a disconnected custom key store will have a key state of Unavailable or PendingDeletion. KMS keys that are scheduled for deletion from a custom key store have a PendingDeletion key state, even when the custom key store is disconnected. This allows you to cancel the scheduled key deletion without reconnecting the custom key store.

To fix an unavailable KMS key, reconnect the custom key store. After the custom key store is reconnected, the key state of the KMS keys in the custom key store is automatically restored to its previous state, such as Enabled or Disabled. KMS keys that are pending deletion remain in the PendingDeletion state. However, while the problem persists, enabling and disabling an unavailable KMS key does not change its key state. The enable or disable action takes effect only when the key becomes available.

For help with failed connections, see How to fix a connection failure.

How to fix a failing KMS key

Problems with creating and using KMS keys in Amazon CloudHSM key stores can be caused by a problem with your Amazon CloudHSM key store, its associated Amazon CloudHSM cluster, the KMS key, or its key material.

When an Amazon CloudHSM key store is disconnected from its Amazon CloudHSM cluster, the key state of KMS keys in the custom key store is Unavailable. All requests to create KMS keys in a disconnected Amazon CloudHSM key store return a CustomKeyStoreInvalidStateException exception. All requests to encrypt, decrypt, re-encrypt, or generate data keys return a KMSInvalidStateException exception. To fix the problem, reconnect the Amazon CloudHSM key store.

However, your attempts to use a KMS key in an Amazon CloudHSM key store for cryptographic operations might fail even when its key state is Enabled and the connection state of the Amazon CloudHSM key store is Connected. This might be caused by any of the following conditions.

  • The key material for the KMS key might have been deleted from the associated Amazon CloudHSM cluster. To investigate, find the key handle of the key material for a KMS key and, if necessary, try to recover the key material.

  • All HSMs were deleted from the Amazon CloudHSM cluster that is associated with the Amazon CloudHSM key store. To use a KMS key in an Amazon CloudHSM key store in a cryptographic operation, its Amazon CloudHSM cluster must contain at least one active HSM. To verify the number and state of HSMs in an Amazon CloudHSM cluster, use the Amazon CloudHSM console or the DescribeClusters operation. To add an HSM to the cluster, use the Amazon CloudHSM console or the CreateHsm operation.

  • The Amazon CloudHSM cluster associated with the Amazon CloudHSM key store was deleted. To fix the problem, create a cluster from a backup that is related to the original cluster, such as a backup of the original cluster, or a backup that was used to create the original cluster. Then, edit the cluster ID in the custom key store settings. For instructions, see How to recover deleted key material for a KMS key.

  • The Amazon CloudHSM cluster associated with the custom key store did not have any available PKCS #11 sessions. This typically occurs during periods of high burst traffic when additional sessions are needed to service the traffic. To respond to a KMSInternalException with an error message about PKCS #11 sessions, back off and retry the request.
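
The back-off-and-retry advice in the last item might be sketched as a small shell helper. This is a minimal illustration, not an official pattern: `retry_with_backoff` is a hypothetical name, and the helper retries on any non-zero exit status rather than inspecting the error for KMSInternalException, which a production script would probably want to do. Adding random jitter to the delay is also common practice and is omitted here for brevity.

```shell
# Run a command, retrying with exponential backoff when it fails.
# Usage: retry_with_backoff MAX_ATTEMPTS BASE_DELAY_SECONDS COMMAND [ARGS...]
retry_with_backoff() {
    rb_max=$1
    rb_delay=$2
    shift 2
    rb_attempt=1
    while :; do
        # Stop as soon as the command succeeds.
        "$@" && return 0
        if [ "$rb_attempt" -ge "$rb_max" ]; then
            echo "giving up after $rb_attempt attempts" >&2
            return 1
        fi
        sleep "$rb_delay"
        # Double the delay before the next attempt.
        rb_delay=$((rb_delay * 2))
        rb_attempt=$((rb_attempt + 1))
    done
}
```

A call such as `retry_with_backoff 5 1 aws kms generate-data-key --key-id "$KEY_ID" --key-spec AES_256` would try up to five times, waiting 1, 2, 4, and 8 seconds between attempts. `KEY_ID` is a placeholder variable.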

How to fix a connection failure

If you try to connect an Amazon CloudHSM key store to its Amazon CloudHSM cluster, but the operation fails, the connection state of the Amazon CloudHSM key store changes to FAILED. To find the connection state of an Amazon CloudHSM key store, use the Amazon KMS console or the DescribeCustomKeyStores operation.

Alternatively, some connection attempts fail quickly due to easily detected cluster configuration errors. In this case, the connection state is still DISCONNECTED. These failures return an error message or exception that explains why the attempt failed. Review the exception description and cluster requirements, fix the problem, update the Amazon CloudHSM key store, if necessary, and try to connect again.

When the connection state is FAILED, run the DescribeCustomKeyStores operation and see the ConnectionErrorCode element in the response.

Note

When the connection state of an Amazon CloudHSM key store is FAILED, you must disconnect the Amazon CloudHSM key store before attempting to reconnect it. You cannot connect an Amazon CloudHSM key store with a FAILED connection state.

  • CLUSTER_NOT_FOUND indicates that Amazon KMS cannot find an Amazon CloudHSM cluster with the specified cluster ID. This might occur because the wrong cluster ID was provided to an API operation or the cluster was deleted and not replaced. To fix this error, verify the cluster ID, such as by using the Amazon CloudHSM console or the DescribeClusters operation. If the cluster was deleted, create a cluster from a recent backup of the original. Then, disconnect the Amazon CloudHSM key store, edit the Amazon CloudHSM key store cluster ID setting, and reconnect the Amazon CloudHSM key store to the cluster.

  • INSUFFICIENT_CLOUDHSM_HSMS indicates that the associated Amazon CloudHSM cluster does not contain any HSMs. To connect, the cluster must have at least one HSM. To find the number of HSMs in the cluster, use the DescribeClusters operation. To resolve this error, add at least one HSM to the cluster. If you add multiple HSMs, it's best to create them in different Availability Zones.

  • INSUFFICIENT_FREE_ADDRESSES_IN_SUBNET indicates that Amazon KMS could not connect the Amazon CloudHSM key store to its Amazon CloudHSM cluster because at least one private subnet associated with the cluster doesn't have any available IP addresses. An Amazon CloudHSM key store connection requires one free IP address in each of the associated private subnets, although two are preferable.

    You can't add IP addresses (CIDR blocks) to an existing subnet. If possible, move or delete other resources that are using the IP addresses in the subnet, such as unused EC2 instances or elastic network interfaces. Otherwise, you can create a cluster from a recent backup of the Amazon CloudHSM cluster with new or existing private subnets that have more free address space. Then, to associate the new cluster with your Amazon CloudHSM key store, disconnect the custom key store, change the cluster ID of the Amazon CloudHSM key store to the ID of the new cluster, and try to connect again.

    Tip

    To avoid resetting the kmsuser password, use the most recent backup of the Amazon CloudHSM cluster.

  • INTERNAL_ERROR indicates that Amazon KMS could not complete the request due to an internal error. Retry the request. For ConnectCustomKeyStore requests, disconnect the Amazon CloudHSM key store before trying to connect again.

  • INVALID_CREDENTIALS indicates that Amazon KMS cannot log into the associated Amazon CloudHSM cluster because it doesn't have the correct kmsuser account password. For help with this error, see How to fix invalid kmsuser credentials.

  • NETWORK_ERRORS usually indicates transient network issues. Disconnect the Amazon CloudHSM key store, wait a few minutes, and try to connect again.

  • SUBNET_NOT_FOUND indicates that at least one subnet in the Amazon CloudHSM cluster configuration was deleted. If Amazon KMS cannot find all of the subnets in the cluster configuration, attempts to connect the Amazon CloudHSM key store to the Amazon CloudHSM cluster fail.

    To fix this error, create a cluster from a recent backup of the same Amazon CloudHSM cluster. (This process creates a new cluster configuration with a VPC and private subnets.) Verify that the new cluster meets the requirements for a custom key store, and note the new cluster ID. Then, to associate the new cluster with your Amazon CloudHSM key store, disconnect the custom key store, change the cluster ID of the Amazon CloudHSM key store to the ID of the new cluster, and try to connect again.

    Tip

    To avoid resetting the kmsuser password, use the most recent backup of the Amazon CloudHSM cluster.

  • USER_LOCKED_OUT indicates that the kmsuser crypto user (CU) account is locked out of the associated Amazon CloudHSM cluster due to too many failed password attempts. For help with this error, see How to fix invalid kmsuser credentials.

    To fix this error, disconnect the Amazon CloudHSM key store and use the changePswd command in cloudhsm_mgmt_util to change the kmsuser account password. Then, edit the kmsuser password setting for the custom key store, and try to connect again. For help, use the procedure described in the How to fix invalid kmsuser credentials topic.

  • USER_LOGGED_IN indicates that the kmsuser CU account is logged into the associated Amazon CloudHSM cluster. This prevents Amazon KMS from rotating the kmsuser account password and logging into the cluster. To fix this error, log the kmsuser CU out of the cluster. If you changed the kmsuser password to log into the cluster, you must also update the key store password value for the Amazon CloudHSM key store. For help, see How to log out and reconnect.

  • USER_NOT_FOUND indicates that Amazon KMS cannot find a kmsuser CU account in the associated Amazon CloudHSM cluster. To fix this error, create a kmsuser CU account in the cluster, and then update the key store password value for the Amazon CloudHSM key store. For help, see How to fix invalid kmsuser credentials.
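
To illustrate how the error codes above might be used in a script, the following sketch reads the ConnectionErrorCode with the AWS CLI and prints a one-line summary of the matching fix. The function names and the summary strings are hypothetical. They condense the descriptions in the list and are not output from Amazon KMS. With `--output text`, the CLI is expected to print `None` when the field is absent, which falls through to the default branch.

```shell
# Print the ConnectionErrorCode of a custom key store.
get_connection_error_code() {
    aws kms describe-custom-key-stores --custom-key-store-id "$1" \
        --query 'CustomKeyStores[0].ConnectionErrorCode' --output text
}

# Map a ConnectionErrorCode to a short summary of the fix described above.
suggest_fix() {
    case "$1" in
        CLUSTER_NOT_FOUND)
            echo "Verify the cluster ID, or create a cluster from a recent backup" ;;
        INSUFFICIENT_CLOUDHSM_HSMS)
            echo "Add at least one HSM to the cluster" ;;
        INSUFFICIENT_FREE_ADDRESSES_IN_SUBNET)
            echo "Free IP addresses in the private subnets, or move to a new cluster" ;;
        INTERNAL_ERROR)
            echo "Disconnect the key store, then retry" ;;
        INVALID_CREDENTIALS|USER_LOCKED_OUT|USER_NOT_FOUND)
            echo "Fix the kmsuser account and update the key store password" ;;
        NETWORK_ERRORS)
            echo "Disconnect the key store, wait a few minutes, then retry" ;;
        SUBNET_NOT_FOUND)
            echo "Create a cluster from a recent backup and update the cluster ID" ;;
        USER_LOGGED_IN)
            echo "Log the kmsuser CU out of the cluster" ;;
        *)
            echo "Unrecognized code: $1"
            return 1 ;;
    esac
}
```

For example, `suggest_fix "$(get_connection_error_code cks-1234567890abcdef0)"` prints the summary for the key store's current error code.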

How to respond to a cryptographic operation failure

A cryptographic operation that uses a KMS key in a custom key store might fail with an error such as the following.

KMSInvalidStateException: KMS cannot communicate with your CloudHSM cluster

Although this is an HTTP 400 error, it might result from transient network issues. To respond, begin by retrying the request. However, if it continues to fail, examine the configuration of your networking components. This error is most likely caused by the misconfiguration of a networking component, such as a firewall rule or VPC security group rule that is blocking outgoing traffic.

How to fix invalid kmsuser credentials

When you connect an Amazon CloudHSM key store, Amazon KMS logs into the associated Amazon CloudHSM cluster as the kmsuser crypto user (CU). It remains logged in until the Amazon CloudHSM key store is disconnected.

If you disconnect the Amazon CloudHSM key store and change the kmsuser password, Amazon KMS cannot log into the Amazon CloudHSM cluster with the credentials of the kmsuser CU account. As a result, all attempts to connect the Amazon CloudHSM key store fail. The DescribeCustomKeyStores response shows a ConnectionState of FAILED and ConnectionErrorCode value of INVALID_CREDENTIALS, as shown in the following example.

$ aws kms describe-custom-key-stores --custom-key-store-name ExampleKeyStore
{
   "CustomKeyStores": [
      {
         "CloudHsmClusterId": "cluster-1a23b4cdefg",
         "ConnectionErrorCode": "INVALID_CREDENTIALS",
         "CustomKeyStoreId": "cks-1234567890abcdef0",
         "CustomKeyStoreName": "ExampleKeyStore",
         "TrustAnchorCertificate": "<certificate string appears here>",
         "CreationDate": "1.499288695918E9",
         "ConnectionState": "FAILED"
      }
   ]
}

Also, after five failed attempts to log into the cluster with an incorrect password, Amazon CloudHSM locks the user account. To log into the cluster, you must change the account password.

If Amazon KMS gets a lockout response when it tries to log into the cluster as the kmsuser CU, the request to connect the Amazon CloudHSM key store fails. The DescribeCustomKeyStores response includes a ConnectionState of FAILED and ConnectionErrorCode value of USER_LOCKED_OUT, as shown in the following example.

$ aws kms describe-custom-key-stores --custom-key-store-name ExampleKeyStore
{
   "CustomKeyStores": [
      {
         "CloudHsmClusterId": "cluster-1a23b4cdefg",
         "ConnectionErrorCode": "USER_LOCKED_OUT",
         "CustomKeyStoreId": "cks-1234567890abcdef0",
         "CustomKeyStoreName": "ExampleKeyStore",
         "TrustAnchorCertificate": "<certificate string appears here>",
         "CreationDate": "1.499288695918E9",
         "ConnectionState": "FAILED"
      }
   ]
}

To repair any of these conditions, use the following procedure.

  1. Disconnect the Amazon CloudHSM key store.

  2. Run the DescribeCustomKeyStores operation and view the value of the ConnectionErrorCode element in the response.

    • If the ConnectionErrorCode value is INVALID_CREDENTIALS, determine the current password for the kmsuser account. If necessary, use the changePswd command in cloudhsm_mgmt_util to set the password to a known value.

    • If the ConnectionErrorCode value is USER_LOCKED_OUT, you must use the changePswd command in cloudhsm_mgmt_util to change the kmsuser password.

  3. Edit the kmsuser password setting so it matches the current kmsuser password in the cluster. This action tells Amazon KMS which password to use to log into the cluster. It does not change the kmsuser password in the cluster.

  4. Connect the custom key store.
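
The Amazon KMS side of this procedure could be scripted roughly as follows. The two function names are hypothetical. The changePswd step in cloudhsm_mgmt_util is done by hand between the two calls. Note that passing a password as a command-line argument can expose it in shell history and process listings, so treat this as an outline rather than a hardened script.

```shell
# Step 1: disconnect the key store so that the kmsuser account can be managed.
begin_credential_repair() {
    aws kms disconnect-custom-key-store --custom-key-store-id "$1"
}

# Steps 3 and 4: tell Amazon KMS the current kmsuser password, then connect.
# "$1" is the custom key store ID, "$2" is the current kmsuser password.
finish_credential_repair() {
    aws kms update-custom-key-store --custom-key-store-id "$1" \
        --key-store-password "$2" || return 1
    aws kms connect-custom-key-store --custom-key-store-id "$1"
}
```

A typical run would call `begin_credential_repair cks-1234567890abcdef0`, change or confirm the kmsuser password in the cluster, and then call `finish_credential_repair cks-1234567890abcdef0 "$KMSUSER_PASSWORD"`, where `KMSUSER_PASSWORD` is a placeholder variable.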

How to delete orphaned key material

After scheduling deletion of a KMS key from an Amazon CloudHSM key store, you might need to manually delete the corresponding key material from the associated Amazon CloudHSM cluster.

When you create a KMS key in an Amazon CloudHSM key store, Amazon KMS creates the KMS key metadata in Amazon KMS and generates the key material in the associated Amazon CloudHSM cluster. When you schedule deletion of a KMS key in an Amazon CloudHSM key store, after the waiting period, Amazon KMS deletes the KMS key metadata. Then Amazon KMS makes a best effort to delete the corresponding key material from the Amazon CloudHSM cluster. The attempt might fail if Amazon KMS cannot access the cluster, such as when it's disconnected from the Amazon CloudHSM key store or the kmsuser password changes. Amazon KMS does not attempt to delete key material from cluster backups.

Amazon KMS reports the results of its attempt to delete the key material from the cluster in the DeleteKey event entry of your Amazon CloudTrail logs. It appears in the backingKeysDeletionStatus element of the additionalEventData element, as shown in the following example entry. The entry also includes the KMS key ARN, the Amazon CloudHSM cluster ID, and the key handle of the key material (backing-key-id).

{
    "eventVersion": "1.08",
    "userIdentity": {
        "accountId": "111122223333",
        "invokedBy": "Amazon Internal"
    },
    "eventTime": "2021-12-10T14:23:51Z",
    "eventSource": "kms.amazonaws.com",
    "eventName": "DeleteKey",
    "awsRegion": "eu-west-1",
    "sourceIPAddress": "Amazon Internal",
    "userAgent": "AWS Internal",
    "requestParameters": null,
    "responseElements": null,
    "additionalEventData": {
        "customKeyStoreId": "cks-1234567890abcdef0",
        "clusterId": "cluster-1a23b4cdefg",
        "backingKeys": "[{\"keyHandle\":\"01\",\"backingKeyId\":\"backing-key-id\"}]",
        "backingKeysDeletionStatus": "[{\"keyHandle\":\"16\",\"backingKeyId\":\"backing-key-id\",\"deletionStatus\":\"FAILURE\"}]"
    },
    "eventID": "c21f1f47-f52b-4ffe-bff0-6d994403cf40",
    "readOnly": false,
    "resources": [
        {
            "accountId": "111122223333",
            "type": "AWS::KMS::Key",
            "ARN": "arn:aws:kms:eu-west-1:111122223333:key/1234abcd-12ab-34cd-56ef-1234567890ab"
        }
    ],
    "eventType": "AwsServiceEvent",
    "recipientAccountId": "111122223333",
    "managementEvent": true,
    "eventCategory": "Management"
}

To delete the key material from the associated Amazon CloudHSM cluster, use a procedure like the following one. This example uses the Amazon CLI and Amazon CloudHSM command line tools, but you can use the Amazon Web Services Management Console instead of the CLI.

  1. Disconnect the Amazon CloudHSM key store, if it is not already disconnected, then log into the key_mgmt_util, as explained in How to disconnect and log in.

  2. Use the deleteKey command in key_mgmt_util to delete the key from the HSMs in the cluster.

    For example, this command deletes key 262162 from the HSMs in the cluster. The key handle is listed in the CloudTrail log entry.

    Command: deleteKey -k 262162

        Cfm3DeleteKey returned: 0x00 : HSM Return: SUCCESS

        Cluster Error Status
        Node id 0 and err state 0x00000000 : HSM Return: SUCCESS
        Node id 1 and err state 0x00000000 : HSM Return: SUCCESS
        Node id 2 and err state 0x00000000 : HSM Return: SUCCESS

  3. Log out of key_mgmt_util and reconnect the Amazon CloudHSM key store as described in How to log out and reconnect.

How to recover deleted key material for a KMS key

If the key material for an Amazon KMS key is deleted, the KMS key is unusable and all ciphertext that was encrypted under the KMS key cannot be decrypted. This can happen if the key material for a KMS key in an Amazon CloudHSM key store is deleted from the associated Amazon CloudHSM cluster. However, it might be possible to recover the key material.

When you create a KMS key in an Amazon CloudHSM key store, Amazon KMS logs into the associated Amazon CloudHSM cluster and creates the key material for the KMS key. It also changes the kmsuser password to a value that only it knows and remains logged in as long as the Amazon CloudHSM key store is connected. Because only the key owner, that is, the CU who created a key, can delete the key, it is unlikely that the key will be deleted from the HSMs accidentally.

However, if the key material for a KMS key is deleted from the HSMs in a cluster, the key state of the KMS key eventually changes to Unavailable. If you attempt to use the KMS key for a cryptographic operation, the operation fails with a KMSInvalidStateException exception. Most importantly, any data that was encrypted under the KMS key cannot be decrypted.

Under certain circumstances, you can recover deleted key material by creating a cluster from a backup that contains the key material. This strategy works only when at least one backup was created while the key existed and before it was deleted.

Use the following process to recover the key material.

  1. Find a cluster backup that contains the key material. The backup must also contain all users and keys that you need to support the cluster and its encrypted data.

    Use the DescribeBackups operation to list the backups for a cluster. Then use the backup timestamp to help you select a backup. To limit the output to the cluster that is associated with the Amazon CloudHSM key store, use the Filters parameter, as shown in the following example.

    $ aws cloudhsmv2 describe-backups --filters clusterIds=<cluster ID>
    {
        "Backups": [
            {
                "ClusterId": "cluster-1a23b4cdefg",
                "BackupId": "backup-9g87f6edcba",
                "CreateTimestamp": 1536667238.328,
                "BackupState": "READY"
            },
            ...
        ]
    }

  2. Create a cluster from the selected backup. Verify that the backup contains the deleted key and other users and keys that the cluster requires.

  3. Disconnect the Amazon CloudHSM key store so you can edit its properties.

  4. Edit the cluster ID of the Amazon CloudHSM key store. Enter the cluster ID of the cluster that you created from the backup. Because the cluster shares a backup history with the original cluster, the new cluster ID should be valid.

  5. Reconnect the Amazon CloudHSM key store.
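
The CLI portions of this process might look like the following sketch. The function names are hypothetical. `list_ready_backups` only lists candidates, oldest first, because the right backup is the one taken while the key still existed, which is not necessarily the newest. Creating the cluster from the backup (step 2) is left out. A real script may also need to wait until the key store reports DISCONNECTED before updating it.

```shell
# Step 1 helper: list READY backups of a cluster, oldest first,
# as BackupId and CreateTimestamp columns.
list_ready_backups() {
    aws cloudhsmv2 describe-backups --filters "clusterIds=$1" \
        --query "sort_by(Backups[?BackupState=='READY'], &CreateTimestamp)[].[BackupId, CreateTimestamp]" \
        --output text
}

# Steps 3 to 5: point the key store at the restored cluster and reconnect.
# "$1" is the custom key store ID, "$2" is the ID of the new cluster.
switch_cluster() {
    aws kms disconnect-custom-key-store --custom-key-store-id "$1" || return 1
    aws kms update-custom-key-store --custom-key-store-id "$1" \
        --cloud-hsm-cluster-id "$2" || return 1
    aws kms connect-custom-key-store --custom-key-store-id "$1"
}
```

For example, after restoring a cluster, `switch_cluster cks-1234567890abcdef0 "$NEW_CLUSTER_ID"` runs the last three steps in order and stops at the first failure. `NEW_CLUSTER_ID` is a placeholder variable.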

How to log in as kmsuser

To create and manage the key material in the Amazon CloudHSM cluster for your Amazon CloudHSM key store, Amazon KMS uses the kmsuser crypto user (CU) account. You create the kmsuser CU account in your cluster and provide its password to Amazon KMS when you create your Amazon CloudHSM key store.

In general, Amazon KMS manages the kmsuser account. However, for some tasks, you need to disconnect the Amazon CloudHSM key store, log into the cluster as the kmsuser CU, and use the cloudhsm_mgmt_util and key_mgmt_util command line tools.

Note

While a custom key store is disconnected, all attempts to create KMS keys in the custom key store or to use existing KMS keys in cryptographic operations will fail. This action can prevent users from storing and accessing sensitive data.

This topic explains how to disconnect your Amazon CloudHSM key store and log in as kmsuser, run the Amazon CloudHSM command line tool, and log out and reconnect your Amazon CloudHSM key store.

How to disconnect and log in

Use the following procedure each time you need to log into an associated cluster as the kmsuser CU.

  1. Disconnect the Amazon CloudHSM key store, if it is not already disconnected. You can use the Amazon KMS console or Amazon KMS API.

    While your Amazon CloudHSM key store is connected, Amazon KMS is logged in as the kmsuser. This prevents you from logging in as kmsuser or changing the kmsuser password.

    For example, this command uses DisconnectCustomKeyStore to disconnect an example key store. Replace the example Amazon CloudHSM key store ID with a valid one.

    $ aws kms disconnect-custom-key-store --custom-key-store-id cks-1234567890abcdef0

  2. Start cloudhsm_mgmt_util. Use the procedure described in the Prepare to run cloudhsm_mgmt_util section of the Amazon CloudHSM User Guide.

  3. Log into cloudhsm_mgmt_util on the Amazon CloudHSM cluster as a crypto officer (CO).

    For example, this command logs in as a CO named admin. Replace the example CO user name and password with valid values.

    aws-cloudhsm>loginHSM CO admin <password>
    loginHSM success on server 0(10.0.2.9)
    loginHSM success on server 1(10.0.3.11)
    loginHSM success on server 2(10.0.1.12)

  4. Use the changePswd command to change the password of the kmsuser account to one that you know. (Amazon KMS rotates the password when you connect your Amazon CloudHSM key store.) The password must consist of 7-32 alphanumeric characters. It is case-sensitive and cannot contain any special characters.

    For example, this command changes the kmsuser password to tempPassword.

    aws-cloudhsm>changePswd CU kmsuser tempPassword

    *************************CAUTION********************************
    This is a CRITICAL operation, should be done on all nodes in the
    cluster. Cav server does NOT synchronize these changes with the
    nodes on which this operation is not executed or failed, please
    ensure this operation is executed on all nodes in the cluster.
    ****************************************************************

    Do you want to continue(y/n)?y
    Changing password for kmsuser(CU) on 3 nodes

  5. Log into key_mgmt_util or cloudhsm_mgmt_util as kmsuser using the password that you set. For detailed instructions, see Getting Started with cloudhsm_mgmt_util and Getting Started with key_mgmt_util. The tool that you use depends on your task.

    For example, this command logs into key_mgmt_util.

    Command: loginHSM -u CU -s kmsuser -p tempPassword

        Cfm3LoginHSM returned: 0x00 : HSM Return: SUCCESS

        Cluster Error Status
        Node id 0 and err state 0x00000000 : HSM Return: SUCCESS
        Node id 1 and err state 0x00000000 : HSM Return: SUCCESS
        Node id 2 and err state 0x00000000 : HSM Return: SUCCESS

How to log out and reconnect

  1. Perform the task, then log out of the command line tool. If you do not log out, attempts to reconnect your Amazon CloudHSM key store will fail.

    Command: logoutHSM

        Cfm3LogoutHSM returned: 0x00 : HSM Return: SUCCESS

        Cluster Error Status
        Node id 0 and err state 0x00000000 : HSM Return: SUCCESS
        Node id 1 and err state 0x00000000 : HSM Return: SUCCESS

  2. Edit the kmsuser password setting for the custom key store.

    This tells Amazon KMS the current password for kmsuser in the cluster. If you omit this step, Amazon KMS will not be able to log into the cluster as kmsuser, and all attempts to reconnect your custom key store will fail. You can use the Amazon KMS console or the KeyStorePassword parameter of the UpdateCustomKeyStore operation.

    For example, this command tells Amazon KMS that the current password is tempPassword. Replace the example password with the actual one.

    $ aws kms update-custom-key-store --custom-key-store-id cks-1234567890abcdef0 --key-store-password tempPassword

  3. Reconnect the Amazon CloudHSM key store to its Amazon CloudHSM cluster. Replace the example Amazon CloudHSM key store ID with a valid one. During the connection process, Amazon KMS changes the kmsuser password to a value that only it knows.

    The ConnectCustomKeyStore operation returns quickly, but the connection process can take an extended period of time. The initial response does not indicate the success of the connection process.

    $ aws kms connect-custom-key-store --custom-key-store-id cks-1234567890abcdef0

  4. Use the DescribeCustomKeyStores operation to verify that the Amazon CloudHSM key store is connected. Replace the example Amazon CloudHSM key store ID with a valid one.

    In this example, the connection state field shows that the Amazon CloudHSM key store is now connected.

    $ aws kms describe-custom-key-stores --custom-key-store-id cks-1234567890abcdef0
    {
       "CustomKeyStores": [
          {
             "CustomKeyStoreId": "cks-1234567890abcdef0",
             "CustomKeyStoreName": "ExampleKeyStore",
             "CloudHsmClusterId": "cluster-1a23b4cdefg",
             "TrustAnchorCertificate": "<certificate string appears here>",
             "CreationDate": "1.499288695918E9",
             "ConnectionState": "CONNECTED"
          }
       ]
    }
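
Because ConnectCustomKeyStore returns before the connection process finishes, a script might poll DescribeCustomKeyStores until the state settles. The following is a minimal sketch under that assumption. `wait_for_connected` is a hypothetical helper, and the poll count and interval are arbitrary values that you would tune for your environment.

```shell
# Poll until the key store reports CONNECTED. Fails on FAILED or on timeout.
# Usage: wait_for_connected KEY_STORE_ID MAX_POLLS SECONDS_BETWEEN_POLLS
wait_for_connected() {
    wc_poll=0
    while [ "$wc_poll" -lt "$2" ]; do
        wc_state=$(aws kms describe-custom-key-stores \
            --custom-key-store-id "$1" \
            --query 'CustomKeyStores[0].ConnectionState' --output text) || return 1
        case "$wc_state" in
            CONNECTED) return 0 ;;
            FAILED) echo "connection failed" >&2; return 1 ;;
        esac
        # Any other state (for example CONNECTING) means keep waiting.
        wc_poll=$((wc_poll + 1))
        sleep "$3"
    done
    echo "timed out in state $wc_state" >&2
    return 1
}
```

For example, `wait_for_connected cks-1234567890abcdef0 40 30` checks every 30 seconds for up to 20 minutes. If it reports a failure, the ConnectionErrorCode in the DescribeCustomKeyStores response indicates the cause.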