Troubleshooting
CPM Server upgrade may fail with the "Reason: The instance ID 'i-1234567890abcdef' does not exist" and "Configuration Failed" errors
CPM Server upgrade may fail with the "Reason: The instance ID 'i-1234567890abcdef' does not exist" and "Configuration Failed" errors. This failure is caused by using the wrong AWS Access Key and Secret Key, for example keys that belong to a different account.
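One way to confirm which account a key pair belongs to is to ask AWS STS for the caller identity and compare it with the account that owns the CPM instance. The sketch below is illustrative only: the helper names keys_match_account and fetch_identity are hypothetical, and it assumes the boto3 library is installed wherever fetch_identity is called.

```python
def keys_match_account(identity: dict, expected_account_id: str) -> bool:
    """Return True when the STS identity belongs to the expected account."""
    return identity.get("Account") == expected_account_id


def fetch_identity(access_key: str, secret_key: str) -> dict:
    """Ask AWS STS which account and principal the key pair belongs to.

    Hypothetical helper; requires boto3 and network access to AWS.
    """
    import boto3  # imported lazily so the comparison helper works without it

    sts = boto3.client(
        "sts",
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
    )
    # The response is a dict that includes "Account", "Arn" and "UserId".
    return sts.get_caller_identity()
```

If keys_match_account(fetch_identity(key, secret), "<expected account ID>") returns False, the keys belong to a different account than the one that owns the instance.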
Instance recovery may fail with the "Address xxx.xx.xx.xx is in use" error
Issue: When attempting instance recovery, it may fail with the following error in the recovery log: Error - First step (launching instance) failed. Reason: Address xxx.xx.xx.xx is in use. Solution: This usually happens because you are trying to ...
CPM fails to add a new account or update an existing one with the "Failed getting account number for account 'XXXYYYZZZ' in ALL regions" and/or "Can not connect to AWS" errors
CPM may fail to add a new account or update an existing one with the "Failed getting account number for account 'XXXYYYZZZ' in ALL regions" and/or "Can not connect to AWS" errors. There are 3 possible causes: 1. The errors in cpm_server.log look like ...
All CPM operations may fail with the "Name or service not known" error
CPM operations may start failing with the following error in the logs: ERROR: cleanup_function(.\cpmagent\agent.py:2543) could not delete AMI ami-87654321 in policy PolicyName.Reason: [Errno -2] Name or service not known (code: None) ERROR: ...
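"[Errno -2] Name or service not known" is the message the operating system resolver returns when a hostname cannot be resolved. A minimal sketch for checking name resolution from the affected server is shown below. The function name can_resolve and the example endpoint are assumptions made for illustration; they are not taken from the CPM logs.

```python
import socket


def can_resolve(hostname: str, resolver=socket.getaddrinfo) -> bool:
    """Return True if the hostname resolves, False on a resolver error.

    The resolver argument exists only so the check can be exercised
    without network access; by default the system resolver is used.
    """
    try:
        resolver(hostname, 443)
    except socket.gaierror:
        # Raised for errors such as [Errno -2] Name or service not known.
        return False
    return True


if __name__ == "__main__":
    # Example endpoint chosen for illustration; substitute the region in use.
    print(can_resolve("ec2.us-east-1.amazonaws.com"))
```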
Snapshots or DR are failing with "Exception: <urlopen error [Errno 1] _ssl.c:510: error:0407006A:rsa routines:RSA_padding_check_PKCS1_type_1:block type is not 01>"
Sometimes snapshots/DR may start failing, while the local CPM Server's logs contain these repeated errors: ERROR: get_agent_tasks(.\common\agentapi_requests.py:374) failed calling url https://127.0.0.1/agentapi/agent_tasks/. Exception: <urlopen error ...
Replication of RDS snapshots may fail with "Cross region snapshot copy is not supported for TDE encrypted snapshots"
As of June 13 2007, AWS has fixed this issue: ...
Cleanup may not delete snapshots older than the defined retention for the policy
CPM removes the snapshots in accordance with defined retention policies. However, only successful backups are counted towards the retention. It is crucial to understand what a "successful backup" is. A "successful backup" is a backup that has both ...
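The effect of counting only successful backups towards retention can be illustrated with a small model. This is one reading of the rule above, not CPM's actual cleanup code: the Backup type and its successful flag are hypothetical, and what makes a backup "successful" is left opaque here.

```python
from typing import List, NamedTuple


class Backup(NamedTuple):
    name: str
    successful: bool  # hypothetical flag; the real criteria are defined by CPM


def eligible_for_cleanup(backups_newest_first: List[Backup],
                         retention: int) -> List[Backup]:
    """Return the backups older than the newest `retention` successful ones.

    Failed backups do not count towards retention, so older snapshots
    remain until enough successful backups exist after them.
    """
    kept_successes = 0
    for index, backup in enumerate(backups_newest_first):
        if kept_successes >= retention:
            return list(backups_newest_first[index:])
        if backup.successful:
            kept_successes += 1
    return []
```

In this model, with a retention of 2 and five backups of which the newest and the third newest failed, only the oldest backup becomes eligible for cleanup, even though four backups are older than the nominal retention window would suggest.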
How to increase the VSS/scripts timeout on CPM 2.1 and higher
The VSS timeout controls two parameters: 1) How long a VSS operation can take. 2) How much time can pass between VSS operations (e.g., how long a snapshot can take). In v3.x, this setting can be configured in "Policy Instance and Volume Configuration" ...
DR may delay replication of some snapshots (showing them as "pending") with "Too many snapshot copies in progress"
DR may delay replication of some snapshots (showing them as "pending") with the following message printed in the logs: "Volume DR We reached the max limit of snapshot copy requests for region US East (N. Virginia) (policy POLICY_NAME,in Backup ...
Snapshots may fail with "Reason: Maximum allowed active snapshot limit exceeded."
Snapshots may fail with "Reason: Maximum allowed active snapshot limit exceeded." This means that you have exceeded Amazon's limit of snapshots per account (10000 by default). ...
After restoring a RHEL 7.x instance, yum update stops working
After restoring a RHEL 7.x instance, yum update stops working: [root@hostname~]# yum update Loaded plugins: amazon-id, rhui-lb, search-disabled-repos epel/x86_64/metalink ...
"SnapshotQuotaExceeded" error when trying to backup RDS
Customers may receive the following error when trying to back up RDS: ERROR: get_object(connection.py:1202) 400 Bad Request ERROR: get_object(connection.py:1203) <ErrorResponse xmlns="http://rds.amazonaws.com/doc/2013-05-15/"> <Error> ...
"The system identified issues and will stop backing up" error during an upgrade
Customers may receive the following error during a CPM upgrade: The system identified issues and will stop backing up. This could be due to a downgrade of an existing server or expiration of license. CPM Server is in recovery mode and will ...
DR Fails with "Exception The parameter PresignedUrl is not recognized".
In CPM v1.8.x and 2.0.x, DR may fail with the following error: Exception The parameter PresignedUrl is not recognized. This happens due to a recent change in Amazon's APIs. To resolve this issue, you must upgrade to CPM 2.1.0 or higher. For ...
"Could not register trial, this account had been registered to a trial before" error
If, during the setup of the CPM instance, you receive the following error after step 5: Failed registering account. Reason: Could not register trial, this account had been registered to a trial before, please contact N2WS support. Please choose "This ...
DR may fail with "Error: No KMS info for region" and "failed getting kms aliases"
The following error may appear in the Backup Log: Error - Failed looking for source KMS (region US West (Oregon), snapshot snap-xxxxxxxx). Error: No KMS info for region us-west-2 (No KMS) This error may appear in cpm_server.log: ERROR: ...
RDS Backup or DR fails with Reason: cannot create more than 100 manual snapshots
Issue: Backup or DR may fail with the following errors: ERROR: run_snapshots(.\agent.py:880) (database: "dbname") snapshot failed to start. Reason: cannot create more than 100 manual snapshots ERROR: backup_function_internal(.\agent.py:1204) policy: ...
DR Fails with ERROR: start_copy_region(n2wsoftware\cpmserver\cpm\dr_ami.py:282) AMI DR copy_image failed from region "regionname1" to region "regionname2". AMI ami-xxxxxxxx, policy "policyname" (in "accountname" account). Exception You are not authorized to perform this operation.
This is caused by the "CopyImage" permission missing from your IAM policy. As a result, CPM fails to copy the image and tries again after the next backup completes.
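A quick way to check whether a policy document grants the permission is to scan its Allow statements for the EC2 action ec2:CopyImage. The sketch below is a simplified check under stated assumptions: the function name allows_action is hypothetical, and it ignores Deny statements, NotAction, Resource and Condition elements, so it only indicates whether the action is listed at all.

```python
import fnmatch


def allows_action(policy_document: dict, action: str) -> bool:
    """Return True if any Allow statement lists the action (wildcards ok)."""
    statements = policy_document.get("Statement", [])
    if isinstance(statements, dict):
        # A policy may carry a single statement instead of a list.
        statements = [statements]
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        for pattern in actions:
            # IAM action names are case-insensitive and may use * wildcards.
            if fnmatch.fnmatchcase(action.lower(), pattern.lower()):
                return True
    return False
```

For example, a policy whose only Allow statement lists "ec2:CreateSnapshot" and "ec2:CopySnapshot" returns False for "ec2:CopyImage", while adding "ec2:CopyImage" (or "ec2:*") to the Action list makes it return True.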
CPM Configuration process hangs or fails
The CPM Configuration process is simple and straightforward. If it hangs or fails, it probably means that either there was a connectivity problem (CPM could not access the Internet), which usually causes a long hang and a failure at the end, or a ...