Troubleshooting
Copy to S3 fails after upgrade to 4.2.2
Issue: Backup of snapshots to an S3 bucket fails after upgrading from 4.1 to 4.2.2. Solution: One possible reason is a change in default behavior. In 4.2.2 we introduced the option to launch a worker with an IAM role attached to it. ...
S3 Cleanup fails with error on worker - No progress was reported - terminating operation
Issue: S3 cleanup fails and reports the worker error No progress was reported - terminating operation. This occurs in version 4.2.2 when the worker reaches its 1-hour timeout while attempting to clean up large snapshots. To fix ...
CPM in Recovery Mode error - S3 worker fails with worker i-12345678912345678 stopped communicating - terminating operation
Issue: When CPM is in recovery mode and being used for DR drills, the original CPM’s process to detect “zombie” workers can be ...
N2WS-22694 - Post upgrade recovery screen running slow
Issue: The UI for the recovery screen works very slowly or times out. Solution: A patch for v4.2.0 is available and attached to this KB. The root cause is the new custom recovery tags feature. There is a performance issue in case of huge ...
N2WS-22738 - Wrong hostname for STS China regions
Issue: Backups in China regions fail with Could not connect to endpoint URL: “https://sts.cn-northwest-1.amazonaws.com”. Solution: A patch for v4.2.0 is available and attached to this KB.
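For context on the hostname in this error: AWS service endpoints in the China regions use the `amazonaws.com.cn` DNS suffix rather than `amazonaws.com`. The sketch below illustrates that rule only. The function name `sts_endpoint` is hypothetical, it covers just the standard and China partitions, and it is not the code shipped in the N2WS patch.

```python
def sts_endpoint(region: str) -> str:
    """Return the regional STS endpoint URL for an AWS region.

    Hypothetical helper for illustration, not N2WS code. Regions whose
    name starts with "cn-" belong to the China partition and use the
    amazonaws.com.cn suffix; other regions here use amazonaws.com.
    """
    suffix = "amazonaws.com.cn" if region.startswith("cn-") else "amazonaws.com"
    return f"https://sts.{region}.{suffix}"


if __name__ == "__main__":
    # The China region from the error message, then a non-China region.
    print(sts_endpoint("cn-northwest-1"))
    print(sts_endpoint("eu-west-3"))
```

Comparing the printed hostname for `cn-northwest-1` with the one in the error message shows the missing `.cn` suffix.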
How to reset MFA or root password using Linux utility in 4.2 and above
Background: In 4.2 we added a Linux utility that comes with the server. This utility lets you disable MFA or reset the password for the root user. Notes: If you do not have SSH access to the N2WS server, then the only other way to disable MFA or reset ...
N2WS-22420 - Copy to S3 failing with local device /dev//dev/xvdf (/dev/xvdf) does not exist
Issue: Copy to S3 fails with the error local device /dev//dev/xvdf (/dev/xvdf) does not exist. Solution: A patch for v4.1.1a is available and attached to this KB.
CPM installation on Azure fails with error: Could not create disk in location us-east Reason: (Authorization Failure) The client does not have authorization to perform action Microsoft.Compute/disks/write over scope.
CPM for Azure installation may receive the error Could not create disk in location us-east Reason: (Authorization Failure) The client does not have authorization to perform action Microsoft.Compute/disks/write over scope. This is caused by not ...
Download logs fails with error 500 and apache does not start or Watchdog killed DR process
Issue: Downloading logs from the UI fails with error 500, or with the error Critical Error Watchdog Killed DR process after not responding for more than 1800 seconds, and the Apache logs show a permission error in /var/log/apache2/error.log (you will need ...
FLR or Copy to S3 with Exception: could not assume role
Issue: When performing a file-level recovery (FLR) or copy to S3 operation in the same account, N2WS might need to assume its own role to generate a token for the worker. This could lead to the error below: ERROR: ...
How to use AWS IAM Policy Simulator to troubleshoot N2WS Backup permission issues.
Background: Permission issues are one of the most common errors seen by users of N2WS Backup and this article explains how you can use the IAM Policy Simulator to help you narrow down whether permissions are allowed by an IAM User or a Role. This ...
Upgrade fails with error "Could not mount the new volume, returned: 8192"
If you get this error message during an upgrade, it is very likely that the filesystem on the CPMData volume is corrupted. Please perform these steps: 1) Connect to the N2WS EC2 instance where the upgrade failed, using user "cpmuser" and your key 2) ...
Copy of RDS to DR target account may fail with Reason code: Unable to create the resource. Verify that you have permission to create service linked role.
DR copy job may fail with the following error: Error: RDS DR copy failed for region EU Paris (any region can apply) with reason code: Unable to create the resource. Verify that you have permission to create service-linked role. Otherwise, wait and ...
AWS to Azure - Subscription dropdown is empty
Issue: When creating an Azure policy in CPM (AWS-based), the dropdown for selecting the subscription might show nothing. Solution: The issue is related to permissions: either the role doesn't have the correct permissions or it was not linked to the app ...
CPM Crashed/Unavailable
Issue: CPM UI is unavailable or CPM Server instance status check is not 2/2 Workarounds and solutions: 1. Make sure that you are using HTTPS in the URL when trying to open N2WS. 2. The most common issue for UI being unavailable is resource issues ...
Troubleshooting common Cost Explorer issues
This document goes through the steps you can take to resolve common issues related to the CPM Cost Explorer feature. 1. Required Permissions: Make sure that you have updated the CPM instance role and all users associated with CPM with the latest ...
Upgrade may fail with the "Failed labeling and mounting after 10 retries, giving up" and "Could not mount the new volume, returned: 256" errors
Upgrade using AMI may fail with the following error in the file config_shell_commands.log: 2021-02-08 18:07:06,610: perform_shell_commands(:339) Could not mount the new volume, returned: 256 2021-02-08 18:07:06,610: perform_shell_commands(:349) ...
EBS/RDS DR & Recovery with KMS key
When copying an EBS/RDS volume encrypted with a custom KMS key across accounts, a KMS key should also be available in the other account. CPM checks for the KMS key in two ways: Alias and Tag. KMS Tag: When using a custom tag, you are telling CPM to ...
N2WS 3.1b/3.2.x - Very slow copy to S3 Bucket
Issue summary: The copy to S3 operation might run very slowly due to EBS API timeouts. See the troubleshooting steps and additional performance suggestions below. Troubleshooting: The most common issue with copy to S3 speed is timeout to the ...
Troubleshooting File Level Recovery (FLR) 3.2
Background: File-level recovery requires N2WS to launch a temporary worker instance in the target region. The worker will read the snapshot directly or recover volumes in the background and attach them to a ‘worker’ instance launched for the operation. ...
CPM 3.1 How to configure N2WS to resolve partially Successful backups
Info: This KB will show how to resolve common issues that cause "partially Successful backups" status Solutions: A backup may be marked "Partially Successful" due to the "Instance probably doesn't exist anymore" error: Error - (instance: ...
N2WS 3.1.x - Warning Could not process CBT for volume snapshots
Issue summary: Running a backup of EBS volumes on version 3.1 might raise the warning Could not process CBT for volume snapshots. Issue description and troubleshooting: This warning can be caused by permission issues, communication issues, or the lack of an EBS Direct API endpoint in the target ...
N2WS 3.1.x - Warning 'Error verifying access to EBS API in region: us-east-1. CPM will not use read from snapshot for instances in this region'
Issue summary: Running copy to S3 on version 3.1 might raise the warning quoted in the title. Issue description and troubleshooting: This warning can be caused by permission issues, communication issues, or the lack of an EBS Direct API endpoint in the target region. In ...
N2WS 3.1.x - worker reported error: failed processing segment 0 : Failed to load blocks meta-data
Issue summary: Running copy to S3 on version 3.1 might fail with the error failed processing segment 0 : Failed to load blocks meta-data. Issue description and troubleshooting: This error is usually caused by communication issues with the EBS Direct API endpoint. In version 3.1 we have started to use ...
Cross region copy of RDS snapshots may fail with error: RDS DR copy snapshot failed (in Backup account). No matching KMS alias in target region
Issue: The following error may appear in the CPM Server logs: ERROR: start_copy_region(dr_rds.py:301) RDS DR copy snapshot failed (in Backup account). No matching KMS alias on target region (source EU (Frankfurt), target Asia Pacific (Singapore), RDS ...
How to test the worker configuration from UI
Background: This document explains the steps to test a CPM Worker configuration. Testing can help reduce errors during S3 copy and file-level restores by confirming that the settings used for these jobs can connect successfully. Worker ...
What steps can you take to mitigate crashes due to performance issues on the CPM Instance.
Background: The following steps can help customers mitigate performance issues such as excessive memory or CPU usage, or crashes. This KB article assumes that the CPM instance is sized correctly and has the correct volume type as per the Recommended ...
Volume Snapshot may fail with error "The maximum per volume CreateSnapshot request rate has been exceeded"
Volume Snapshots may fail with the following errors in the log files. backup log 02-12-20 11:07:22 - Error - create snapshot of instance volume vol-xxxxxxxxxxxxx failed. 02-12-19 11:07:22 - Error - snapshot of instance ec2-xxxxxxxxx-xxxx-xx, volume ...
Troubleshooting File Level Recovery (FLR) communication issue
CPM Configuration: File-level recovery requires N2WS to recover volumes in the background and attach them to a temporary EC2 ‘worker’ launched for the operation. The worker will be launched in the same account and region as the snapshots being ...
EFS - cross account tag scan may fail with Error: Invalid IAM role ARN
When adding EFS to a policy via a tag, it may fail with one of the following errors: (tag for example: efstesting+vault=n2ws+exp_opt=D+exp_opt_val=30+role_arn=arn:aws:iam::12345678:role/CPM) Critical Error - Can't update EFS to backup targets. Error: ...
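The excerpt above is truncated, so the actual cause and fix are not shown here. As a first check when this error appears, it can help to confirm that the `role_arn` value in the tag is well formed. The sketch below is a minimal Python illustration. It assumes the tag value is a `+`-separated list of `key=value` options after a leading token, as in the example tag, and the helper names are hypothetical rather than part of N2WS.

```python
import re

# General shape of an IAM role ARN: partition, 12-digit account ID, role path/name.
ROLE_ARN_RE = re.compile(
    r"^arn:(aws|aws-cn|aws-us-gov):iam::\d{12}:role/[\w+=,.@/-]+$"
)


def parse_tag_value(value: str):
    """Split a tag value into its leading token and a dict of key=value options.

    Assumption: options are separated by '+', as in the example tag above.
    A role name that itself contains '+' would need different handling.
    """
    head, *pairs = value.split("+")
    options = {}
    for pair in pairs:
        key, _, val = pair.partition("=")
        options[key] = val
    return head, options


def role_arn_is_well_formed(arn: str) -> bool:
    """Return True if the string has the general shape of an IAM role ARN."""
    return ROLE_ARN_RE.match(arn) is not None


if __name__ == "__main__":
    tag = ("efstesting+vault=n2ws+exp_opt=D+exp_opt_val=30"
           "+role_arn=arn:aws:iam::12345678:role/CPM")
    head, options = parse_tag_value(tag)
    print(head, options)
    print(role_arn_is_well_formed(options["role_arn"]))
```

Note that the placeholder account ID in the example tag has only 8 digits, so this check reports it as malformed; a real AWS account ID has 12 digits.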
EFS backup may fail with the following error: Error - failed starting EFS fs-1234abcd backup (vault: Default, iam role: AWSBackupDefaultServiceRole), policy Dailyjob_Policy. Exception: IAM Role arn:aws:iam::############:role/service-role/AWSBackupDefaultServiceRole cannot be assumed by AWS Backup
Problem: EFS backups may fail with the following error: Error - failed starting EFS fs-1234abcd backup (vault: Default, iam role: AWSBackupDefaultServiceRole), policy Dailyjob_Policy. Exception: IAM Role ...
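The excerpt is truncated, so the article's own resolution is not shown here. In general, AWS Backup can only assume a role whose trust policy allows the `backup.amazonaws.com` service principal to call `sts:AssumeRole`. The sketch below checks a trust policy document for that statement. The function name `trusts_aws_backup` is hypothetical, and the policy is assumed to be already loaded as a Python dict, for example from the JSON shown in the IAM console.

```python
def _as_list(value):
    """Normalize a policy field that may be a single item or a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def trusts_aws_backup(trust_policy: dict) -> bool:
    """Return True if the trust policy lets AWS Backup assume the role.

    Hypothetical helper for illustration: looks for an Allow statement
    with action sts:AssumeRole and service principal backup.amazonaws.com.
    """
    for stmt in _as_list(trust_policy.get("Statement")):
        if stmt.get("Effect") != "Allow":
            continue
        if "sts:AssumeRole" not in _as_list(stmt.get("Action")):
            continue
        principal = stmt.get("Principal")
        if not isinstance(principal, dict):
            continue
        if "backup.amazonaws.com" in _as_list(principal.get("Service")):
            return True
    return False


if __name__ == "__main__":
    policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": "backup.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }],
    }
    print(trusts_aws_backup(policy))
```

If the check returns False for the role named in the error, the trust relationship is a reasonable place to start looking, although a missing role produces a similar message.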
Installation or upgrade of CPM may fail with the error "Can't connect to AWS: Wrong credentials or connectivity issue: "AccessKeyID""
Installation or upgrade of CPM may fail with the error "Can't connect to AWS: Wrong credentials or connectivity issue: "AccessKeyID"" After ensuring that the JSON permissions files have been updated, confirm that you have connectivity to the AWS ...
S3 Backup may fail after upgrade to v2.6 with the error "Error - Workers could not be launched"
S3 Backup may fail after upgrade to v2.6 with the error "Error - Workers could not be launched". This issue may happen if subnet is set to "Any" in the worker configuration. The following errors may be logged in the Backup Log: Info - Starting copy ...
S3 Cleanup may fail after upgrade to v2.6.0 if there is no worker configuration for the account/region in which S3 Repository resides
S3 Cleanup may fail after upgrade to v2.6.0 if there is no worker configuration for the account/region in which the S3 Repository resides. The following errors may be printed in the S3 Cleanup Log: Policy 'PolicyName' (Wed 07/10/2019 09:36:23 AM) Info All ...
Tag scan may fail after upgrade to v2.6.0 with the "KeyError: 'config_args'" error, when a resource is tagged with the "no-backup" tag value
Tag scan may fail after upgrade to v2.6.0 with the "KeyError: 'config_args'" error when a resource is tagged with the "no-backup" tag value. The following errors may be printed in the CPM Agent logs: ERROR: scan_tags_request(agentapi_requests.py:415) ...
Installation/upgrade to v2.6.0 using the AMI method may fail when using proxy with error 500 displayed
Installation/upgrade to v2.6.0 using the AMI method may fail when using proxy with error 500 displayed: The following errors may be printed in the cpm_config.log file: ERROR: list_aliases(kms.py:42) Failed getting aliases list (region: us-east-1, ...
Upgrade using AMI to v2.6.0 from earlier versions may fail with the Configuration Wizard timing out and "Could not upgrade mysql"
Upgrade using AMI to v2.6.0 from earlier versions may fail with the Configuration Wizard timing out: The following error may be printed in the "config_shell_commands.log" file: upgrade_mysql_db(:60) Could not run mysql5.6 on docker, error=409 Client ...
Upgrade using AMI to v2.6.0 from earlier versions may fail with "Configuration Failed" and "Could not migrate database to version 2.6.0 scheme"
Upgrade using AMI to v2.6.0 from earlier versions may fail with "Configuration Failed". The following error may be printed in the "config_shell_commands.log" file: perform_shell_commands(:629) Could not migrate database to version 2.6.0 scheme, ...
S3 operation may be skipped repeatedly due to the “Copy operation is currently in progress” error
S3 operation may be skipped repeatedly due to the “Copy operation is currently in progress” error, even if there are no S3 operations in progress for the affected policy. The following message may be logged in the Backup Log: Warning - Copy ...
S3 backups may be skipped and S3 Cleanup may fail due to a locked policy
Customers may experience problems with S3 operations leaving backup chains locked, preventing further operations on that chain. These issues are especially prevalent in v2.4.x. Please upgrade to the latest version as soon as possible to significantly reduce ...