Troubleshooting File Level Recovery (FLR) 3.2

Background:
File-level recovery requires N2WS to launch a temporary worker instance in the target region.
The worker either reads the snapshot directly, or N2WS recovers volumes in the background and attaches them to the ‘worker’ instance launched for the operation.
The worker is launched in the same account and region as the snapshots being explored, using a pre-defined worker configuration.

In addition to this guide, you can find more information about this functionality in our User Guide, Chapter 13: https://n2ws.gitbook.io/documentation/13-file-level-recovery

CPM Configuration

Make sure that a worker is configured for each region and account in which you are going to explore a snapshot.
The worker configuration can be accessed from the UI:
 

The AWS key pair that you define lets you log in to the worker if additional troubleshooting or logs are needed.
The other parameters tell N2WS where to place the worker instance and which security group to use.

Note: In 3.2, the test worker button tests only the HTTPS connection.

AWS Configuration

The File Level Recovery worker needs to be able to connect to the CPM server over ports 22 (SSH) and 443 (HTTPS).
This means that your AWS settings must allow this communication:
  1. These ports must be open outbound in the AWS security group of the worker.
  2. These ports must also be open inbound for the CPM server.
In addition, starting with 3.2, FLR uses the new EBS Direct API operations, which allow blocks to be read directly from the snapshot.
The N2WS server checks whether the API is available in the region and whether there are permissions to use it.

If it is available, the worker tries to read the files directly from the snapshot using the API, so you need to make sure that the API is accessible from the worker's VPC and security group.
If the API is not available from the server, the launched worker works as in 3.1: it creates a volume and reads the files from it.

Troubleshooting

1. Communication 

You can test connectivity from the worker by connecting to it over SSH (using the "ubuntu" username) and running these commands:

For port 22: ssh cpm_server_ip_address
The result should be "cpmuser@cpm_server_ip_address: Permission denied (publickey)."
If the connection is blocked, you will receive (after a delay) "ssh: connect to host cpm_server_ip_address port 22: Connection timed out"
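
The two outcomes above can also be told apart in a script. The following is a minimal sketch, not an N2WS tool: the function name classify_ssh_output is hypothetical, and it only matches the message fragments quoted above, plus OpenSSH's "Host key verification failed", which likewise implies that port 22 answered.

```shell
# Hypothetical helper (not part of N2WS): classify the text printed by an
# ssh attempt from the worker to the CPM server.
classify_ssh_output() {
  case "$1" in
    *"Permission denied"*|*"Host key verification failed"*)
      # The server answered on port 22; only the login itself was refused.
      echo "reachable" ;;
    *"Connection timed out"*)
      # No answer on port 22: the connection is probably blocked.
      echo "blocked" ;;
    *)
      echo "unknown" ;;
  esac
}

# Example use on the worker (replace the placeholder address):
#   out=$(ssh -o BatchMode=yes -o ConnectTimeout=10 cpm_server_ip_address true 2>&1)
#   classify_ssh_output "$out"
```

BatchMode stops ssh from prompting for input, and ConnectTimeout shortens the wait when the port is blocked.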

For port 443:
 
wget --no-check-certificate https://cpm_server_ip_address/
This command should result in status 302 (a redirect to "/signin/") followed by status 200.
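
To see only the status codes rather than the full wget log, the server responses can be filtered. This is a minimal sketch, assuming GNU wget's --server-response output format, in which each status line starts with "HTTP/"; the function name http_statuses is hypothetical.

```shell
# Hypothetical helper: read wget --server-response output on stdin and
# print the HTTP status codes it contains, in order, on a single line.
http_statuses() {
  awk '/^ *HTTP\// { printf "%s%s", sep, $2; sep = " " } END { print "" }'
}

# Example use on the worker (replace the placeholder address). wget writes
# the server response to stderr, hence the 2>&1:
#   wget --no-check-certificate --server-response -O /dev/null \
#     https://cpm_server_ip_address/ 2>&1 | http_statuses
```

Based on the expected result described above, a working connection should print 302 followed by 200.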

For EBS API:
Note: Replace us-east-1 with the relevant region, that is, the region where the snapshot to explore is located.
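
As an illustration only, and not necessarily the command this guide originally intended: in standard AWS regions the EBS direct APIs are served from a regional endpoint of the form ebs.<region>.amazonaws.com, so one way to check reachability from the worker is to request that endpoint over HTTPS. The helper name ebs_endpoint is hypothetical.

```shell
# Hypothetical helper: build the regional EBS direct API endpoint host name.
# Assumes a standard AWS region; defaults to us-east-1 when no region is given.
ebs_endpoint() {
  region="${1:-us-east-1}"
  printf 'ebs.%s.amazonaws.com\n' "$region"
}

# Example use on the worker:
#   wget --server-response --timeout=10 --tries=1 -O /dev/null \
#     "https://$(ebs_endpoint us-east-1)/"
# Any HTTP response, even an error status, shows that the endpoint answered;
# a timeout suggests it is not reachable from the worker's VPC/security group.
```

This checks network reachability only. It does not verify that the permissions needed to use the API are in place.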

Note: To be able to log in to the worker instance over SSH, make sure that the workers are configured to use a key pair in the worker configuration.

2. FLR and ELB
An ELB can cause issues with FLR due to timeouts:
  1.  Link: File Level Recovery may fail if Elastic Load Balancing (ELB) is in use

3. Instance and Volume level FLR
You have the option in N2WS to explore the entire instance or a specific volume.
If you are facing issues with FLR for the instance, try to explore a specific volume. This helps determine whether the issue lies with a specific volume or with the entire instance.
In addition, exploring a specific volume should be quicker, so if you know which volume you need, this can speed up the recovery.


Click on recover volume to explore a specific volume:



Thanks for reading this guide,
N2WS Support Team.