Troubleshooting File Level Recovery (FLR) 3.0

Background:
File-level recovery requires N2WS to recover volumes in the background and attach them to a ‘worker’ instance launched for the operation.
The worker is launched in the same account and region as the snapshots being explored, using a predefined worker configuration.

In addition to this guide, you can find more information about this functionality in Chapter 13 of our User Guide: Documentation

CPM Configuration

Make sure that a worker is configured for each region and account in which you are going to explore a snapshot.
The worker configuration can be accessed from the UI:



The AWS key pair that you define here allows you to log in to the worker if additional troubleshooting or logs are needed.
The other parameters tell N2WS where to place the worker instance.

AWS Configuration

The File Level Recovery worker needs to be able to connect to the CPM server over ports 22 (SSH) and 443 (HTTPS).
This means that your AWS settings must allow this communication to flow:
  1. Make sure that these ports are open outbound in the AWS security group of the worker.
  2. Make sure that these ports are also open inbound in the AWS security group of the CPM server.
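
The two rules above can be sketched with the AWS CLI. This is a minimal sketch, not an official N2WS procedure: the security group IDs are hypothetical placeholders, and it assumes the rules reference the peer security group rather than a CIDR range. The function only prints the commands so that they can be reviewed before being run.

```shell
#!/usr/bin/env bash
# Hypothetical placeholder IDs; substitute your own security groups.
WORKER_SG="${WORKER_SG:-sg-0123456789abcdef0}"
CPM_SG="${CPM_SG:-sg-0fedcba9876543210}"

# Print (without running) the AWS CLI calls that open TCP 22 and 443
# outbound on the worker group and inbound on the CPM server group.
# Assumption: each rule points at the peer security group, not a CIDR.
print_flr_sg_rules() {
  local worker_sg="$1" cpm_sg="$2" port
  for port in 22 443; do
    echo "aws ec2 authorize-security-group-egress --group-id ${worker_sg} --ip-permissions 'IpProtocol=tcp,FromPort=${port},ToPort=${port},UserIdGroupPairs=[{GroupId=${cpm_sg}}]'"
    echo "aws ec2 authorize-security-group-ingress --group-id ${cpm_sg} --ip-permissions 'IpProtocol=tcp,FromPort=${port},ToPort=${port},UserIdGroupPairs=[{GroupId=${worker_sg}}]'"
  done
}
```

Running print_flr_sg_rules "$WORKER_SG" "$CPM_SG" lists four commands (egress and ingress for each port). If the worker's security group already allows all outbound traffic, only the two ingress rules on the CPM server group should be needed.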

Testing

Once you have finished the configuration, you can test the worker configuration and communication.
See this KB article, which explains the steps to test a CPM worker configuration:
  1. Link: How to test the worker configuration for CPM 3.0

Troubleshooting

1. Communication 
You can also test connectivity from the worker by connecting to it over SSH (using the "ubuntu" username) and running these commands:

For port 22: ssh cpm_server_ip_address
The result should be "cpmuser@cpm_server_ip_address: Permission denied (publickey)."
If the connection is blocked, you will receive (after a delay) "ssh: connect to host cpm_server_ip_address port 22: Connection timed out"
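
As a rough aid for reading the SSH result, the sketch below maps the two messages quoted above to a one-word verdict. The function name classify_ssh_output is hypothetical and not part of N2WS; any message other than the two above is reported as "unknown" rather than guessed at.

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify the text printed by the ssh test.
classify_ssh_output() {
  case "$1" in
    *"Permission denied"*)    echo "reachable" ;;  # server answered, so port 22 is open
    *"Connection timed out"*) echo "blocked" ;;    # no answer, so traffic is being dropped
    *)                        echo "unknown" ;;    # anything else needs a closer look
  esac
}
```

For example: classify_ssh_output "$(ssh -o ConnectTimeout=10 cpm_server_ip_address 2>&1)". The ConnectTimeout option only shortens the wait when the port is blocked.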

For port 443:
 wget --no-check-certificate https://cpm_server_ip_address/
This command should result in status 302 (redirecting to "/signin/") followed by status 200.

***Note: to be able to log in to the worker instance over SSH, make sure that the workers are configured to use a key pair in the worker configuration.
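
If you only need to know whether the two ports are reachable at all, both checks can be reduced to a plain TCP connection test. This is a minimal sketch, assuming bash (for its /dev/tcp redirection) and the coreutils timeout command are available on the worker. The function name check_port is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: try to open a TCP connection to host:port,
# giving up after 5 seconds. Returns 0 if the connection succeeds.
check_port() {
  local host="$1" port="$2"
  if timeout 5 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; then
    echo "port ${port}: reachable"
  else
    echo "port ${port}: blocked or timed out"
    return 1
  fi
}
```

Run check_port cpm_server_ip_address 22 and check_port cpm_server_ip_address 443 from the worker. A failure on either port points at the security group rules described under AWS Configuration, or at network ACLs and routing between the worker and the CPM server.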

2. FLR and ELB
An ELB can cause FLR to fail because of timeouts:
  1. Link: File Level Recovery may fail if Elastic Load Balancing (ELB) is in use

3. Instance and Volume level FLR
N2WS lets you explore either an entire instance or a specific volume.
If you are facing issues with FLR for the instance, try to explore a specific volume - this will help determine whether the issue lies with a specific volume or with the entire instance.
In addition, exploring a specific volume should be quicker, so if you know which volume you need, this can speed up the recovery.


Click on recover volume to explore a specific volume:


4. Worker cleanup
If a user starts an FLR session and then closes the browser tab instead of clicking the 'Close' button,
the worker instance might not get deleted:


In this case, go to Server Settings (cog icon) -> Cleanup tab -> 'Clear Explore Sessions' icon.


Diagram

The diagram below illustrates the setup of the worker. It is just one example, and your setup might differ based on your configuration and settings.

For exact technical details, please see the KB articles linked above and our User Guide.




Thanks for reading this guide,
N2WS Support Team.