How to retrieve logs from a CPM AWS Worker instance

Linux and AWS knowledge are required.
Please read the entire KB before starting.

N2WS uses temporary EC2 worker instances for several operations (copy to S3, FLR, etc.). If a worker fails before it can communicate with the main server, its logs are not copied to the main server automatically. In such cases, support may request the logs directly from the EC2 worker instance.

These logs can be collected in two ways:
  1. Enable termination protection on a running EC2 worker instance and SSH into it.
  2. Configure the server to retain a failed worker, or the root volume of a failed worker.



Option 1: Setting termination protection

If the issue is easy to reproduce, you can run the scenario, enable termination protection on the worker instance, connect to it, and collect the logs manually.

Step 1: Verify keyPair is configured

Go to N2WS UI -> Worker Configuration, then make sure that the configuration for the relevant worker has an AWS SSH key pair configured.


If no key pair is configured, select the relevant configuration, click "Edit", and select a key pair.

Step 2: re-run scenario and enable termination protection

Rerun the scenario (worker test, Copy to S3, or File Level Recovery). When the worker is launched, enable termination protection on it from the AWS EC2 console.


Wait for the scenario to fail and continue to next step.
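If you prefer the AWS CLI over the console, termination protection can be enabled as sketched below. The instance ID is a placeholder; replace it with the actual worker ID shown in the EC2 console.

```shell
# Placeholder worker instance ID -- replace with the real one from the EC2 console
WORKER_ID="i-1234567890abcdef"

# Enable termination protection so the failed worker is not terminated automatically
aws ec2 modify-instance-attribute \
    --instance-id "$WORKER_ID" \
    --disable-api-termination
```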

Step 3: Connect to the worker instance & Collect logs

Connect to the worker instance with the username ubuntu and the key pair you configured. You can use software such as MobaXterm, which also makes it easy to copy files out of the instance.
Collect the following from the instance:
  1. the entire folder: /var/log/cpm/
  2. file: /var/log/cloud-init-output.log
  3. file: /var/log/syslog.log
  4. file: /var/log/cloud-init.log
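If you connect with plain SSH rather than MobaXterm, the four items above can be copied out with scp. This is a sketch; the key file path and the worker's address are placeholders.

```shell
KEY=~/.ssh/worker-keypair.pem   # placeholder: the key pair you configured
HOST=ubuntu@203.0.113.10        # placeholder: the worker's IP or DNS name

mkdir -p worker-logs
# The entire CPM log folder
scp -i "$KEY" -r "$HOST":/var/log/cpm/ worker-logs/
# Individual log files
scp -i "$KEY" "$HOST":/var/log/cloud-init-output.log \
              "$HOST":/var/log/syslog.log \
              "$HOST":/var/log/cloud-init.log worker-logs/
```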

Step 4: Cleanup

  1. Disable the termination protection!
  2. Terminate the worker
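The same cleanup can be done from the AWS CLI (the instance ID is a placeholder):

```shell
WORKER_ID="i-1234567890abcdef"  # placeholder: the worker you protected

# 1. Disable termination protection
aws ec2 modify-instance-attribute \
    --instance-id "$WORKER_ID" \
    --no-disable-api-termination

# 2. Terminate the worker
aws ec2 terminate-instances --instance-ids "$WORKER_ID"
```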


Option 2: Keep the worker or its root volume

If the issue happens intermittently, or you have many instances and are not sure which worker is the relevant one, you can configure CPM to keep either the entire worker or just its root volume.

Step 1: Verify keyPair is configured

Go to N2WS UI -> Worker Configuration, then make sure that the configuration for the relevant worker has an AWS SSH key pair configured.


If no key pair is configured, select the relevant configuration, click "Edit", and select a key pair.

Step 2: Configure keeping the temp resources

Connect to the N2WS server via SSH (the user is cpmuser), and do the following:

1. Edit or create this file: /cpmdata/conf/cpmserver.cfg
2. In the [backup_copy] section, add one of the following options:
   keep_failed_proxy - this option tells the server to keep the entire worker instance
   keep_root_volume_of_failed_proxy - this option detaches and keeps only the root volume of the worker instance
   For example:
   [backup_copy]
   keep_failed_proxy=True
3. When no backup is running, restart the Apache server:
   sudo service apache2 restart
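The edit above can also be done in a single SSH session. The sketch below appends the option to cpmserver.cfg; if the file already contains a [backup_copy] section, add the key under the existing section instead of appending a duplicate one.

```shell
# Run on the N2WS server as cpmuser
sudo tee -a /cpmdata/conf/cpmserver.cfg > /dev/null <<'EOF'
[backup_copy]
keep_failed_proxy=True
EOF

# Restart Apache only while no backup is running
sudo service apache2 restart
```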

Step 3: Re-run the scenario or wait for the issue to happen again, then collect logs

Re-run the scenario or wait for the issue to happen again. Then check the relevant region for the leftover worker instance or root volume, or check the CPM logs for an info message with the details, for example:
Info - Root volume of failed worker i-1234567890abcdef (vol-1234567890abcdef) has been retained
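To locate a retained root volume from the AWS CLI, you can list unattached volumes in the region; a retained root volume will be in the 'available' state. The region below is an example, and you should cross-check the volume IDs against the CPM log message.

```shell
REGION=us-east-1   # example -- use the region the worker ran in

# List unattached (available) volumes with their IDs, creation times, and sizes
aws ec2 describe-volumes --region "$REGION" \
    --filters Name=status,Values=available \
    --query 'Volumes[].{Id:VolumeId,Created:CreateTime,Size:Size}'
```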

Step 4: Collect logs.

If you configured CPM to keep the entire worker instance, start it, then connect to it with the username ubuntu and the key pair you configured.
You can use software such as MobaXterm, which also makes it easy to copy files out of the instance.

Then collect the following from the instance:
  1. the entire folder: /var/log/cpm/
  2. file: /var/log/cloud-init-output.log
  3. file: /var/log/syslog.log
  4. file: /var/log/cloud-init.log

If you configured CPM to keep only the root volume, create a Linux instance, attach and mount the volume to that instance, and collect the logs listed above.
This AWS article explains how to mount a volume to an instance: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-using-volumes.html
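A sketch of attaching and mounting the retained root volume, assuming a helper Linux instance in the same Availability Zone as the volume. The IDs are placeholders, and the device/partition name varies by instance type (e.g. /dev/xvdf1 or /dev/nvme1n1p1), so check lsblk before mounting.

```shell
VOLUME_ID="vol-1234567890abcdef"   # the volume ID from the CPM log message
HELPER_ID="i-0fedcba0987654321"    # placeholder: your helper Linux instance

# Attach the retained root volume to the helper instance
aws ec2 attach-volume --volume-id "$VOLUME_ID" \
    --instance-id "$HELPER_ID" --device /dev/sdf

# On the helper instance: identify the device, then mount it read-only
lsblk
sudo mkdir -p /mnt/worker-root
sudo mount -o ro /dev/xvdf1 /mnt/worker-root   # device name may differ
ls /mnt/worker-root/var/log/cpm/
```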

Step 5: Cleanup

  1. After you have finished collecting logs, don't forget to delete the worker instance or root volume.
  2. Disable the configuration added in Step 2: connect to the N2WS server via SSH, edit cpmserver.cfg, and set the value to False.
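Cleanup of the retained AWS resources might look like this from the CLI (placeholder IDs; delete only whichever resource CPM kept):

```shell
# If the entire worker was kept:
aws ec2 terminate-instances --instance-ids i-1234567890abcdef

# If only the root volume was kept (unmount and detach it first if you mounted it):
aws ec2 delete-volume --volume-id vol-1234567890abcdef
```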

  


    • Related Articles

    • How To Test Connectivity from a CPM Worker to AWS endpoints
    • Troubleshooting File Level Recovery (FLR) communication issue
    • "Worker did not establish connection" and "worker did not complete initializing" errors during S3 and FLR
    • How to change instance type of S3 worker instances CPM
    • How to test the worker configuration from UI