Retrieve support logging from a failed S3 worker root volume

How to retrieve support logging from a CPM Worker Machine


In cases where an S3 copy is failing, support may request logging from the CPM S3 worker appliance.

This logging can be collected from the retained root volume of a failed worker or by setting termination protection on a running CPM worker instance.

Performing these steps will require attaching and mounting the worker volume to the CPM server or other Linux server for log retrieval.



Step 1: Determine if the worker root volume has been retained:

When the S3 copy task fails with the message below, the worker root volume has been retained automatically and can be used for collecting logs. In that case, go to Step 2a.

  • Error - Backup copy failed
  • Info - Root volume of failed worker i-1234567890abcdef (vol-1234567890abcdef) has been retained
The retained worker volume will be automatically deleted once there is a successful S3 copy
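
The volume ID in this message is the one needed in Step 2a. As a minimal sketch, assuming the log line has been copied from the backup log into a shell variable (the variable names here are illustrative and not part of CPM), both IDs can be pulled out with sed:

```shell
# Sample line taken from this article; a real backup log will contain different IDs.
log_line='Info - Root volume of failed worker i-1234567890abcdef (vol-1234567890abcdef) has been retained'

# Pattern shared by both extractions: "worker <instance ID> (<volume ID>) has been retained".
pattern='.*worker (i-[0-9a-f]+) \((vol-[0-9a-f]+)\) has been retained.*'

# Capture group 1 is the worker instance ID, group 2 is the retained volume ID.
instance_id=$(printf '%s\n' "$log_line" | sed -nE "s/$pattern/\1/p")
volume_id=$(printf '%s\n' "$log_line" | sed -nE "s/$pattern/\2/p")

echo "Worker instance: $instance_id"
echo "Retained volume: $volume_id"
```

If the line does not match the expected wording, both variables end up empty, which is a hint to re-check the log rather than search EBS for a wrong ID.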

If this message does not appear, or the copy task has not yet failed:

With the CPMWorkerMachine instance running, enable "termination protection" on the instance in AWS, then wait for the S3 copy task to fail.
(If the S3 copy fails without producing the above error message, run the copy task again, then enable termination protection)



NOTE: When termination protection is enabled, the worker will not stop running or terminate automatically. It will need to be manually terminated after logs are collected.
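
Termination protection can also be set from the command line instead of the AWS console. The following is a sketch only, assuming the AWS CLI is installed and configured with credentials for the worker's region; the function names are hypothetical helpers, not CPM commands:

```shell
# Hypothetical helpers; the instance ID is supplied by the caller.
# Assumes a configured AWS CLI with permission to modify the worker instance.

enable_termination_protection() {
    # --disable-api-termination turns termination protection ON.
    aws ec2 modify-instance-attribute --instance-id "$1" --disable-api-termination
}

show_termination_protection() {
    # Prints the current value of the disableApiTermination attribute.
    aws ec2 describe-instance-attribute --instance-id "$1" \
        --attribute disableApiTermination \
        --query 'DisableApiTermination.Value' --output text
}

# Usage (placeholder instance ID from this article):
#   enable_termination_protection i-1234567890abcdef
#   show_termination_protection i-1234567890abcdef
```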

Once the copy task has failed, go to Step 2b.



Step 2a: Attach the failed worker root volume to an instance:

If the root volume has been retained from the failure, first find the detached worker volume in EBS. The volume ID can be found in the backup log.
If the volume cannot be found, double-check that the session in Backup Manager shows "root volume has been retained" and that the volume ID was not mistyped in AWS.

Next, attach this volume to the CPM server instance or other Linux instance.

            
If the CPM server instance cannot be found when attaching the volume, check that the region and availability zone of the worker volume match those of the CPM server.
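
The availability zone check and the attachment can be scripted with the AWS CLI. This is a sketch under assumptions: the CLI is configured for the correct region, the helper names are hypothetical, and /dev/sdg is only an assumed free device name on the target instance.

```shell
# Hypothetical helpers; assumes a configured AWS CLI.

volume_az() {
    aws ec2 describe-volumes --volume-ids "$1" \
        --query 'Volumes[0].AvailabilityZone' --output text
}

instance_az() {
    aws ec2 describe-instances --instance-ids "$1" \
        --query 'Reservations[0].Instances[0].Placement.AvailabilityZone' --output text
}

# attach_worker_volume <volume-id> <instance-id> [device]
# Refuses to attach when the volume and instance are in different availability zones.
attach_worker_volume() {
    local vol_az inst_az
    vol_az=$(volume_az "$1") || return 1
    inst_az=$(instance_az "$2") || return 1
    if [ "$vol_az" != "$inst_az" ]; then
        echo "AZ mismatch: volume is in $vol_az, instance is in $inst_az" >&2
        return 1
    fi
    # /dev/sdg is an assumption; pass a third argument to use another device name.
    aws ec2 attach-volume --volume-id "$1" --instance-id "$2" --device "${3:-/dev/sdg}"
}

# Usage (placeholder IDs):
#   attach_worker_volume vol-1234567890abcdef i-0fedcba9876543210
```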

Once attached, go to step 3.
 

Step 2b: Create a copy of the termination-protected worker's volume and attach it:

In most cases this should only be done after the S3 copy shows as failed in the CPM console.


      1. Find the worker appliance in AWS and click its Root-device from the description, then click the EBS ID.

           
     

      2. Create a snapshot of the volume and go to the snapshot by following the link on the created snapshot dialog:

           
           


      3. Create a volume from this snapshot. While creating the volume, pay attention to the availability zone: it must match that of the CPM server or Linux server being used to mount the volume.
          Once the volume has finished creating, click the link on the "Request Succeeded" page.
           
           
           

      4. Finally, attach the volume to the CPM server or other Linux instance of your choosing and go to Step 3.
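
The snapshot and volume creation in steps 2 and 3 above can be sketched with the AWS CLI as follows. This assumes a configured CLI and that the worker's root volume ID has already been identified in step 1; the function name and the snapshot description text are illustrative only.

```shell
# clone_worker_volume <source-volume-id> <availability-zone>
# Snapshots the source volume, creates a new volume from the snapshot in the
# given availability zone, and prints "<snapshot-id> <new-volume-id>".
# Hypothetical helper; assumes a configured AWS CLI.
clone_worker_volume() {
    local snapshot_id new_volume_id
    snapshot_id=$(aws ec2 create-snapshot --volume-id "$1" \
        --description "CPM worker log retrieval" \
        --query 'SnapshotId' --output text) || return 1
    # Block until the snapshot is complete before creating a volume from it.
    aws ec2 wait snapshot-completed --snapshot-ids "$snapshot_id" || return 1
    new_volume_id=$(aws ec2 create-volume --snapshot-id "$snapshot_id" \
        --availability-zone "$2" \
        --query 'VolumeId' --output text) || return 1
    # Block until the new volume can be attached.
    aws ec2 wait volume-available --volume-ids "$new_volume_id" || return 1
    echo "$snapshot_id $new_volume_id"
}

# Usage (placeholder volume ID; the zone must match the server that will mount it):
#   clone_worker_volume vol-1234567890abcdef us-east-1a
```

Both IDs are printed because the snapshot and the new volume each need to be deleted during clean up.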

           



Step 3: Mount the volume within Linux:

Mounting the worker volume to the CPM server allows the logs to be retrieved using "download logs" in the CPM UI. Please follow the steps below carefully.

Once attached, the volume needs to be mounted within the instance.
The bash commands below are specific to the CPM server.
  1. Using an SSH client of your choosing, make a connection and log into the CPM server.

    The CPM server uses key authentication (key selected in AWS when the CPM server was first launched) and user: cpmuser




  2. Use the following command to find the volume:

          sudo lsblk

         


  3. The S3 worker partition will show "part" in the TYPE column and will have a blank MOUNTPOINT.

         
          The correct partition is "xvdg1" in this example.

    Note: the partition name may be different on your CPM server.
    Be sure to check you have found the correct partition!


  4. Create a mount point for the worker partition by running these commands:

          cd /
          sudo mkdir /worker

         


  5. Now mount the worker volume to the CPM server with the following command, replacing "<partition name from 3.>" with the name of the partition found in step 3 above:

          sudo mount /dev/<partition name from 3.> /worker

         


  6. Confirm the mount by going to /worker and listing the directory. You should see the file system of the worker volume:

          cd /worker
          ls
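
To reduce the chance of picking the wrong partition in step 3, the lsblk output can be filtered for partitions with no mount point. This is a rough sketch with a hypothetical helper name; it only narrows the candidates, since any other unmounted partition on the server would also be listed.

```shell
# Reads "NAME TYPE MOUNTPOINT" rows on stdin and prints the names of
# partitions (TYPE "part") whose MOUNTPOINT column is empty.
list_unmounted_partitions() {
    awk '$2 == "part" && $3 == "" { print $1 }'
}

# Usage on the server (-r raw output, -n no header, -o selected columns):
#   lsblk -rno NAME,TYPE,MOUNTPOINT | list_unmounted_partitions
```

If exactly one name is printed, it is likely the worker partition; if several are printed, compare sizes with the full "sudo lsblk" output before mounting anything.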

         




Step 4: Getting worker logs from the mounted volume:

  1. Compress the entire log folder

          sudo tar -zcvf /tmp/workerlogpack.tar.gz  /worker/var/log

         


  2. Find workerlogpack.tar.gz in /tmp

          cd /tmp
          ls
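
Before sending the package, it can be worth confirming that the archive is readable. The sketch below wraps the packing and a read-back check in two hypothetical helper functions; on the CPM server the underlying tar commands would likely need sudo, as in step 1, because the worker logs are owned by root.

```shell
# pack_logs <log-directory> <archive-path>
# Creates a gzip-compressed tar archive of the log directory.
pack_logs() {
    local src=$1 archive=$2
    # -C stores paths relative to the parent directory rather than absolute paths.
    tar -zcf "$archive" -C "$(dirname "$src")" "$(basename "$src")"
}

# verify_archive <archive-path>
# Listing the archive reads it end to end, so a truncated or corrupt file fails here.
verify_archive() {
    tar -tzf "$1" > /dev/null && echo "OK: $1"
}

# Usage with the paths from this article:
#   pack_logs /worker/var/log /tmp/workerlogpack.tar.gz
#   verify_archive /tmp/workerlogpack.tar.gz
```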

         

At this point, workerlogpack.tar.gz can be copied from the instance using FTP or another method, and this log package can be attached to the support case. See below for clean up.

Alternatively, if the worker volume has been mounted to the CPM server, it is possible to copy this log pack to the CPM log directory and then retrieve the log package in the UI.
  1. To copy the log package from /tmp to the CPM log directory run this command:

           sudo cp /tmp/workerlogpack.tar.gz /var/log/cpm/workerlogpack.tar.gz

         

    The copy can be confirmed with this command:

          ls /var/log/cpm

         


  2. Next, generate the log package from the CPM server by clicking "download logs"

         


  3. Finally, upload this CPM log package to the support case.


The next steps show how to remove the mount and clean up created files.


Step 5: Clean up:

  1. Unmount the volume with these commands, replacing "xvdg1" with the partition name found in Step 3:
          cd /
          sudo umount /dev/xvdg1


  2. Remove the mount point folder with this command:
          sudo rmdir /worker


  3. Remove the temporary log package with this command:
          sudo rm /tmp/workerlogpack.tar.gz

  4. If logs were copied to the CPM server log directory, remove the log package with this command:
          sudo rm /var/log/cpm/workerlogpack.tar.gz

  5. The worker root volume can now be detached from the instance in AWS. You can delete this volume, and any snapshots created from the worker appliance.

  6. If termination protection was enabled, disable it and terminate the CPM worker appliance.
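
The AWS side of steps 5 and 6 can also be done from the command line. This is a sketch only, assuming a configured AWS CLI and that the volume has already been unmounted in step 1; the function names are hypothetical.

```shell
# release_worker_volume <volume-id>
# Detaches the volume, waits until it is free, then deletes it.
release_worker_volume() {
    aws ec2 detach-volume --volume-id "$1" || return 1
    aws ec2 wait volume-available --volume-ids "$1" || return 1
    aws ec2 delete-volume --volume-id "$1"
}

# terminate_protected_worker <instance-id>
# Turns termination protection off, then terminates the worker instance.
terminate_protected_worker() {
    aws ec2 modify-instance-attribute --instance-id "$1" --no-disable-api-termination || return 1
    aws ec2 terminate-instances --instance-ids "$1"
}

# Usage (placeholder IDs):
#   release_worker_volume vol-1234567890abcdef
#   terminate_protected_worker i-1234567890abcdef
# A snapshot created in Step 2b can be removed with:
#   aws ec2 delete-snapshot --snapshot-id <snapshot-id>
```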