
Drive path redundancy loss

A communication path with a drive has been lost.

Important
  • Correct this failure as soon as possible. If the drive's other port or any other component on the working channel fails, the drive will be failed. The Details area reports the affected shelf, drive, volumes, and the working channel that still provides communication with the drive.

  • The fault indicator light on the affected drive will be off because the drive has not failed.

  • If instructed to fail and replace a drive, ensure the replacement drive has a capacity equal to or greater than the failed drive.

  • If instructed to replace a controller or an IOM canister, contact your Technical Support Engineer if you do not have a replacement controller or IOM canister available.

CAUTION
Risk of data loss

The NV status LED on the controller will stay lit until all cached data has been written to the storage array's drives. Removing the controller before the NV status LED has turned off will result in data loss.

The data on the affected volumes will be lost once you perform the next step. Be sure you have backed up your data before going to step 19.

CAUTION
Possible loss of data accessibility

Do not disconnect any cables on the working channel. Doing so may cause a loss of data accessibility.

CAUTION
Electrostatic discharge can damage sensitive components

Always use proper anti-static protection when handling components. Touching components without using a grounding wrist strap may damage the equipment.

Recovery Steps

  1. Fix any other problems reported by the Recovery Guru before attempting to fix this problem.

  2. Create a backup of the storage array's configuration data on your local system using the Command Line Interface.

    From a command prompt, run the save storageArray dbmDatabase command with the sourceLocation=onboard and contentType=all options.

    Command prompt example (exact syntax depends on your operating system):
    SMcli (hostname or IP address) -c "save storageArray dbmDatabase sourceLocation=onboard contentType=all file=\"dbmDatabase.zip\";"

    Help with E-Series CLI commands is available from the online Document Center or by contacting Technical Support.

  3. Is the affected shelf listed in the Details area a controller shelf or a drive shelf?

    • If the affected shelf is a controller shelf, go to step 11.

    • If the affected shelf is a drive shelf, go to step 4.

  4. To determine the non-working channel, start at the expansion port on the controller shelf corresponding to the working channel (refer to the labels on the back of the controller shelf if needed). Trace the cable from the working channel to the IOM canister in the affected drive shelf reported in the Details area.

  5. Locate the other IOM canister in the affected drive shelf (this is the canister on the non-working channel).

  6. You need to replace the IOM canister. Contact your Technical Support Engineer if you do not have a replacement IOM canister.

  7. Does your storage array have one or two controllers?

    • If the storage array has one controller, go to step 8.

    • If the storage array has two controllers, go to step 9.

  8. Perform the following steps to replace the IOM canister on the non-working channel.

    1. Label the cables to the IOM canister on the non-working channel. The labels will help you correctly reconnect the cables to the new IOM canister.

    2. Stop I/O from all hosts connected to the storage array and wait at least five minutes to ensure that any data in the controller's cache is written to the storage array's drives.

    3. Select the Save icon in the Recovery Guru to save the remaining steps to a file. These steps will no longer be accessible after you complete substep 4.

    4. Turn off power to all power-fan canisters in the controller shelf.

    5. Remove the IOM canister.

    6. Set all switches on the new IOM canister to the same values as the old IOM canister.

    7. Insert the new IOM canister into the drive shelf.

    8. Using the labels created in substep 1, reconnect the cables to the replaced canister.

    9. Wait one minute.

    10. Turn on power to all power-fan canisters in the controller shelf. Wait until all drives have completed the spin-up process, and then go to step 10 of the main procedure.

  9. Perform the following steps to replace the IOM canister on the non-working channel.

    1. Label the cables to the IOM canister on the non-working channel. The labels will help you correctly reconnect the cables to the new IOM canister.

    2. Remove the IOM canister.

    3. Set all switches on the new IOM canister to the same values as the old IOM canister.

    4. Insert the new IOM canister into the drive shelf.

    5. Using the labels created in substep 1, reconnect the cables to the replaced canister.

    6. Wait one minute.

  10. Select Recheck to ensure the problem has been resolved. If the problem is not resolved, go to step 11.

  11. Stop all I/O to all volumes in volume groups or disk pools associated with the affected drive. If another drive fails in the volume group or disk pool while you are performing this procedure, you will lose data.

  12. Reseat the drive. This may clear the path redundancy problem.

    1. Remove the drive and then re-insert it.

    2. Wait one minute.

  13. Select Recheck to ensure the problem has been resolved. If the problem is not resolved, go to step 14.

  14. You must replace the drive. The procedure you use depends on the RAID level of the volume group associated with the affected drive. Is the RAID level listed in the Details area RAID 0?

    • If yes, ensure all of the volumes in the Storage > Volumes table are Optimal, and then go to step 15.

    • If no, go to step 24.

  15. Cancel any volume copy operations that include affected volumes listed in the Details area.

    1. Go to Home.

    2. Select the Show operations in progress link.

    3. Search the table for volume copy operations that include an affected volume. If a volume copy operation is found, select Stop to cancel the volume copy operation.

    4. Repeat substep 3 until all volume copy operations that include affected volumes have been stopped.

  16. Delete any snapshot volumes associated with the affected volumes in the Details area. The snapshot volumes will no longer be valid after you fail the drive in step 19.

    1. Go to Storage > Snapshots. Then, select the Snapshot Volumes tab.

    2. Search for affected volumes in the table using the Name column. If a volume is listed in the table, it is a snapshot volume; otherwise, it is not.

    3. If a snapshot volume is found that is associated with an affected volume, select Uncommon Tasks > Delete to delete the snapshot volume.

    4. Repeat substep 3 until all snapshot volumes associated with affected volumes have been deleted.

  17. Select the Save icon in the Recovery Guru to save the remaining steps to a file. These steps will no longer be accessible after you complete step 19.

  18. Back up all data on the affected volumes listed in the Details area.

  19. Physically replace the affected drive.

    1. Go to Hardware.

    2. Highlight the drive listed in the Details area.

    3. Select Fail to fail the drive, making sure to clear the Copy contents of drive before failing option.

    4. Remove the failed drive (its fault indicator light should be on).

    5. Wait 30 seconds.

    6. Insert the new drive. Its fault indicator light may be lit for a short time (one minute or less).

  20. Initialize all of the affected volumes.

    1. Go to Storage > Volumes.

    2. Select Initialize volumes under More to initialize all of the volumes shown in the Details area.

  21. Wait until all of the volume initialization operations have completed.

    The volume initialization operations' progress can be viewed by going to Home and selecting View operations in progress.

  22. Add the affected volumes back to the operating system, and restore the affected volumes' data from backup. You may need to reboot the system to see the re-initialized volumes.

  23. Select Recheck to ensure the problem has been resolved. If the problem is not resolved, a problem has occurred on the controller and the controller needs to be replaced. Go to step 27.

  24. Although not required, you should back up all data on all volumes associated with the affected drive.

  25. Follow the steps to fail the drive.

    1. Go to Hardware.

    2. Highlight the affected drive.

    3. Select Fail from the drive's menu options.

    4. Remove the failed drive, wait 30 seconds, and then insert a new drive. The new drive's fault indicator light may be lit for a short time (one minute or less).

  26. Select Recheck to ensure the problem has been resolved. If the problem is not resolved, a problem has occurred on the controller and the controller needs to be replaced. Go to step 27.

  27. Does your storage array have one or two controllers?

    • If the storage array has one controller, go to step 28.

    • If the storage array has two controllers, go to step 36.

  28. Stop I/O from all hosts connected to the storage array and wait at least five minutes to ensure that any data in the controller's cache is written to the storage array's drives.

  29. Select the Save icon in the Recovery Guru to save the remaining steps to a file. These steps will no longer be accessible after you complete step 30.

  30. Turn off power to all power-fan canisters in the controller shelf.

  31. Label each cable connected to the controller canister.

  32. Remove the controller.

  33. Remove the battery from the removed controller and place it into the new controller. Refer to your hardware documentation for the battery replacement procedure.

  34. Wait one minute, and then insert the new controller firmly into place and reconnect the cables.

  35. Turn on power to all power-fan canisters in the controller shelf. Wait until all drives have completed the spin-up process, and then go to step 43.

  36. Are any hosts connected to this storage array NOT running a host-based, multi-path failover driver?

    • If yes, stop I/O from those hosts to the storage array, and then go to step 37.

    • If no, go to step 37.

  37. Select the Save icon in the Recovery Guru to save the remaining steps to a file. These steps will no longer be accessible after you complete step 38.

  38. Manually place the affected controller offline.

    1. Go to Hardware.

    2. Highlight the controller shown in the Details area. The controller is located on the back of the controller shelf.

    3. Select Place Offline.

  39. Select Recheck to rerun the Recovery Guru. The original Drive Path Redundancy Loss problem should now be replaced by an Offline Controller problem.

  40. Follow the recovery steps in the Offline Controller procedure until you have removed the controller.

  41. Remove the battery from the removed controller and place it into the new controller. Refer to your hardware documentation for the battery replacement procedure.

  42. Wait one minute, and then insert the new controller firmly into place and reconnect the cables.

  43. Select Recheck to ensure the problem has been resolved.
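
The branching in the recovery steps above can be summarized in a short Python sketch. It is illustrative only and is not part of the Recovery Guru. The function names backup_command and worst_case_path are hypothetical, the code neither runs SMcli nor contacts the storage array, and the path it returns assumes that every Recheck still reports the problem.

```python
from __future__ import annotations


def backup_command(array_address: str, out_file: str = "dbmDatabase.zip") -> list[str]:
    """Build the argument list for the configuration backup shown in step 2.

    Only assembles the command; it does not execute anything.
    """
    script = (
        "save storageArray dbmDatabase sourceLocation=onboard "
        f'contentType=all file="{out_file}";'
    )
    return ["SMcli", array_address, "-c", script]


def worst_case_path(shelf: str, controllers: int, raid0: bool) -> list[int]:
    """Return the top-level steps visited if no Recheck clears the problem.

    shelf: "controller" or "drive" (step 3)
    controllers: 1 or 2 (steps 7 and 27)
    raid0: True if the Details area lists RAID 0 (step 14)
    """
    if shelf not in ("controller", "drive"):
        raise ValueError("shelf must be 'controller' or 'drive'")
    if controllers not in (1, 2):
        raise ValueError("controllers must be 1 or 2")

    path = [1, 2, 3]
    if shelf == "drive":
        # Steps 4-7 locate the IOM canister; step 8 or 9 replaces it.
        path += [4, 5, 6, 7]
        path.append(8 if controllers == 1 else 9)
        path.append(10)
    # Steps 11-13 reseat the drive; step 14 branches on RAID level.
    path += [11, 12, 13, 14]
    if raid0:
        path += list(range(15, 24))  # steps 15-23
    else:
        path += [24, 25, 26]
    # Step 27 branches on controller count for the controller replacement.
    path.append(27)
    if controllers == 1:
        path += list(range(28, 36))  # steps 28-35
    else:
        path += list(range(36, 43))  # steps 36-42
    path.append(43)
    return path
```

For example, worst_case_path("controller", 2, False) skips the IOM canister steps entirely, jumps from step 3 to step 11, and ends at step 43. The procedure normally stops earlier, as soon as a Recheck shows that the problem has been resolved.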