Running device failure diagnostics

Running diagnostics can help you determine why access to a specific device becomes intermittent or why the device becomes unavailable in your storage system.

  1. At the storage system prompt, switch to the LOADER prompt by entering the following command: halt
  2. Enter the following command at the LOADER prompt: boot_diags
    Note
    You must run this command from the LOADER prompt for system-level diagnostics to function properly. The boot_diags command starts special drivers designed specifically for system-level diagnostics.
  3. Run diagnostics on the device causing problems by entering the following command: sldiag device run [-dev devtype|mb|slot slotnum] [-name device]
    • -dev devtype specifies the type of device to be tested.
      • ata is an Advanced Technology Attachment device.
      • bootmedia is the system booting device.
      • cna is a Converged Network Adapter not connected to a network or storage device.
      • env is motherboard environmentals.
      • fcache is the Flash Cache adapter, also known as the Performance Acceleration Module 2.
      • fcal is a Fibre Channel-Arbitrated Loop device not connected to a storage device or Fibre Channel network.
      • fcvi is the Fibre Channel Virtual Interface not connected to a Fibre Channel network.
      • interconnect or nvram-ib is the high-availability interface.
      • mem is system memory.
      • nic is a Network Interface Card not connected to a network.
      • nvram is nonvolatile RAM.
      • nvmem is a hybrid of NVRAM and system memory.
      • sas is a Serial Attached SCSI device not connected to a disk shelf.
      • serviceproc is the Service Processor.
      • storage is an ATA, FC-AL, or SAS interface that has an attached disk shelf.
      • toe is a TCP Offload Engine, a type of NIC.
    • mb specifies that all the motherboard devices are to be tested.
    • slot slotnum specifies that a device in a specific slot number is to be tested.
    • -name device specifies a given device class and type.
  4. View the status of the test by entering the following command: sldiag device status
    Your storage system provides the following output while the tests are still running:
    There are still test(s) being processed.

    After all the tests are complete, the following response appears by default:
    *> <SLDIAG:_ALL_TESTS_COMPLETED>

  5. Identify any hardware problems by entering the following command: sldiag device status [-dev devtype|mb|slot slotnum] [-name device] -long -state failed

    Example

    The following example shows the full status of the failures that result from testing the FC-AL adapter:

    *> sldiag device status -dev fcal -long -state failed

    TEST START ------------------------------------------
    DEVTYPE: fcal
    NAME: Fcal Loopback Test
    START DATE: Sat Jan 3 23:10:56 GMT 2009

    STATUS: Completed
    Starting test on Fcal Adapter: 0b
    Started gathering adapter info.
    Adapter get adapter info OK
    Adapter fc_data_link_rate: 1Gib
    Adapter name: QLogic 2532
    Adapter firmware rev: 4.5.2
    Adapter hardware rev: 2

    Started adapter get WWN string test.
    Adapter get WWN string OK wwn_str: 5:00a:098300:035309

    Started adapter interrupt test
    Adapter interrupt test OK

    Started adapter reset test.
    Adapter reset OK

    Started Adapter Get Connection State Test.
    Connection State: 5
    Loop on FC Adapter 0b is OPEN

    Started adapter Retry LIP test
    Adapter Retry LIP OK

    ERROR: failed to init adaptor port for IOCTL call

    ioctl_status.class_type = 0x1

    ioctl_status.subclass = 0x3

    ioctl_status.info = 0x0
    Started INTERNAL LOOPBACK:
    INTERNAL LOOPBACK OK
    Error Count: 2 Run Time: 70 secs
    >>>>> ERROR, please ensure the port has a shelf or plug.
    END DATE: Sat Jan 3 23:12:07 GMT 2009

    LOOP: 1/1
    TEST END --------------------------------------------
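
When many tests run, a long status log like the one above can be condensed into one summary per test. The following is a minimal Python sketch and not part of the sldiag tooling. It assumes the log has been captured as plain text (for example, from a console session) and that each test is delimited by TEST START and TEST END lines, with DEVTYPE, NAME, STATUS, and Error Count fields formatted as in the example. The function name summarize_sldiag_log and the SAMPLE excerpt are illustrative.

```python
import re

ERROR_COUNT = re.compile(r"Error Count:\s*(\d+)")
ERROR_WORD = re.compile(r"\bERROR\b")


def summarize_sldiag_log(text):
    """Return one summary dict per TEST START/TEST END block in the text."""
    tests = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("TEST START"):
            current = {"devtype": None, "name": None, "status": None,
                       "error_count": 0, "errors": []}
            continue
        if current is None:
            continue  # ignore anything outside a test block
        if line.startswith("TEST END"):
            tests.append(current)
            current = None
            continue
        count = ERROR_COUNT.search(line)
        if line.startswith("DEVTYPE:"):
            current["devtype"] = line.split(":", 1)[1].strip()
        elif line.startswith("NAME:"):
            current["name"] = line.split(":", 1)[1].strip()
        elif line.startswith("STATUS:"):
            current["status"] = line.split(":", 1)[1].strip()
        elif count:
            current["error_count"] = int(count.group(1))
        elif ERROR_WORD.search(line):
            current["errors"].append(line)  # keep the raw error line
    return tests


# Abbreviated excerpt of the example log shown in this procedure.
SAMPLE = """\
TEST START ------------------------------------------
DEVTYPE: fcal
NAME: Fcal Loopback Test
STATUS: Completed
ERROR: failed to init adaptor port for IOCTL call
Error Count: 2 Run Time: 70 secs
>>>>> ERROR, please ensure the port has a shelf or plug.
TEST END --------------------------------------------
"""

for test in summarize_sldiag_log(SAMPLE):
    print(test["devtype"], test["name"], test["error_count"])
    # prints: fcal Fcal Loopback Test 2
```

The sketch keeps the raw error lines so that they can be passed on to technical support unchanged.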


    If the system-level diagnostics tests... Then...
    Resulted in some test failures: Determine the cause of the problem.
    1. Exit Maintenance mode by entering the following command: halt
    2. Perform a clean shutdown and disconnect the power supplies.
    3. Verify that you have observed all the considerations identified for running system-level diagnostics, that cables are securely connected, and that hardware components are properly installed in the storage system.
    4. Reconnect the power supplies and power on the storage system.
    5. Repeat Steps 1 through 5 of Running device failure diagnostics.
    Resulted in the same test failures: Technical support might recommend modifying the default settings on some of the tests to help identify the problem.
    1. Modify the selection state of a specific device or type of device on your storage system by entering the following command: sldiag device modify [-dev devtype|mb|slot slotnum] [-name device] [-selection enable|disable|default|only]

      -selection enable|disable|default|only enables the specified device type or named device, disables it, restores its default selection, or (with only) enables just the specified device type or named device after first disabling all others.

    2. Verify that the tests were modified by entering the following command: sldiag option show
    3. Repeat Steps 3 through 5 of Running device failure diagnostics.
    4. After you identify and resolve the problem, reset the tests to their default states by repeating substeps 1 and 2.
    5. Repeat Steps 1 through 5 of Running device failure diagnostics.
    Were completed without any failures: There are no hardware problems and your storage system returns to the prompt.
    1. Clear the status logs by entering the following command: sldiag device clearstatus [-dev devtype|mb|slot slotnum]
    2. Verify that the log is cleared by entering the following command: sldiag device status [-dev devtype|mb|slot slotnum]

      The following default response is displayed:

      SLDIAG: No log messages are present.

    3. Exit Maintenance mode by entering the following command: halt
    4. Enter the following command at the LOADER prompt to boot the storage system: boot_ontap

    You have completed system-level diagnostics.
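
The procedure above quotes three fixed responses from sldiag device status: one while tests are running, one after all tests complete, and one after the logs are cleared. A script that watches captured console output could map those responses to states, as in the minimal Python sketch below. The function name classify_status and the state labels are hypothetical, and the marker strings are copied from this procedure. Output on a given system may differ, so treat "unknown" as a prompt to read the output manually.

```python
# Marker strings quoted in this procedure, paired with illustrative state labels.
RESPONSES = (
    ("There are still test(s) being processed.", "running"),
    ("<SLDIAG:_ALL_TESTS_COMPLETED>", "completed"),
    ("SLDIAG: No log messages are present.", "cleared"),
)


def classify_status(output):
    """Map captured 'sldiag device status' output to a coarse state label."""
    for marker, state in RESPONSES:
        if marker in output:
            return state
    return "unknown"  # unrecognized output: inspect it by hand


print(classify_status("*> <SLDIAG:_ALL_TESTS_COMPLETED>"))  # prints: completed
```

Matching on substrings rather than whole lines lets the sketch tolerate the *> prompt prefix that appears in the console output.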

After you finish

If the failures persist after repeating the steps, you need to replace the hardware.
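
Deciding whether failures persist means comparing the failed tests from one diagnostic run with those from the next. The Python sketch below shows one way to make that comparison. It assumes each run has already been reduced to a list of dictionaries with devtype, name, and error_count keys. That record shape and the function name persistent_failures are assumptions made for illustration, not sldiag output.

```python
def persistent_failures(first_run, second_run):
    """Return sorted (devtype, test name) pairs that reported errors in both runs."""
    def failed(run):
        return {(t["devtype"], t["name"]) for t in run if t["error_count"] > 0}
    return sorted(failed(first_run) & failed(second_run))


# "Fcal Loopback Test" comes from the example log; "Example Memory Test" is a
# hypothetical name used only to show a failure that clears on the second run.
run_1 = [
    {"devtype": "fcal", "name": "Fcal Loopback Test", "error_count": 2},
    {"devtype": "mem", "name": "Example Memory Test", "error_count": 1},
]
run_2 = [
    {"devtype": "fcal", "name": "Fcal Loopback Test", "error_count": 2},
    {"devtype": "mem", "name": "Example Memory Test", "error_count": 0},
]

print(persistent_failures(run_1, run_2))
# prints: [('fcal', 'Fcal Loopback Test')]
```

A test that fails in both runs, after cabling and seating have been rechecked, is the kind of persistent failure that points to a hardware replacement.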