system node show-memory-errors
Display Memory Errors on DIMMs
Availability: This command is available to cluster administrators at the advanced privilege level.
Description
The system node show-memory-errors command prints the history of memory errors (errors in the storage controller's RAM) since boot. This command can be useful in diagnosing memory problems or determining which DIMM, if any, might need replacement. Some correctable ECC errors are to be expected under normal operation, but many errors occurring on a particular DIMM might indicate a problem. All the fields are read-only and can be used to filter the output. The maximum number of physical addresses and timestamps reported is 160.
Parameters
- { [-fields <fieldname>, ...]
- If you specify the -fields <fieldname>, ... parameter, the command output also includes the specified field or fields. You can use '-fields ?' to display the fields to specify.
- | [-verbose ]
- The -verbose parameter enables verbose mode, resulting in the display of more detailed output.
- | [-instance ]}
- If you specify the -instance parameter, the command displays detailed information about all fields.
- [-node {<nodename>|local}] - Node
- When provided, the -node parameter specifies the nodes for which the memory error statistics are to be displayed. When the -node parameter is not provided, the command is applied to all the nodes in the cluster.
- [-id <integer>] - DIMM ID
- This parameter refers to the DIMM ID. It can be used to look at the correctable ECC error count on a specific DIMM.
- [-name <text>] - DIMM Name
- This parameter specifies the DIMM name for which the memory error statistics are to be displayed.
- [-cecc <integer>] - Correctable ECC Error Count
- This parameter can be used to get all the DIMMs with the specified correctable ECC error count.
- [-merr {true|false}] - Multiple Errors on Same Address
- This parameter indicates whether an error was seen multiple times on the same physical address. Specify the value true to display all the DIMMs with multiple errors on the same address.
- [-timestamp <text>, ...] - Error Time
- This specifies the time at which the error was seen on the DIMM.
- [-addr <text>, ...] - Error Address
- This specifies the physical address on which the error was seen.
Examples
cluster1::*> system node show-memory-errors
Correctable ECC Memory Errors:
Node: localhost
DIMM    CECC   Multiple Err
Name    Count  Same Address
------- ------ ------------
DIMM-1  0      false
DIMM-2  0      false
DIMM-3  0      false
DIMM-4  0      false
DIMM-5  4      true
DIMM-6  1      false
DIMM-7  1      false
DIMM-8  0      false
8 entries were displayed.

cluster1::*> system node show-memory-errors -verbose
Correctable ECC Memory Errors:
Node: localhost
DIMM    CECC   Multiple Err                      Physical
Name    Count  Same Address Timestamp            Address
------- ------ ------------ -------------------- -------------
DIMM-1  0      false        -                    -
DIMM-2  0      false        -                    -
DIMM-3  0      false        -                    -
DIMM-4  0      false        -                    -
DIMM-5  4      true         12/02/2013 08:17:43  0xD640
                            12/02/2013 08:17:57  0x3F7FF800
                            12/02/2013 08:18:03  0x11743D000
                            12/02/2013 08:18:37  0x11743D000
DIMM-6  1      false        12/02/2013 08:17:53  0x87EC0
DIMM-7  1      false        12/02/2013 08:17:51  0x13DED8900
DIMM-8  0      false        -                    -
8 entries were displayed.
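When this output is captured for offline review, it can help to turn it into structured records. The following Python sketch is not part of ONTAP. It assumes the output layout shown in the examples above (a DIMM name, a correctable ECC count, and a true/false flag on each row, with optional timestamp and address columns in verbose mode). The names parse_memory_errors and flag_suspect_dimms, and the min_count threshold, are hypothetical and chosen only for illustration; the threshold is not vendor guidance.

```python
import re
from dataclasses import dataclass, field

# A DIMM row: name, correctable ECC count, multiple-errors flag, then any
# remaining text (empty in the default view, timestamp/address in -verbose).
ROW = re.compile(r"^(?P<name>\S+)\s+(?P<cecc>\d+)\s+(?P<merr>true|false)\b(?P<rest>.*)$")
# A "date time 0xADDRESS" pair as printed in the verbose example.
EVENT = re.compile(r"(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\s+(0x[0-9A-Fa-f]+)")


@dataclass
class DimmErrors:
    name: str
    cecc: int                     # correctable ECC error count
    merr: bool                    # multiple errors on the same address
    events: list = field(default_factory=list)  # (timestamp text, address int)


def parse_memory_errors(text):
    """Parse captured command output into a list of DimmErrors records."""
    dimms = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        row = ROW.match(line)
        if row:
            current = DimmErrors(row["name"], int(row["cecc"]), row["merr"] == "true")
            dimms.append(current)
            line = row["rest"]
        if current is not None:
            # Continuation lines carry only a timestamp and an address.
            event = EVENT.search(line)
            if event:
                current.events.append((event.group(1), int(event.group(2), 16)))
    return dimms


def flag_suspect_dimms(dimms, min_count=3):
    """Return names of DIMMs with repeated-address errors or a high count.

    min_count is an assumed, illustrative threshold.
    """
    return [d.name for d in dimms if d.merr or d.cecc >= min_count]


# Rows copied from the verbose example above.
SAMPLE = """\
DIMM-4 0 false - -
DIMM-5 4 true 12/02/2013 08:17:43 0xD640
12/02/2013 08:17:57 0x3F7FF800
12/02/2013 08:18:03 0x11743D000
12/02/2013 08:18:37 0x11743D000
DIMM-6 1 false 12/02/2013 08:17:53 0x87EC0
"""

if __name__ == "__main__":
    for name in flag_suspect_dimms(parse_memory_errors(SAMPLE)):
        print(name)
```

The sketch ignores the Node: header, so output captured from several nodes would need to be split per node before parsing.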