The integrated management module (IMM) consolidates the service processor functionality, video controller, and remote presence capabilities in a single chip on the system board. The IMM monitors all components of the blade server and posts events in the IMM event log. Most events are also sent to the advanced management module event log.
The following table lists IMM error messages that are displayed in the advanced management module event log and suggested actions to correct the detected problems. These events, in a slightly different format, are also displayed in the IMM event log.
If an action step is preceded by "(Trained service technician only)," that step must be performed only by a trained service technician.
Type
Error Message
Action
Error Code: 0x80010200
Note
Multiple events can be displayed for this error code. Be sure to read the message text to determine the appropriate recovery actions.
Error
System board (Planar 12V) voltage under critical threshold. Reading: X, Threshold: Y
If the undervoltage problem is occurring on all blade servers, look for other events in the log related to power and resolve those events (see Event logs).
View the event log provided by the advanced management module for your BladeCenter chassis and resolve any power-related errors that might be displayed.
If other modules or blade servers are logging the same issue, check the power supply for the BladeCenter chassis.
Multiple events can be displayed for this error code. Be sure to read the message text to determine the appropriate recovery actions.
Error
System board (Planar 12V) voltage over critical threshold. Reading: X, Threshold: Y
If the overvoltage problem is occurring on all blade servers, look for other events in the log related to power and resolve those events.
View the event log provided by the advanced management module for your BladeCenter chassis and resolve any power-related errors that might be displayed.
If other modules or blade servers are logging the same issue, check the power supply for the BladeCenter chassis.
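The two voltage messages above share one shape, with the reading and threshold embedded in the text. When comparing the same event across several blade servers, it can help to extract those values programmatically. The following is a minimal sketch, not an IBM-provided tool: the function and class names are hypothetical, and the numeric format of the `Reading` and `Threshold` fields (plain decimal numbers) is an assumption.

```python
import re
from typing import NamedTuple, Optional

# Pattern mirrors the message text shown in the table. The decimal format
# of the reading and threshold values is an assumption.
VOLTAGE_EVENT = re.compile(
    r"System board \((?P<sensor>[^)]+)\) voltage (?P<direction>under|over) "
    r"critical threshold\. Reading: (?P<reading>-?\d+(?:\.\d+)?), "
    r"Threshold: (?P<threshold>-?\d+(?:\.\d+)?)"
)


class VoltageEvent(NamedTuple):
    sensor: str        # for example "Planar 12V"
    direction: str     # "under" or "over"
    reading: float
    threshold: float


def parse_voltage_event(message: str) -> Optional[VoltageEvent]:
    """Return the parsed fields, or None if the text is not a voltage event."""
    match = VOLTAGE_EVENT.search(message)
    if match is None:
        return None
    return VoltageEvent(
        sensor=match["sensor"],
        direction=match["direction"],
        reading=float(match["reading"]),
        threshold=float(match["threshold"]),
    )
```

If the parsed events from every blade server in the chassis show the same direction, that points toward the chassis power supply rather than an individual blade server, which matches the recovery steps listed above.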
Memory device X, temperature (DIMM X Temp) critical [Note: X=1-16]
Make sure that the room temperature is within the operating specifications (see Features and specifications).
Make sure that none of the air vents on the BladeCenter chassis and on the blade server are blocked.
Make sure that all of the fans on the BladeCenter chassis are running.
(Trained service technician only) Make sure that the microprocessor heat sink is properly attached to the microprocessor (see Installing a microprocessor and heat sink).
Error
Memory device X, temperature (MEU DIMM X Temp) critical [Note: X=1-24]
Make sure that the room temperature is within the operating specifications (see Features and specifications).
Make sure that none of the air vents on the BladeCenter chassis and on the blade server are blocked.
Make sure that all of the fans on the BladeCenter chassis are running.
(Trained service technician only) Make sure that the microprocessor heat sink is properly attached to the microprocessor (see Installing a microprocessor and heat sink).
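The two memory temperature messages differ only in the sensor prefix and in the valid range of X (1-16 for DIMM sensors, 1-24 for MEU DIMM sensors). A small sketch of how a log-processing script might tell them apart and validate the slot number follows. The function name is hypothetical, and the sketch assumes the bracketed note is documentation only and does not appear in the logged message.

```python
import re
from typing import Optional, Tuple

# Valid slot ranges taken from the notes in the table above.
DIMM_RANGES = {
    "DIMM": range(1, 17),      # X = 1-16
    "MEU DIMM": range(1, 25),  # X = 1-24
}

DIMM_TEMP_EVENT = re.compile(
    r"Memory device \d+, temperature "
    r"\((?P<prefix>MEU DIMM|DIMM) (?P<slot>\d+) Temp\) critical"
)


def parse_dimm_temp_event(message: str) -> Optional[Tuple[str, int]]:
    """Return (sensor prefix, slot), or None if the text does not match."""
    match = DIMM_TEMP_EVENT.search(message)
    if match is None:
        return None
    prefix, slot = match["prefix"], int(match["slot"])
    if slot not in DIMM_RANGES[prefix]:
        raise ValueError(f"{prefix} slot {slot} is outside the documented range")
    return prefix, slot
```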
Error
Group 1 (mem dev 1-40) memory (MEU Mem Lane) critical
Multiple events can be displayed for this error code. Be sure to read the message text to determine the appropriate recovery actions.
Error
Expansion Module X, temperature (BPE4_Y TMP) non-recoverable
Note
X = Module number
Y = BPE4 ID (1, 2, 3, or 4). The maximum BPE4 number is 4, and there are 2 slots in each BPE4. The module number is based on the configuration of the BPE4.
Check the room ambient temperature to ensure that it is within the operating specifications for the chassis (see Features and specifications).
If an air filter is installed, make sure that it is cleaned or replaced (see the documentation for your BladeCenter chassis).
Make sure that all fan/blower modules are running. Replace fan modules if necessary (see the documentation for your BladeCenter chassis).
Make sure that a device or filler is installed in each bay in the front and rear of the chassis, and make sure that there is nothing covering the bays. Any missing components can cause a major reduction in airflow for the blade server (see the documentation for your BladeCenter chassis).
Expansion Module X, voltage (BPE4_Y VOL) non-recoverable
Note
X = Module number
Y = BPE4 ID (1, 2, 3, or 4). The maximum BPE4 number is 4, and there are 2 slots in each BPE4. The module number is based on the configuration of the BPE4.
Check the room ambient temperature to ensure that it is within the operating specifications for the chassis (see Features and specifications).
If an air filter is installed, make sure that it is cleaned or replaced (see the documentation for your BladeCenter chassis).
Make sure that all fan/blower modules are running. Replace fan modules if necessary (see the documentation for your BladeCenter chassis).
Make sure that a device or filler is installed in each bay in the front and rear of the chassis, and make sure that there is nothing covering the bays. Any missing components can cause a major reduction in airflow for the blade server (see the documentation for your BladeCenter chassis).
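The expansion module messages encode two identifiers: the module number X and the BPE4 ID Y, where Y is limited to 1 through 4. As an illustration only, the sketch below extracts both values and rejects a BPE4 ID outside that range. The function name is hypothetical, and the assumption that X and Y appear as plain decimal digits in the logged text is not confirmed by this table.

```python
import re
from typing import Dict, Optional, Union

BPE4_EVENT = re.compile(
    r"Expansion Module (?P<module>\d+), (?P<kind>temperature|voltage) "
    r"\(BPE4_(?P<bpe4>\d+) (?:TMP|VOL)\) non-recoverable"
)


def parse_bpe4_event(message: str) -> Optional[Dict[str, Union[int, str]]]:
    """Return module number, sensor kind, and BPE4 ID, or None if no match."""
    match = BPE4_EVENT.search(message)
    if match is None:
        return None
    bpe4_id = int(match["bpe4"])
    # The table notes state that the BPE4 ID is 1, 2, 3, or 4.
    if not 1 <= bpe4_id <= 4:
        raise ValueError(f"BPE4 ID {bpe4_id} is outside the documented range 1-4")
    return {
        "module": int(match["module"]),
        "kind": match["kind"],
        "bpe4_id": bpe4_id,
    }
```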
Error
Processor X, temperature (CPU X OverTemp) non-recoverable [Note: X=1-2]
Make sure that the room temperature is within the operating specifications (see Features and specifications).
Make sure that none of the air vents on the BladeCenter chassis and on the blade server are blocked.
Make sure that all of the fans on the BladeCenter chassis are running.
Make sure that each bay of the BladeCenter chassis contains either a device or a filler.
Make sure that the blade server is not missing any heat sinks, DIMMs, or heat-sink fillers (see Parts listing - BladeCenter HX5).
(Trained service technician only) Make sure that the microprocessor heat sink is properly attached to the microprocessor (see Installing a microprocessor and heat sink).
Error
System board, temperature (Inlet Temp) non-recoverable
Make sure that the room temperature is within the operating specifications (see Features and specifications).
Make sure that none of the air vents on the BladeCenter chassis and on the blade server are blocked.
Make sure that all of the fans on the BladeCenter chassis are running.
Make sure that each bay of the BladeCenter chassis contains either a device or a filler.
Make sure that the blade server is not missing any heat sinks, DIMMs, or heat-sink fillers (see Parts listing - BladeCenter HX5).
(Trained service technician only) Make sure that the microprocessor heat sink is properly attached to the microprocessor (see Installing a microprocessor and heat sink).
Error
Proc/IO module 1, temperature (IOH Temp) non-recoverable
Make sure that the room temperature is within the operating specifications (see Features and specifications).
Make sure that none of the air vents on the BladeCenter chassis and on the blade server are blocked.
Make sure that all of the fans on the BladeCenter chassis are running.
Make sure that each bay of the BladeCenter chassis contains either a device or a filler.
Make sure that the blade server is not missing any heat sinks, DIMMs, or heat-sink fillers (see Parts listing - BladeCenter HX5).
(Trained service technician only) Make sure that the microprocessor heat sink is properly attached to the microprocessor (see Installing a microprocessor and heat sink).
Error Code: 0x80070600
Note
Multiple events can be displayed for this error code. Be sure to read the message text to determine the appropriate recovery actions.
Error
Processor X, temperature (CPU X OverTemp) non-recoverable [Note: X=1-2]
Make sure that the room temperature is within the operating specifications (see Features and specifications).
Make sure that none of the air vents on the BladeCenter chassis and on the blade server are blocked.
Make sure that all of the fans on the BladeCenter chassis are running.
Make sure that each bay of the BladeCenter chassis contains either a device or a filler.
Make sure that the blade server is not missing any heat sinks, DIMMs, or heat-sink fillers (see Parts listing - BladeCenter HX5).
(Trained service technician only) Make sure that the microprocessor heat sink is properly attached to the microprocessor (see Installing a microprocessor and heat sink).
Error
System board, interconnect (Pwr Share Jumper) non-recoverable
Make sure all of the memory is enabled in the Setup utility (see Using the Setup utility). Notice which memory modules are disabled before you continue to the next step.
FW/BIOS, firmware progress (Firmware Error) no system memory
Make sure that the server contains the correct number of DIMMs of the correct DIMM type, in the correct order (see Installing a DIMM - BladeCenter HX5 for the correct order to install DIMMs).
Restart the blade server four times, using either the power button on the front of the blade server or the advanced management module web interface.
Remove the battery and reinstall it to clear CMOS memory and NVRAM. The real-time clock is also reset. See Removing the battery and Installing the battery.
Error
FW/BIOS, firmware progress (Firmware Error) no usable system memory
System mgmt software (Scale Config) hardware change detected
Information only; no action is required.
Note
This event is not displayed in the advanced management module event log. However, it is sent for alerts and SNMP traps.
Error Code: 0x806F0107
Note
Multiple events can be displayed for this error code. Be sure to read the message text to determine the appropriate recovery actions.
Error
Group 4, processor (One of CPUs) thermal trip
Make sure that the room temperature is within the operating specifications (see Features and specifications).
Make sure that none of the air vents on the BladeCenter chassis and on the blade server are blocked.
Make sure that all of the fans on the BladeCenter chassis are running.
Make sure that each bay of the BladeCenter chassis contains either a device or a filler.
Make sure that the blade server is not missing any heat sinks, DIMMs, or heat-sink fillers (see Parts listing - BladeCenter HX5).
(Trained service technician only) Make sure that the microprocessor heat sink is properly attached to the microprocessor (see Installing a microprocessor and heat sink).
Processor X (CPU X Status) thermal trip [Note: X=1-2]
Make sure that the room temperature is within the operating specifications (see Features and specifications).
Make sure that none of the air vents on the BladeCenter chassis and on the blade server are blocked.
Make sure that all of the fans on the BladeCenter chassis are running.
Make sure that each bay of the BladeCenter chassis contains either a device or a filler.
Make sure that the blade server is not missing any heat sinks, DIMMs, or heat-sink fillers (see Parts listing - BladeCenter HX5).
(Trained service technician only) Make sure that the microprocessor heat sink is properly attached to the microprocessor (see Installing a microprocessor and heat sink).
Multiple events can be displayed for this error code. Be sure to read the message text to determine the appropriate recovery actions.
Error
Group 1, memory (One of the DIMMs) uncorrectable ECC memory error
Refer to TIP H21455 for minimum code level.
Check the IBM support website for an applicable retain tip or firmware update that applies to this memory error.
Swap one of the DIMMs with a DIMM of the same size and type from another channel (see Installing a DIMM - BladeCenter HX5). For example, if a problem occurs on DIMM 1 and DIMM 4, swap DIMM 1 with a similar DIMM in slot 9.
Enable all affected DIMMs using the Setup utility.
If the failure remains on the original DIMM slots, replace the DIMM that was not moved. If the failure follows the DIMM that was moved, replace the DIMM that was swapped.
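The swap procedure above is a two-way decision: after one DIMM is moved to another channel, the location of the next failure indicates which DIMM to replace. The sketch below is one reading of that rule, using the DIMM 1, DIMM 4, slot 9 example from the step above. The function and parameter names are hypothetical, and the returned values are the slots that hold the suspect DIMM after the swap.

```python
from typing import Iterable, List


def dimms_to_replace(
    original_slots: Iterable[int],
    moved_from: int,
    moved_to: int,
    failing_after_swap: Iterable[int],
) -> List[int]:
    """Return the slots (after the swap) whose DIMMs should be replaced.

    original_slots:     slots that reported the error before the swap
    moved_from:         the original slot whose DIMM was moved
    moved_to:           the slot in another channel that received it
    failing_after_swap: slots that report the error after the swap
    """
    failing = set(failing_after_swap)
    if moved_to in failing:
        # The failure followed the moved DIMM: replace that DIMM.
        return [moved_to]
    if failing & set(original_slots):
        # The failure stayed on the original slots: replace the DIMM
        # that was not moved.
        return sorted(slot for slot in original_slots if slot != moved_from)
    # No failure reproduced; nothing to replace based on this test.
    return []
```

For the example in the text, `dimms_to_replace([1, 4], 1, 9, [9])` points at the moved DIMM now in slot 9, while `dimms_to_replace([1, 4], 1, 9, [1, 4])` points at the DIMM that stayed in slot 4.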
Memory device X (DIMM X Status) uncorrectable ECC memory error [Note: X=1-16]
Check the IBM support website for an applicable retain tip or firmware update that applies to this memory error.
Swap one of the DIMMs with a DIMM of the same size and type from another channel (see Installing a DIMM - BladeCenter HX5). For example, if a problem occurs on DIMM 1 and DIMM 4, swap DIMM 1 with a similar DIMM in slot 9.
Enable all affected DIMMs using the Setup utility.
If the failure remains on the original DIMM slots, replace the DIMM that was not moved. If the failure follows the DIMM that was moved, replace the DIMM that was swapped.
Memory device X (MEU DIMM X Status) uncorrectable ECC memory error [Note: X=1-24]
Check the IBM support website for an applicable retain tip or firmware update that applies to this memory error.
Swap one of the DIMMs with a DIMM of the same size and type from another channel (see Installing a DIMM - IBM MAX5).
Enable all affected DIMMs using the Setup utility.
If the failure remains on the original DIMM slots, replace the DIMM that was not moved. If the failure follows the DIMM that was moved, replace the DIMM that was swapped.
Group 4, processor (CPU Fault Reboot) OEM system boot event
Information only; no action is required.
Error Code: 0x806F0113
Error
Chassis (NMI State) bus timeout
Remove the blade server from the BladeCenter chassis; then, reinstall it.
Reseat all the optional devices installed in the blade server one device at a time, restarting the blade server each time, to determine where the problem is located.
Remove optional devices from the blade server one at a time to determine where the problem is located.
Replace the following components one at a time, in the order shown, restarting the blade server each time:
All optional devices installed in the blade server
System board, Power Module (EPOW Fault) predictive failure
This warning message generally indicates that power redundancy for the blade server was lost. The nonredundant condition might have subsequently returned to a redundant power state.
Check to see if a power module has been removed or replaced and ensure the power modules are installed and functioning properly (see the Installation and User's Guide for your chassis).
Memory device X (DIMM X Status) memory scrub failed [Note: X=1-16]
Refer to TIP H21455 for minimum code level.
Check the IBM support website for an applicable retain tip or firmware update that applies to this memory error.
Swap the affected DIMMs (as indicated by the error LEDs on the system board or the event logs) to a different memory channel or microprocessor (see Installing a DIMM - BladeCenter HX5 for memory population sequence).
Enable all affected DIMMs using the Setup utility.
If the failure remains on the original DIMM slots, replace the DIMM that was not moved. If the failure follows the DIMM that was moved, replace the DIMM that was swapped.
A software nonmaskable interrupt (NMI) has been detected.
Check the operating system event log for any related errors and resolve those errors. If you cannot resolve those errors, contact the appropriate service provider for the software.
Check the application log for any related errors and resolve those errors. If you cannot resolve those errors, contact the applicable service provider for the software.
Check the IBM Support web page for any service bulletins that might be related to this problem.
Error Code: 0x806F032B
Error
System mgmt software (Scale Config) software incompatibility
If this error is occurring on a blade server operating in stand-alone mode, update the FPGA firmware.
If this error is occurring on a blade server that is part of a scalable blade complex operating in single partition mode, make sure that the firmware for all IMMs and Field Programmable Gate Arrays (FPGAs) in the blade complex is at the same level.
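For a scalable blade complex, the check above amounts to confirming that every blade server reports the same level for each firmware component. A minimal sketch of that comparison follows. It assumes the firmware levels have already been collected by hand or by another tool; the function name, the blade labels, and the level strings in the usage note are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def find_level_mismatches(
    levels: Dict[Tuple[str, str], str],
) -> Dict[str, List[str]]:
    """Map each component with inconsistent firmware to the levels seen.

    levels maps (blade label, component name) to a firmware level string,
    for example ("blade1", "IMM") -> "1.20".
    """
    seen = defaultdict(set)
    for (_blade, component), level in levels.items():
        seen[component].add(level)
    # A component is consistent only if exactly one level was reported.
    return {
        component: sorted(found)
        for component, found in seen.items()
        if len(found) > 1
    }
```

With the made-up input `{("blade1", "IMM"): "1.20", ("blade2", "IMM"): "1.20", ("blade1", "FPGA"): "1.01", ("blade2", "FPGA"): "1.02"}`, the result names only the FPGA component, which would be the firmware to bring to a common level.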
Error Code: 0x806F0409
Information
System board (Host Power) AC lost
Information only; no action is required.
Note
This event is not displayed in the advanced management module event log. However, it is sent for alerts and SNMP traps.
Error Code: 0x806F040C
Note
Multiple events can be displayed for this error code. Be sure to read the message text to determine the appropriate recovery actions.
Information
Group 1, (One of the DIMMs) memory disabled
Check the system event log for any memory errors that might be related to the specified DIMM and resolve those errors.
Y = BPE4 ID (1, 2, 3, or 4). The maximum BPE4 number is 4, and there are 2 slots in each BPE4. The module number is based on the configuration of the BPE4.
If you have a PCI adapter in your blade server, verify that the PCI adapter is supported in the blade server. For a list of supported optional devices for the blade server, see http://www.ibm.com/servers/eserver/serverproven/compat/us/.
Make sure that both processors are displayed by the system.
Load the default settings.
Go to the System Settings menu and make sure the processor is enabled.
Error Code: 0x806F050C
Note
Multiple events can be displayed for this error code. Be sure to read the message text to determine the appropriate recovery actions.
Error
Group 1, memory (One of the DIMMs) correctable ECC memory error logging limit reached
Refer to TIP H21455 for minimum code level.
Check the IBM support website for an applicable retain tip or firmware update that applies to this memory error.
Swap the affected DIMMs (as indicated by the error LEDs on the system board or the event logs) to a different memory channel or microprocessor (see Installing a DIMM - BladeCenter HX5 for memory population sequence).
If the error still occurs on the same DIMM, replace the affected DIMM (as indicated by the error LEDs on the system board or the event logs).
Memory device X (DIMM X Status) correctable ECC memory error logging limit reached [Note: X=1-16]
Check the IBM support website for an applicable retain tip or firmware update that applies to this memory error.
Swap the affected DIMMs (as indicated by the error LEDs on the system board or the event logs) to a different memory channel or microprocessor (see Installing a DIMM - BladeCenter HX5 for memory population sequence).
If the error still occurs on the same DIMM, replace the affected DIMM (as indicated by the error LEDs on the system board or the event logs).
Expansion Module X, (BPE4_Y Slot Z) PCI system error
Note
X = Module number
Y = BPE4 ID (1, 2, 3, or 4). The maximum BPE4 number is 4, and there are 2 slots in each BPE4. The module number is based on the configuration of the BPE4.
If you have a PCI adapter in your blade server, verify that the PCI adapter is supported in the blade server. For a list of supported optional devices for the blade server, see http://www.ibm.com/servers/eserver/serverproven/compat/us/.
If you receive this error code for a blade server configuration that includes an IBM MAX5 expansion blade, multiple DIMMs might have been disabled. After you replace the mismatched DIMM, make sure that all DIMMs have been enabled using the Setup utility (see Using the Setup utility) or using the Advanced Settings Utility (see Using the Advanced Settings Utility (ASU)).
Error
Group 1, memory (One of the DIMMs) memory configuration error
If you receive this error code for a blade server configuration that includes an IBM MAX5 expansion blade, multiple DIMMs might have been disabled. After you replace the mismatched DIMM, make sure that all DIMMs have been enabled using the Setup utility (see Using the Setup utility) or using the Advanced Settings Utility (see Using the Advanced Settings Utility (ASU)).
Error
Memory device X (DIMM X Status) memory configuration error [Note: X=1-16]
If you receive this error code for a blade server configuration that includes an IBM MAX5 expansion blade, multiple DIMMs might have been disabled. After you replace the mismatched DIMM, make sure that all DIMMs have been enabled using the Setup utility (see Using the Setup utility) or using the Advanced Settings Utility (see Using the Advanced Settings Utility (ASU)).
Error
Memory device X (MEU DIMM X Status) memory configuration error [Note: X=1-24]
Make sure that the DIMMs are installed in the correct order and configured correctly (see Installing a DIMM - IBM MAX5).
If you receive this error code for a blade server configuration that includes an IBM MAX5 expansion blade, multiple DIMMs might have been disabled. After you replace the mismatched DIMM, make sure that all DIMMs have been enabled using the Setup utility (see Using the Setup utility) or using the Advanced Settings Utility (see Using the Advanced Settings Utility (ASU)).
Error Code: 0x806F070D
Note
Multiple events can be displayed for this error code. Be sure to read the message text to determine the appropriate recovery actions.
Information
Hard drive X (SSD Exp Drive X) rebuild in progress
Information only; no action is required.
Information
Hard drive X (SSD Exp Drive X) rebuild complete
Information only; no action is required.
Error Code: 0x806F0807
Note
Multiple events can be displayed for this error code. Be sure to read the message text to determine the appropriate recovery actions.
Information
Group 4, processor (One of CPUs) disabled
Check the event logs for other related error messages (see Event logs).
Make sure all of the memory is enabled in the Setup utility (see Using the Setup utility). Notice which memory modules are disabled before you continue to the next step.