Reliability, availability, and serviceability

Three important computer design features are reliability, availability, and serviceability (RAS). The RAS features help to ensure the integrity of the data that is stored in the server, the availability of the server when you need it, and the ease with which you can diagnose and correct problems.

Your server has the following RAS features:
  • 3-year parts and 3-year labor limited warranty
  • 24-hour support center
  • Automatic error retry and recovery
  • Automatic restart on nonmaskable interrupt (NMI)
  • Automatic restart after a power failure
  • Backup Unified Extensible Firmware Interface (UEFI) switching under the control of the integrated management module II
  • Built-in monitoring for fan, power, temperature, voltage, and power-supply redundancy
  • Cable-presence detection on most connectors
  • Chipkill memory protection
  • Single-device data correction (SDDC) for x4 DRAM technology DIMMs. SDDC ensures that data is available on a single x4 DRAM DIMM after a hard failure of up to two DRAM DIMMs. One x4 DRAM DIMM in each rank is reserved as a spare device.
  • Diagnostic support for ServeRAID and Ethernet adapters
  • Error codes and messages
  • Error-correcting code (ECC) L3 cache and system memory
  • Full Array Memory Mirroring (FAMM) redundancy
  • Hot-swap cooling fans with speed-sensing capability
  • Hot-swap hard disk drives
  • Hot-swap power supplies
  • Information and light path diagnostics LED panels
  • Integrated management module II
  • Light path diagnostics LEDs for DIMMs, microprocessors, hard disk drives, solid-state drives, power supplies, and fans
  • Memory mirroring and memory sparing support
  • Memory error-correcting code and parity test
  • Memory downsizing (non-mirrored memory). When the server restarts after the memory controller has detected a non-mirrored uncorrectable error from which it cannot recover operationally, the IMM logs the uncorrectable error and informs the power-on self-test (POST). POST logically maps out the memory with the uncorrectable error, and the server restarts with the remaining installed memory.
  • Menu-driven setup, system configuration, and redundant array of independent disks (RAID) configuration programs
  • Microprocessor built-in self-test (BIST), internal error signal monitoring, internal thermal trip signal monitoring, configuration checking, and microprocessor and voltage regulator module failure identification through light path diagnostics
  • NMI button
  • Parity checking on the PCIe buses
  • Power management: compliance with Advanced Configuration and Power Interface (ACPI)
  • Power-on self-test
  • Predictive Failure Analysis (PFA) alerts on memory and on SAS/SATA hard disk drives or solid-state drives
  • Redundant hot-swap power supplies and redundant hot-swap fans
  • Remind button to temporarily turn off the system-error LED
  • Remote system problem-determination support
  • ROM-based diagnostics
  • ROM checksums
  • Serial Presence Detect (SPD) on memory; vital product data (VPD) on the system board, power supply, hard disk drive or solid-state drive backplanes, microprocessor and memory expansion tray, and Ethernet adapters
  • Single-DIMM isolation of excessive correctable error or multi-bit error by the UEFI
  • Solid-state drives
  • Standby voltage for systems-management features and monitoring
  • Startup (boot) from LAN through remote initial program load (RIPL) or Dynamic Host Configuration Protocol/Bootstrap Protocol (DHCP/BOOTP)
  • System auto-configuring from the configuration menu
  • System-error logging (POST and IMM)
  • Systems-management monitoring through the Inter-Integrated Circuit (I2C) protocol bus
  • Uncorrectable error (UE) detection
  • Upgradeable UEFI, Dynamic System Analysis (DSA), and IMM firmware, locally or over the LAN
  • Wake on LAN capability
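
The memory downsizing behavior described in the list above can be pictured with a small simulation. This is only an illustrative sketch, not the IMM or UEFI firmware interface: the names `Dimm`, `Server`, `report_uncorrectable_error`, and `restart` are hypothetical, and the four 16 GB DIMMs are an assumed configuration. It shows the sequence the feature describes: an unrecoverable error is recorded, and on the next restart the error is logged and the affected memory is mapped out so that the server comes up with what remains.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Dimm:
    """One installed memory module (hypothetical model, not a firmware structure)."""
    slot: int
    size_gb: int
    mapped_out: bool = False


@dataclass
class Server:
    """Toy model of the restart sequence described for memory downsizing."""
    dimms: list[Dimm]
    imm_log: list[str] = field(default_factory=list)
    pending_ue_slots: set[int] = field(default_factory=set)

    def report_uncorrectable_error(self, slot: int) -> None:
        # The memory controller could not recover operationally:
        # remember the slot so it is handled on the next restart.
        self.pending_ue_slots.add(slot)

    def restart(self) -> int:
        """Log pending errors, map out the affected DIMMs, return usable GB."""
        for slot in sorted(self.pending_ue_slots):
            # Stand-in for the IMM logging the uncorrectable error.
            self.imm_log.append(f"Uncorrectable error on DIMM {slot}")
            # Stand-in for POST logically mapping out the memory.
            for dimm in self.dimms:
                if dimm.slot == slot:
                    dimm.mapped_out = True
        self.pending_ue_slots.clear()
        return sum(d.size_gb for d in self.dimms if not d.mapped_out)


# Assumed configuration: four 16 GB DIMMs in slots 1 to 4.
server = Server(dimms=[Dimm(slot=s, size_gb=16) for s in range(1, 5)])
server.report_uncorrectable_error(3)
usable_gb = server.restart()
```

In this sketch the server restarts with three of its four DIMMs, and the log holds a single entry for the slot that was mapped out. Real firmware tracks errors at a finer granularity than a whole DIMM slot; the slot-level model is a simplification chosen to keep the example short.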