GPU performance problems
In the event of high temperatures, the GPUs will self-throttle, which can cause a reduction in performance. Under normal operation this should never occur because the XCC actively monitors GPU temperatures and adjusts system fans accordingly. Throttling can still occur in the following scenarios:
A loss of power.
A Power Supply Throttle assertion (typically encountered if a power supply is too hot).
Inlet temperature exceeds the supported ASHRAE specification (for example, 35°C for ASHRAE A2).
Inlet temperature exceeds 27°C in combination with a fan failure.
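The scenarios above can be sketched as a simple check. This is only an illustration of the listed conditions, not XCC firmware logic: the `SystemState` class, its field names, and the `throttle_reasons` function are hypothetical, and the 35°C default is the ASHRAE A2 example from the text.

```python
from dataclasses import dataclass

# Inlet limit that applies when a fan has failed (from the list above).
FAN_FAILURE_INLET_LIMIT_C = 27.0


@dataclass
class SystemState:
    """Hypothetical snapshot of the conditions listed above."""
    power_lost: bool
    psu_throttle_asserted: bool
    inlet_temp_c: float
    fan_failed: bool
    ashrae_limit_c: float = 35.0  # ASHRAE A2 example; adjust for your class


def throttle_reasons(state):
    """Return the listed scenarios that apply to the given state."""
    reasons = []
    if state.power_lost:
        reasons.append("loss of power")
    if state.psu_throttle_asserted:
        reasons.append("power supply throttle assertion")
    if state.inlet_temp_c > state.ashrae_limit_c:
        reasons.append("inlet temperature above ASHRAE limit")
    if state.fan_failed and state.inlet_temp_c > FAN_FAILURE_INLET_LIMIT_C:
        reasons.append("inlet temperature above 27 C with fan failure")
    return reasons
```

For example, a 30°C inlet temperature is within the A2 limit on its own, but combined with a fan failure it matches the last scenario.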
To determine whether any of these scenarios has occurred, check the System Error LED and the XClarity Controller event log for errors related to redundancy, a degraded state, or a PCIe Power Brake.
Make sure that two 2000W power supplies are installed, powered, and operational (no errors).
Check the XClarity Controller event log for events related to fan failures. If errors occur, replace the failing fan.
Check the ambient temperature of the datacenter where the server is installed.
Check the PCIe power brake mode.
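The event log checks described in this section can be sketched as a keyword filter over exported log text. This is a minimal sketch under assumptions: the keywords come from the text above, but the one-entry-per-line format, the `find_throttle_events` function, and the sample entries are made up for illustration and are not actual XClarity Controller messages.

```python
# Keywords taken from the checks in this section.
KEYWORDS = ("redundancy", "degraded", "pcie power brake", "fan")


def find_throttle_events(log_lines):
    """Return the log lines that mention any keyword, ignoring case."""
    return [
        line for line in log_lines
        if any(keyword in line.lower() for keyword in KEYWORDS)
    ]


# Made-up sample entries, only to show how the filter behaves.
sample = [
    "Power supply redundancy lost",
    "User login successful",
    "PCIe Power Brake asserted",
]

for entry in find_throttle_events(sample):
    print(entry)
```

A plain substring match is deliberately loose, so treat its output as a shortlist to review in the event log rather than a diagnosis.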