Remove a front H100/H200 GPU
Follow the instructions in this section to remove a front H100/H200 GPU. The procedure must be executed by a trained technician.
About this task
Attention
- Read Installation Guidelines and Safety inspection checklist to ensure that you work safely.
- Power off the server and peripheral devices and disconnect the power cords and all external cables. See Power off the server.
- If the server is installed in a rack, slide the server out on its rack slide rails to gain access to the top cover, or remove the chassis from the rack. See Remove the server from rack.
- This procedure requires two people and one on-site lifting device that can support up to 400 lb (181 kg). If you do not already have a lifting device available, Lenovo offers the Genie Lift GL-8 material lift, which can be purchased at Data Center Solution Configurator. Make sure to include the Foot-release brake and the Load Platform when ordering the Genie Lift GL-8 material lift.
- A torque screwdriver is available on request if you do not have one at hand.
Note
Make sure you have the required tools listed below available to properly replace the component:
- Torx T10 head screwdriver
- Torx T15 head screwdriver
- Phillips #1 head screwdriver
- Phillips #2 head screwdriver
- Flat head screwdriver
- Alcohol cleaning pad
- H100/H200 PCM Kit
- SR780a V3 water loop putty pad kit
- SR780a V3 water loop service kit
- H100/H200 GPU service fixture kit
Important
Putty pad/phase change material (PCM) replacement guidelines
- Before replacing the putty pad/PCM, gently clean the hardware surface with an alcohol cleaning pad.
- Hold the putty pad/PCM carefully to avoid deformation. Make sure no screw hole or opening is blocked by the putty pad/PCM.
- Do not use expired putty pads/PCM. Check the expiry date on the putty pad/PCM package. If the putty pads/PCM are expired, acquire new ones before replacing them.
The following illustration and table show the GPU numbering, the corresponding slot numbering in XCC, and the logical number reported by nvidia-smi.
Figure 1. GPU numbering


Physical GPU socket | Slot numbering in XCC | Logical number in nvidia-smi |
---|---|---|
GPU 1 | Slot 21 | 1 |
GPU 2 | Slot 24 | 2 |
GPU 3 | Slot 22 | 0 |
GPU 4 | Slot 23 | 3 |
GPU 5 | Slot 17 | 5 |
GPU 6 | Slot 20 | 6 |
GPU 7 | Slot 18 | 4 |
GPU 8 | Slot 19 | 7 |
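Because the logical number in nvidia-smi does not follow the physical socket order, it can help to look up the physical GPU before opening the chassis. The following Python snippet is a minimal sketch that only encodes the table above. The names `GPU_MAP`, `socket_from_smi_index`, and `socket_from_xcc_slot` are hypothetical and are not part of any Lenovo or NVIDIA tool.

```python
# Illustrative lookup built from the GPU numbering table above.
# Key: physical GPU socket. Value: (XCC slot number, nvidia-smi logical number).
# All names here are hypothetical helpers, not part of any vendor tool.
GPU_MAP = {
    1: (21, 1),
    2: (24, 2),
    3: (22, 0),
    4: (23, 3),
    5: (17, 5),
    6: (20, 6),
    7: (18, 4),
    8: (19, 7),
}


def socket_from_smi_index(smi_index: int) -> int:
    """Return the physical GPU socket for an nvidia-smi logical number."""
    for socket, (_slot, index) in GPU_MAP.items():
        if index == smi_index:
            return socket
    raise ValueError(f"No GPU with nvidia-smi logical number {smi_index}")


def socket_from_xcc_slot(xcc_slot: int) -> int:
    """Return the physical GPU socket for an XCC slot number."""
    for socket, (slot, _index) in GPU_MAP.items():
        if slot == xcc_slot:
            return socket
    raise ValueError(f"No GPU in XCC slot {xcc_slot}")


if __name__ == "__main__":
    # Example: a fault reported against nvidia-smi logical number 4.
    socket = socket_from_smi_index(4)
    slot, _ = GPU_MAP[socket]
    print(f"nvidia-smi GPU 4 is physical GPU {socket} (XCC Slot {slot})")
```

For example, a fault reported against logical number 0 in nvidia-smi corresponds to physical GPU 3, which XCC lists as Slot 22. Confirm the result against the table before removing any component.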
Procedure
After you finish
- Install a replacement unit. See Install a front H100/H200 GPU.
- If you are instructed to return the component or optional device, follow all packaging instructions, and use any packaging materials for shipping that are supplied to you.