
Install a GPU (trained technician only)

Use this information to install a GPU.

About this task

Important
Gap pad/putty pad replacement guidelines
  • To identify the gap pad/putty pad location and orientation, see:

  • Before replacing the gap pad/putty pad, gently clean the interface plate or the hardware surface with an alcohol cleaning pad.

  • Hold the gap pad/putty pad carefully to avoid deformation. Make sure no screw hole or opening is blocked by the gap pad/putty pad material.

  • Do not use expired putty pads. Check the expiry date on the putty pad package. If the putty pads have expired, acquire new ones before replacing them.

Required tools

Make sure you have the required tools listed below on hand to properly replace the component.

  • SD665-N V3 Water Loop Service Kit (The water loop carrier in the Service Kit is reusable. It is recommended to keep it at the facility where the server operates for future replacement needs.)

  • SD665-N V3 Water Loop Putty Pad Kit

  • SD665-N V3 SXM5 PCM Fixture

  • SXM5 PCM Kit (for removing PCM from GPU)

    Putty pads cannot be reused. Whenever the water loop is removed, putty pads must be replaced with new ones before reinstalling the water loop.

  • Screws and screwdrivers

    Prepare the following screwdrivers to ensure you can install and remove corresponding screws properly.
    Screwdriver Type                Screw Type
    Hex screwdriver                 6 mm hex head screwdriver
    Torx T10 head screwdriver       Torx T10 screw
    Phillips #1 head screwdriver    Phillips #1 screw
    Phillips #2 head screwdriver    Phillips #2 screw
Attention
  • Read Installation Guidelines and Safety inspection checklist to ensure that you work safely.

  • Turn off the DWC tray on which you are going to perform the task.

  • Disconnect all external cables from the enclosure.

  • Use extra force to disconnect QSFP cables if they are connected to the solution.

  • To avoid damaging the water loop, always use the water loop carrier when removing, installing or folding the water loop.

  • A torque screwdriver is available on request if you do not have one on hand.

The following illustration shows the GPU numbering.
Figure 1. GPU numbering
Firmware and driver download: You might need to update the firmware or driver after replacing a component.

Procedure

Note
Make sure to inspect the connectors and sockets on the GPU and the GPU board. Do not use the GPU or the GPU board if its connectors are damaged or missing, or if there is debris in the sockets. Replace the GPU or the GPU board with a new one before continuing the installation procedure.

  1. Gently place the GPU down on the GPU board; then, install the four Torx T15 screws with a torque screwdriver set to the proper torque.
    Note
    For reference, the torque required for the screws to be fully tightened/removed is 0.45-0.56 N-m (4.0-5.0 lbf-in).
    Figure 2. GPU installation

    Figure 3. GPU screw tightening sequence
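The torque note above gives the same range in newton metres and in pound-force inches. When a torque screwdriver is graduated in only one of those units, the conversion can be checked with a short script. The following is a minimal sketch, not part of the vendor procedure; the function names are hypothetical and the only inputs taken from this document are the 4.0 and 5.0 pound-force inch limits.

```python
# Conversion factor from the unit definitions:
# 1 lbf = 4.4482216152605 N and 1 in = 0.0254 m.
NM_PER_LBF_IN = 4.4482216152605 * 0.0254  # about 0.112985 N-m per lbf-in


def lbf_in_to_nm(lbf_in: float) -> float:
    """Convert a torque from pound-force inches to newton metres."""
    return lbf_in * NM_PER_LBF_IN


def nm_to_lbf_in(nm: float) -> float:
    """Convert a torque from newton metres to pound-force inches."""
    return nm / NM_PER_LBF_IN


if __name__ == "__main__":
    # Lower and upper limits quoted in the note for the GPU screws.
    for lbf_in in (4.0, 5.0):
        print(f"{lbf_in:.1f} lbf-in = {lbf_in_to_nm(lbf_in):.2f} N-m")
```

Rounded to two decimals, the two limits come out at 0.45 and 0.56 N-m, which agrees with the range in the note.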

Make sure to follow Gap pad/putty pad replacement guidelines.

  1. Replace the Phase Change Material (PCM) and putty pads on the GPU node water loop with new ones.
    1. Install the PCM jig on the GPU cold plate.
    2. Attach the PCM to the square opening of the jig.
    3. Repeat to replace the PCM of all four GPU cold plates.
      Attention
      • PCM cannot be reused. Replace the PCM with new material every time the water loop is removed.

      • After PCM is replaced, there is an expected short duration of throttling before the GPU returns to normal operation. This is due to the PCM requiring a break-in period after being replaced.

      Figure 4. Water loop GPU cold plate PCM replacement (GPU node)
    4. Replace the putty pads on the GPU node water loop.
      Attention
      Putty pads cannot be reused. Whenever the water loop is removed, putty pads must be replaced with new ones before reinstalling the water loop.
      Figure 5. Water loop putty pads replacement (GPU node)
    5. Replace the putty pads (x5) on the GPU. Make sure to align the putty pads with the GPU VR (1) and the markings on the GPU. Repeat to replace all putty pads on the four GPUs.
      1 GPU VR (Cover the GPU VR with putty pad)
      Attention
      Putty pads cannot be reused. Whenever the water loop is removed, putty pads must be replaced with new ones before reinstalling the water loop.
      Figure 6. GPU putty pads replacement
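Because neither PCM nor putty pads can be reused, it helps to count the consumables before opening the water loop. The sketch below tallies only the quantities stated in the step above (four GPU cold plates, five putty pads per GPU). It assumes one piece of PCM per cold plate, which the text implies but does not state, and it leaves out the water loop putty pads because their count is not given here. The names are hypothetical.

```python
GPU_COUNT = 4           # the step refers to "all four GPU cold plates" and "the four GPUs"
PCM_PER_COLD_PLATE = 1  # assumption: one PCM piece per GPU cold plate
PUTTY_PADS_PER_GPU = 5  # "Replace the putty pads (x5) on the GPU"


def consumables_needed(gpu_count: int = GPU_COUNT) -> dict[str, int]:
    """Return the number of new PCM pieces and GPU putty pads to prepare."""
    if gpu_count < 0:
        raise ValueError("gpu_count must not be negative")
    return {
        "pcm": gpu_count * PCM_PER_COLD_PLATE,
        "gpu_putty_pads": gpu_count * PUTTY_PADS_PER_GPU,
    }


if __name__ == "__main__":
    print(consumables_needed())
```

For a full node this gives 4 PCM pieces and 20 GPU putty pads, in addition to the water loop putty pads from the putty pad kit.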

Make sure to follow Gap pad/putty pad replacement guidelines.

  1. Unfold the water loop and place it onto the GPU node.
    Figure 7. Unfolding the water loop to GPU node
  2. Loosen water loop carrier screws (x20 Phillips #2 screws).
    Figure 8. Water loop screws and quick connect screws installation (GPU node)
  3. Remove the water loop carrier from the GPU node.
    Figure 9. Water loop carrier removal (GPU node)
  4. Install GPU cold plate screws (x16 T10 screws).
    1. Install GPU cold plate screws following the GPU cold plates installation sequence: GPU 2 > GPU 4 > GPU 1 > GPU 3
      Figure 10. GPU numbering
    2. Set the torque screwdriver to 1.5 +/- 0.5 lbf-in (0.1 +/- 0.06 N-m); then, fasten the GPU cold plate screws.
    3. Set the torque screwdriver to 3.5 +/- 0.5 lbf-in (0.4 +/- 0.06 N-m); then, fasten the GPU cold plate screws until all screws are fully tightened.
      Note
      Make sure to follow the screw sequence on the GPU cold plate label:
      Figure 11. GPU cold plate screw installation
  5. Follow the screw installation sequence specified on the network board label, and install network cold plate screws (x8 Torx T10 screws) with a torque screwdriver set to the proper torque.
    Note
    For reference, the torque required for the screws to be fully tightened/removed is 5.0 +/- 0.5 lbf-in (0.55 +/- 0.05 N-m).
    Figure 12. Network card screw installation
  6. Install the quick connect screws (x4 Torx T10) with a torque screwdriver set to the proper torque.
    Note
    For reference, the torque required for the screws to be fully tightened/removed is 5.0 +/- 0.5 lbf-in (0.55 +/- 0.05 N-m).
    Figure 13. Quick connect screw installation (GPU node)
  7. Install water loop screws and quick connect screws (x13 Torx T10 screws) with a torque screwdriver set to the proper torque.
    Note
    For reference, the torque required for the screws to be fully tightened/removed is 5.0 +/- 0.5 lbf-in (0.55 +/- 0.05 N-m).
    Figure 14. Water loop Torx T10 screws installation (GPU node)
  8. Install the Hex screw (x1) and the PH1 screws (x3).
    Note
    For reference, the torque required for the screws to be fully tightened/removed is 5.0 +/- 0.5 lbf-in (0.55 +/- 0.05 N-m).
    Figure 15. Water loop Hex and PH1 screws installation (GPU node)
  9. Install the cable tie to the GPU board.
    Figure 16. Installing the cable tie
  10. Connect the carrier board power cable.
    Figure 17. Connecting carrier board power cable
    Cable                          From (carrier board)             To (GPU node power distribution board)
    1 Carrier board power cable    Power and side band connector    Power connector
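The GPU cold plate screws in the procedure above are tightened in two passes (1.5, then 3.5 pound-force inches) and in a fixed GPU order (GPU 2 > GPU 4 > GPU 1 > GPU 3). The sketch below prints that sequence as a checklist. It rests on two assumptions that are not stated in the text: that the first pass is completed on all four cold plates before the second pass begins, and that the 16 screws are split evenly as four per cold plate. The screw index stands for the position in the sequence printed on the cold plate label, which remains the authority. All names are hypothetical.

```python
from typing import Iterator, NamedTuple

# 1 lbf = 4.4482216152605 N and 1 in = 0.0254 m.
NM_PER_LBF_IN = 4.4482216152605 * 0.0254

GPU_ORDER = (2, 4, 1, 3)          # cold plate installation sequence from the procedure
PASS_TORQUES_LBF_IN = (1.5, 3.5)  # first pass, then final pass
SCREWS_PER_COLD_PLATE = 4         # assumption: 16 screws shared by 4 cold plates


class Action(NamedTuple):
    pass_no: int
    torque_lbf_in: float
    gpu: int
    screw: int  # position in the sequence printed on the cold plate label


def tightening_plan() -> Iterator[Action]:
    """Yield one action per screw, pass by pass, in the documented GPU order."""
    for pass_no, torque in enumerate(PASS_TORQUES_LBF_IN, start=1):
        for gpu in GPU_ORDER:
            for screw in range(1, SCREWS_PER_COLD_PLATE + 1):
                yield Action(pass_no, torque, gpu, screw)


if __name__ == "__main__":
    for a in tightening_plan():
        nm = a.torque_lbf_in * NM_PER_LBF_IN
        print(f"pass {a.pass_no}: {a.torque_lbf_in} lbf-in ({nm:.2f} N-m), "
              f"GPU {a.gpu}, label position {a.screw}")
```

By straight unit conversion, 1.5 pound-force inches is about 0.17 N-m and 3.5 pound-force inches is about 0.40 N-m. The first-pass figure in newton metres given in the procedure is lower than that, so it is worth confirming the setting against the cold plate label when the torque screwdriver is graduated in N-m.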
After you finish
  1. Install the MCIO cables. Follow the guidance and routing information in Internal cable routing.

  2. Install the bus bar. See Install the bus bar.

  3. Install the cross braces. See Install the cross braces.

  4. Install the tray cover. See Install the tray cover.

  5. Install the tray into the enclosure. See Install a DWC tray in the enclosure.

  6. Connect all required external cables to the solution.
    Note
    Use extra force to connect QSFP cables to the solution.
  7. Check the power LED on each node to make sure it changes from fast blink to slow blink, which indicates that all nodes are ready to be powered on.
