
Install a GPU OAM

Use this information to install a GPU OAM. This procedure must be performed by a trained technician only.

About this task

Required tools

Make sure you have the required tools listed below on hand to properly replace the component.

  • Water loop kits

    • SD650-I V3 Water loop service kit (03KH870)

    • SD650-I V3 Water loop putty pad kit (03LD670)

  • Screws and screwdrivers

    Prepare the following screwdrivers to ensure you can install and remove corresponding screws properly.
    Screwdriver Type                  Screw Type
    Torx T10 head screwdriver         Torx T10 screw
    Phillips #1 head screwdriver      Phillips #1 screw
    Phillips #2 head screwdriver      Phillips #2 screw
Important
Gap pad/putty pad replacement guidelines
  • To identify the gap pad/putty pad location and orientation, see Gap pad and putty pad identification and location.

  • Before replacing the gap pad/putty pad, gently clean the interface plate or the hardware surface with an alcohol cleaning pad.

  • Hold the gap pad/putty pad carefully to avoid deformation. Make sure no screw hole or opening is blocked by the gap pad/putty pad material.

  • Do not use expired putty pads. Check the expiry date on the putty pad package. If the putty pads are expired, acquire new ones to properly replace them.

Note
To prevent potential thermal issues, change the Misc setting in the BIOS from Option3 (default value) to Option1 if the following two conditions are met:
  • The server is equipped with a GPU adapter.

  • The UEFI firmware version is USE126F or later.

To change the Misc setting, see https://support.lenovo.com/us/en/solutions/TT1832.
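The two conditions in the note above can be expressed as a small check. This is a minimal sketch, not a Lenovo tool: it assumes UEFI build IDs follow the shape of USE126F (a three-letter prefix, a numeric build, and a one-letter revision) and that later builds sort higher by number and then by letter. The function names are hypothetical.

```python
import re

# Assumption: a build ID is three letters, a number, and one revision letter,
# for example "USE126F". This format is inferred from the note, not documented.
_BUILD_ID = re.compile(r"^([A-Z]{3})(\d+)([A-Z])$")
THRESHOLD = "USE126F"  # version quoted in the note


def parse_build_id(build_id):
    """Split a build ID into (prefix, number, letter)."""
    match = _BUILD_ID.match(build_id.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized build ID: {build_id!r}")
    prefix, number, letter = match.groups()
    return prefix, int(number), letter


def needs_misc_option1(has_gpu_adapter, uefi_build_id):
    """Return True when both conditions from the note are met."""
    prefix, number, letter = parse_build_id(uefi_build_id)
    t_prefix, t_number, t_letter = parse_build_id(THRESHOLD)
    if prefix != t_prefix:
        raise ValueError("build ID belongs to a different firmware family")
    # Assumed ordering: compare the numeric build first, then the letter.
    is_threshold_or_later = (number, letter) >= (t_number, t_letter)
    return has_gpu_adapter and is_threshold_or_later
```

For example, `needs_misc_option1(True, "USE126F")` reports that the Misc setting should be changed to Option1, while a server without a GPU adapter never meets the conditions.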
Attention
  • Read Installation Guidelines and Safety inspection checklist to ensure that you work safely.

  • Turn off the DWC tray on which you are going to perform the task.

  • Disconnect all external cables from the enclosure.

  • Use extra force to disconnect QSFP cables if they are connected to the solution.

  • To avoid damaging the water loop, always use the water loop carrier when removing, installing or folding the water loop.

  • After updating the XCC firmware, perform a virtual reseat via SMM2 to optimize the system. See SMM2 User Guide.

The following illustration shows the GPU OAM numbering.
Figure 1. GPU OAM numbering
Firmware and driver download: You might need to update the firmware or driver after replacing a component.

Procedure

  1. Gently place the GPU OAM down on the carrier base board (CBB); then, install the four Torx T15 screws with a torque screwdriver set to the proper torque.
    Attention
    Follow the three-step GPU OAM installation method. When tightening the screws, follow the sequence shown in the picture below.
    1. First, set the torque screwdriver to 0.0981 N-m (0.868 lbf-in) and slightly tighten the screws.

    2. Second, set the torque screwdriver to 0.8829 N-m (7.8 lbf-in) and fully tighten the screws.

    3. Lastly, with the torque screwdriver still set to 0.8829 N-m (7.8 lbf-in), fasten each screw again to confirm that all screws are fully tightened.

    Figure 2. Screw tightening sequence for GPU OAM installation
  2. If there is any old thermal grease on the four GPU OAMs and the cold plates, gently clean the top of the four GPU OAMs and the cold plates with an alcohol cleaning pad.
  3. Use the syringe to apply thermal grease to the top of the four GPU OAMs by forming four dots spaced as shown. Each dot consists of about 0.15 ml of thermal grease.
    Figure 3. Thermal grease application
  4. Check the gap pads on the water loop. If any of them are damaged or detached, replace them with new ones.
    Figure 4. Water loop gap pads

Make sure to follow the gap pad/putty pad replacement guidelines.
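For planning purposes, the thermal grease quantity from the grease application step can be estimated with simple arithmetic. The sketch below assumes that the four dots of about 0.15 ml are applied to each of the four GPU OAMs; the procedure does not state the total explicitly, so treat the result as an estimate.

```python
def grease_volume_ml(oam_count=4, dots_per_oam=4, ml_per_dot=0.15):
    """Estimate the total thermal grease volume in millilitres.

    Assumption: every GPU OAM receives `dots_per_oam` dots of
    `ml_per_dot` millilitres each.
    """
    return oam_count * dots_per_oam * ml_per_dot


if __name__ == "__main__":
    per_oam = grease_volume_ml(oam_count=1)
    total = grease_volume_ml()
    print(f"Per GPU OAM: about {per_oam:.2f} ml")
    print(f"All four GPU OAMs: about {total:.2f} ml")
```

Under that assumption, each GPU OAM needs roughly 0.6 ml and the full set roughly 2.4 ml, which helps confirm the syringe holds enough grease before starting.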

  1. Replace the putty pads on the water loop with new ones.
    Note
    When attaching the putty pads to the GPU cold plate, align the putty pads with the markings on the GPU cold plate.
    Figure 5. Putty pad locations

Make sure to follow the gap pad/putty pad replacement guidelines.

  1. Unfold and install the water loop as shown.
    Figure 6. Water loop installation
  2. Loosen the water loop carrier screws (19x Phillips #2 screws).
    Figure 7. Loosening water loop carrier screws
  3. Carefully lift the water loop carrier up and away from the water loop.
    Figure 8. Water loop carrier removal
  4. Install the water loop screws (14x Torx T10 screws) with a torque screwdriver set to the proper torque.
    Note
    For reference, the torque required to fully tighten or remove the screws is 5.0 ± 0.5 lbf-in (0.55 ± 0.05 N-m).
    Figure 9. Water loop screws installation
  5. Install the following screws to secure the quick connect.
    • Two Torx T10 screws to secure the quick connect.

    • Four Torx T10 screws on the rear of the node.

    Figure 10. Quick connect screw installation
  6. Install the GPU OAM cold plate screws (16x Torx T15 screws).
    Figure 11. GPU OAM cold plate screw installation
    1. Push down the GPU OAM cold plate with your palm to reduce the gap between the GPU OAM cold plate and the GPU OAM.
    2. Press the torque screwdriver against the screw so that the screw is engaged with the GPU OAM.
    3. Follow the screw sequence specified on the GPU OAM cold plate label, and fasten each screw for 720 degrees with a torque screwdriver set to the proper torque and rpm.
      Note
      For reference, the torque required to fully tighten or remove the screws is 0.9 ± 0.06 N-m (8 ± 0.5 lbf-in). Set the torque screwdriver to low speed, 200 rpm.
      Figure 12. Fastening GPU OAM cold plate screws for 720 degrees
    4. Make sure that the GPU OAM cold plate is lowered into the node and its surface is flat without tilting. If the GPU OAM cold plate is tilted, unfasten the screws, and repeat Step 1 to Step 3.
    5. Repeat Step 3 until the screws are fully tightened.
    6. Make sure the height of each screw is 11.5 ± 0.3 mm (0.45 ± 0.01 inch) and that the screw is fully compressed. If not, repeat the GPU OAM cold plate installation steps.
      Figure 13. Height of properly installed GPU OAM cold plate screw
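The procedure quotes each torque in both N-m and lbf-in, and gives a height tolerance for the cold plate screws. The sketch below cross-checks the quoted torque pairs with the standard conversion factor (1 N-m is about 8.8507 lbf-in) and tests a measured screw height against the 11.5 ± 0.3 mm window. The helper names and the 5% rounding allowance are assumptions chosen for illustration; they are not part of the procedure.

```python
import math

LBF_IN_PER_N_M = 8.8507457676  # 1 N-m expressed in lbf-in

# (description, N-m, lbf-in) pairs as quoted in the procedure
TORQUE_SPECS = [
    ("GPU OAM screws, first pass", 0.0981, 0.868),
    ("GPU OAM screws, second and last pass", 0.8829, 7.8),
    ("Water loop screws", 0.55, 5.0),
    ("GPU OAM cold plate screws", 0.9, 8.0),
]


def n_m_to_lbf_in(n_m):
    """Convert a torque from newton-metres to pound-force inches."""
    return n_m * LBF_IN_PER_N_M


def spec_is_consistent(n_m, lbf_in, rel_tol=0.05):
    """Check that the two quoted values agree within a rounding allowance.

    rel_tol is an assumed allowance for rounding in the documentation.
    """
    return math.isclose(n_m_to_lbf_in(n_m), lbf_in, rel_tol=rel_tol)


def screw_height_ok(height_mm, nominal_mm=11.5, tol_mm=0.3):
    """Return True if a measured screw height is within nominal ± tolerance."""
    # A tiny epsilon avoids floating-point surprises at the exact limits.
    return abs(height_mm - nominal_mm) <= tol_mm + 1e-9


if __name__ == "__main__":
    for description, n_m, lbf_in in TORQUE_SPECS:
        converted = n_m_to_lbf_in(n_m)
        status = "consistent" if spec_is_consistent(n_m, lbf_in) else "check values"
        print(f"{description}: {n_m} N-m = {converted:.3f} lbf-in "
              f"(quoted {lbf_in} lbf-in, {status})")
```

A check like this is useful when the torque screwdriver is calibrated in only one unit: convert the setting once and confirm it matches the quoted value before fastening.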
After you finish
  1. Connect and route the cables in the tray. See GPU node cable routing.

  2. Install the cross braces. See Install the cross braces.

  3. Install the tray cover. See Install the tray cover.

  4. Install the tray into the enclosure. See Install a DWC tray in the enclosure.

  5. Connect all required external cables to the solution.
    Note
    Use extra force to connect QSFP cables to the solution.
  6. Check the power LED on each node to make sure it changes from fast blink to slow blink to indicate all nodes are ready to be powered on.

  7. After installing the GPU OAM, complete the following steps to finish the GPU OAM installation (trained technician only).
    • Update AMC and IFWI firmware to the latest version.

      1. Check the AMC and IFWI firmware versions of the newly installed GPU OAM. View the AMC firmware version in the XCC Web GUI and the IFWI firmware version via Intel® XPU Manager.

      2. The latest AMC and IFWI firmware can be found on the Drivers and Software download website for ThinkSystem SD650-I V3. If the AMC or IFWI firmware of the GPU OAM is not the latest version, proceed to the next step.

      3. Update AMC and IFWI firmware:

        • Update the AMC firmware via the XCC Web GUI, or via OneCLI with the following command, where FW_FILE_NAME is the AMC firmware file name. Make sure to place FW_FILE_NAME under the path /flash/; the files must include the .zip and .json files.
          OneCli update flash --forceid FW_FILE_NAME --checkdevice --dir /flash/ --output /flash/result

          After updating the AMC firmware, perform a virtual reseat. See SMM2 User Guide.

        • Update IFWI firmware via Intel® XPU Manager.
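The OneCLI command given for the AMC firmware update can be assembled and pre-checked from a script. This is a minimal sketch: the helper names are hypothetical, the argument list simply mirrors the command as written in this procedure, and the file check assumes that "must include .zip and .json files" means at least one file of each type is present in the directory.

```python
from pathlib import Path

REQUIRED_SUFFIXES = {".zip", ".json"}


def build_amc_update_command(fw_file_name, flash_dir="/flash/"):
    """Return the OneCLI argument list exactly as written in the procedure."""
    result_path = flash_dir.rstrip("/") + "/result"
    return [
        "OneCli", "update", "flash",
        "--forceid", fw_file_name,
        "--checkdevice",
        "--dir", flash_dir,
        "--output", result_path,
    ]


def missing_payload_types(flash_dir):
    """List the required file types (.zip, .json) not found in flash_dir.

    Assumption: one file of each type in the directory is sufficient.
    """
    present = {p.suffix.lower() for p in Path(flash_dir).iterdir() if p.is_file()}
    return sorted(REQUIRED_SUFFIXES - present)
```

A caller might run `missing_payload_types("/flash/")` first and only pass the list from `build_amc_update_command(...)` to `subprocess.run` when nothing is missing. The virtual reseat after the update still has to be performed as described above.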