Cornelis Technical Documentation

7.3.1.2. Offline Maintenance

This section describes the hardware maintenance that must be performed when the Omni-Path hardware is offline.

7.3.1.2.1. DCS Chassis Replacement
7.3.1.2.1.1. Prerequisites
  • Ensure two or more people are available to lift the DCS when required.

  • Ensure that all cables are labeled properly prior to removal.
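One way to confirm the labeling prerequisite before any cable is pulled is to record each cable in a small inventory and check it for gaps. The sketch below is only an illustration, not part of any Cornelis tooling: the `CableRecord` type, the `find_label_problems` helper, and the label and port names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CableRecord:
    label: str  # text printed on the cable label (hypothetical naming scheme)
    port: str   # port the cable occupies before removal (hypothetical names)
    kind: str   # "power", "ethernet", or "isl"


def find_label_problems(records):
    """Return human-readable problems found in a cable inventory."""
    problems = []
    labels_seen = set()
    ports_seen = set()
    for rec in records:
        label = rec.label.strip()
        if not label:
            problems.append(f"port {rec.port}: cable has no label")
        elif label in labels_seen:
            problems.append(f"label {label!r} is used on more than one cable")
        labels_seen.add(label)
        if rec.port in ports_seen:
            problems.append(f"port {rec.port}: recorded more than once")
        ports_seen.add(rec.port)
    return problems


# Example inventory with one duplicated label and one unlabeled cable.
records = [
    CableRecord("ISL-01", "leaf1-p1", "isl"),
    CableRecord("ISL-01", "leaf1-p2", "isl"),
    CableRecord("", "mgmt-eth", "ethernet"),
]
for problem in find_label_problems(records):
    print(problem)
```

An empty result means every recorded cable has a unique, non-blank label and no port was recorded twice.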

7.3.1.2.1.2. Procedure
  1. Power down the chassis using the power button.

    [Figure: DCS_PS_modules_power_button.png]
  2. Disconnect all power and Ethernet cables.

  3. Remove all ISL cables.

  4. Remove all modules: leaf modules, spine modules, power supplies, and fans.

    Note

    Removing the modules reduces the weight of the chassis, which makes replacement easier.

  5. Unscrew the chassis from the rails/racks.

  6. Slide the chassis onto the lift platform.

  7. Install the replacement chassis, then reinsert the modules and reconnect the power cables.

  8. Check the firmware and update if needed.

  9. Reconnect all ISL cables.
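The firmware check in the procedure above comes down to comparing the installed version against the version required at your site. The following is a minimal sketch of that comparison only. It assumes dotted numeric version strings, which may not match the actual firmware version format, and it does not show how the installed version is read from the hardware. The function names `parse_version` and `needs_update` are hypothetical.

```python
def parse_version(text):
    """Split a dotted numeric version such as '10.2.1' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def needs_update(installed, required):
    """Return True when the installed version is older than the required one."""
    a = parse_version(installed)
    b = parse_version(required)
    # Pad the shorter tuple with zeros so that '10.2' equals '10.2.0'.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a < b


# Numeric comparison avoids the string-ordering trap where '10.2.9' > '10.10.0'.
print(needs_update("10.2.9", "10.10.0"))
```

Comparing integer tuples rather than raw strings keeps multi-digit components in the right order.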

7.3.1.2.2. Switch Replacement
7.3.1.2.2.1. Prerequisites
  • Ensure that all cables are labeled properly prior to removal.

7.3.1.2.2.2. Procedure
  1. Disconnect the power and Ethernet cables.

  2. Remove all remaining cables.

  3. Unscrew the CN5000 Switch from the rails/racks.

  4. Slide the CN5000 Switch onto a lab cart.

  5. Replace the CN5000 Switch.

  6. Reconnect the cables: Ethernet, Omni-Path, and power.

  7. Check the firmware and update as needed.
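After the cables are reconnected, the labels recorded before removal can be compared against what is now plugged in. The sketch below assumes both states were written down as simple label-to-port mappings. The `diff_cable_maps` helper and the label and port names are hypothetical and used only for illustration.

```python
def diff_cable_maps(before, after):
    """Compare label-to-port maps recorded before removal and after reinstallation."""
    issues = []
    for label, port in sorted(before.items()):
        if label not in after:
            issues.append(f"{label}: not reconnected (was on {port})")
        elif after[label] != port:
            issues.append(f"{label}: on {after[label]}, expected {port}")
    # Cables present now that were never recorded before removal.
    for label in sorted(set(after) - set(before)):
        issues.append(f"{label}: connected to {after[label]} but not in the original record")
    return issues


# Hypothetical labels and ports for a single switch.
before = {"ETH-1": "mgmt", "OPA-1": "port1", "PWR-A": "psu1"}
after = {"ETH-1": "mgmt", "OPA-1": "port2"}
for issue in diff_cable_maps(before, after):
    print(issue)
```

An empty list means every recorded cable is back on its original port.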

7.3.1.2.3. Remove a Liquid-Cooled Switch
7.3.1.2.3.1. Prerequisites

Ensure the Switch has been shut down through the cluster management framework. Allow the Switch to reach a safe internal temperature (passive cool-down).

7.3.1.2.3.2. Procedure
  1. Isolate and depressurize the cooling loop at the node.

    1. Close or isolate the branch of the cooling manifold feeding the node (if supported by rack/CDU design).

    2. Relieve local hydrostatic pressure at the quick-disconnect interface per service instructions.

      Note

      Ensure hoses are safely supported to avoid strain on the quick-disconnects.

  2. Disconnect the quick-disconnect couplers by depressing the release mechanism and separating the male and female halves. Switch-side fittings seal automatically when disconnected.

  3. To drain the Switch, remove both quick-disconnects from the Switch. Alternatively, attach a hose that is open to the atmosphere to each quick-disconnect.

  4. Remove the hose(s) if necessary, and cap all liquid ports.

  5. Remove the Switch (refer to Switch Replacement).

  6. Store or prepare the Switch for transport per Cornelis packaging requirements.

    Note

    Package the Switch to prevent physical damage, and insure the shipment for full replacement value.

7.3.1.2.4. SuperNIC Replacement
  1. Power down the server.

  2. Remove all cables.

  3. Slide the server out.

  4. Remove the cover from the server.

  5. Remove the PCIe riser adapter from the PCIe slot, if applicable.

  6. Remove the old SuperNIC from the slot.

  7. Insert the new SuperNIC.

  8. Reinsert the PCIe riser adapter into the PCIe slot, if applicable.

  9. Replace the cover on the server.

  10. Push the server back into the rack.

  11. Reinsert all cables.

  12. Power on the server.