7.3.2. Software/Firmware Maintenance
This section describes software and firmware maintenance, including upgrades, updates, and installation of new modules.
7.3.2.1. Download the Firmware
Download the firmware or software using the following procedure.
Using a web browser, go to the Cornelis Customer Center.
Under Download Library, clear the navigation filters.
In the search box, enter your search string (for example, "firmware").
The results are displayed.
Select one or more items and click Download Selected.
Review the Software License Agreement(s) and click Accept for each item.
The firmware is saved to your computer.
7.3.2.2. Install the SuperNIC Firmware Update Tool
7.3.2.2.1. Prerequisites
The OPX Software (containing updateAgent dependencies) has been installed on the target server.
7.3.2.2.2. Procedure
To install the SuperNIC Firmware Update Tool, perform the following steps.
Download and extract the Firmware Update TGZ package (contains updateAgent) from the Cornelis Customer Center.
Copy the updateAgent binary to the root home (/root) on the target server containing the SuperNIC you want to update.
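The two steps above can be sketched as a small shell function. This is an illustrative sketch, not a vendor-provided script: the function name install_update_agent is hypothetical, the archive name is passed in because it varies by release, and the location of updateAgent inside the archive is assumed to be unknown, so the function searches for it.

```shell
#!/bin/sh
# Sketch: extract a firmware-update TGZ and copy updateAgent into place.
# install_update_agent is a hypothetical helper, not part of the product.

install_update_agent() {
    archive=$1          # path to the downloaded TGZ package
    dest=${2:-/root}    # destination directory; the procedure uses /root

    workdir=$(mktemp -d) || return 1

    # Extract the gzip-compressed tar archive into a scratch directory.
    tar -xzf "$archive" -C "$workdir" || { rm -rf "$workdir"; return 1; }

    # The binary's position inside the archive is an assumption, so search for it.
    agent=$(find "$workdir" -type f -name updateAgent | head -n 1)
    if [ -z "$agent" ]; then
        echo "updateAgent not found in $archive" >&2
        rm -rf "$workdir"
        return 1
    fi

    # Copy the binary and make sure it is executable.
    cp "$agent" "$dest/updateAgent" && chmod +x "$dest/updateAgent"
    rc=$?
    rm -rf "$workdir"
    return $rc
}
```

For example, install_update_agent /path/to/package.tgz /root would leave the binary at /root/updateAgent, assuming the archive contains a file with that name.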
7.3.2.3. Update the SuperNIC Firmware
Use the SuperNIC Firmware Update Tool to update your SuperNIC firmware.
7.3.2.3.1. Prerequisites
The updateAgent must be copied to the server containing the SuperNIC you want to update.
The hfi1 driver must be loaded before using the updateAgent. If not, you will see errors like "Failed to open MAD port for HFI 0" and "Failed to build HFI list".
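The driver prerequisite can be checked before running the tool. The sketch below uses hypothetical helper names (hfi1_loaded, check_driver) and assumes the usual lsmod layout, with the module name in the first column. Matching on the first column is slightly stricter than a plain grep, which would also match modules that merely list hfi1 as a dependent.

```shell
#!/bin/sh
# Sketch: confirm the hfi1 module is loaded before running updateAgent.
# Both function names are hypothetical.

# hfi1_loaded: succeeds if lsmod-style text on stdin lists the hfi1 module.
hfi1_loaded() {
    # lsmod prints one module per line, with the module name in column 1.
    awk '$1 == "hfi1" { found = 1 } END { exit !found }'
}

# check_driver: run on the target server; reports whether hfi1 is loaded.
check_driver() {
    if lsmod | hfi1_loaded; then
        echo "hfi1 driver is loaded"
    else
        echo "hfi1 driver is not loaded; try: modprobe hfi1" >&2
        return 1
    fi
}
```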
7.3.2.3.2. Procedure
Obtain the SuperNIC firmware (CN5000_SuperNICFirmware-<version>.pkg) from the Cornelis Customer Center and copy the file to the server containing the SuperNIC you want to update.
Verify the hfi1 driver is loaded and working correctly.
lsmod | grep hfi1 should return results
opainfo should have entries for all SuperNICs in the system
If needed, load the hfi1 driver using modprobe hfi1, then recheck opainfo.
Check the current SuperNIC firmware version.
./updateAgent -V
HFI hfi1_0 activeComponentImageSetVersionString: <current version>
Note
You can use ./updateAgent -V -d all to display all of the SuperNICs on a server.
If the Fabric Manager is running anywhere in the fabric, disable the SuperNIC port.
opaportconfig disable -h1 -p2
Alternatively, you can stop the Fabric Manager.
systemctl stop opafm
Update the SuperNIC.
./updateAgent /path/to/firmware.pkg
Note
To update all SuperNICs on a server, you can use:
./updateAgent -d all /path/to/firmware.pkg
To update all SuperNICs in a fabric, you can use a tool such as pdsh, as shown in the following example:
pdsh -w <hostfile> updateAgent -d all /path/to/firmware.pkg
Check the current SuperNIC firmware version and verify that the new version status is pendingComponentImageSetVersionString.
./updateAgent -V
HFI hfi1_0 activeComponentImageSetVersionString: <old version>
HFI hfi1_0 pendingComponentImageSetVersionString:
Power cycle the server.
Check the current SuperNIC firmware version again and verify that the status is activeComponentImageSetVersionString.
./updateAgent -V
HFI hfi1_0 activeComponentImageSetVersionString: <new version>
If the firmware is still pending, power cycle the server using BMC.
If you stopped the Fabric Manager, restart it.
systemctl start opafm
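The version checks in this procedure can be scripted by parsing the text that ./updateAgent -V prints. The following is a minimal sketch under stated assumptions: the helper names active_version and has_pending are hypothetical, the output is assumed to follow the line layout shown in the steps above (HFI, device name, field name, value), and the version string is assumed to contain no spaces.

```shell
#!/bin/sh
# Sketch: interpret `updateAgent -V`-style output supplied on stdin.
# Helper names and the exact output layout are assumptions.

# active_version HFI: print the active firmware version for one HFI.
active_version() {
    awk -v hfi="$1" '
        $1 == "HFI" && $2 == hfi &&
        $3 == "activeComponentImageSetVersionString:" { print $4 }'
}

# has_pending HFI: succeed if a pending image line exists for the HFI,
# meaning the update is staged and a power cycle is still required.
has_pending() {
    awk -v hfi="$1" '
        $1 == "HFI" && $2 == hfi &&
        $3 == "pendingComponentImageSetVersionString:" { found = 1 }
        END { exit !found }'
}
```

On the target server, this could be used as ./updateAgent -V | active_version hfi1_0 before and after the power cycle, and ./updateAgent -V | has_pending hfi1_0 to decide whether a power cycle is still outstanding.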
7.3.2.4. Update the Switch Firmware
If you are updating both BMC and ASIC firmware, you must update the BMC firmware first.
Note
In the following instructions for the Pull Method, user@hostname implies DNS is configured on the switch. If using static IP addresses, replace hostname with the IP address of the server containing the .pkg file.
When using firmware update at the switch CLI to transfer ("pull") firmware from a remote server, you'll be prompted for the remote server password.
When using SCP on a remote server to transfer ("push") firmware to the switch, you'll be prompted for the switch password.
It may take up to 20 minutes (for CN5000 Switch) or 40 minutes (for DCS) for a firmware update to complete.
Perform the following steps to update your switch firmware.
Note
If you are updating multiple switches, repeat these steps for each switch.
Download and extract the target Switch firmware package files (BMC Firmware and/or Switch Firmware) from the Cornelis Customer Center onto a server on the same Ethernet network as the switch to be updated.
Note
During the initial installation of this new version (after the forced reboot) or when inserting new boards, the BMCs may require multiple updates and reboots (up to 2) to synchronize the firmware across the entire DCS. After this initial synchronization, future updates should only require a single final reboot to apply. As long as the ASIC is off, this process should be automatic.
Run the firmware update command to begin the BMC firmware update.
Pull Method: If not already logged in, log in to the switch using the admin account. Specify the user@hostname:/path/to/file.pkg path. Enter the password of the host when prompted.
admin@CNEdge -> firmware update user@hostname:/path/to/CN5000_BMCFirmware-<version>.pkg
root@hostname's password:
Copying firmware image to staging area...
Firmware update started. Wait (up to 20 minutes), check status with "firmware update -s", and initiate a reboot when ready
Push Method: Specify the admin@switchName:/tmp/images destination path. Enter the switch password if/when prompted.
[user@servername ~]# scp -O root/fw/CN5000_BMCFirmware-<version>.pkg admin@switchname:/tmp/images
admin@switchname's password:
Copying firmware image to staging area...
Firmware update started. Wait (up to 20 minutes), check status with "firmware update -s", and initiate a ‘reboot force’ when ready
Run the firmware update command to begin the switch (ASIC) firmware update.
Pull Method: If not already logged in, log in to the switch using the admin account. Specify the user@hostname:/path/to/file.pkg path. Enter the password of the host when prompted.
admin@CNEdge -> firmware update user@hostname:/path/to/CN5000_SwitchFirmware-<version>.pkg
root@hostname's password:
Copying firmware image to staging area...
Firmware update started. Wait (up to 20 minutes), check status with "firmware update -s", and initiate a 'reboot -f' when ready
Push Method: Specify the admin@switchName:/tmp/images destination path. Enter the switch password if/when prompted.
[user@servername ~]# scp -O root/fw/CN5000_SwitchFirmware-<version>.pkg admin@switchname:/tmp/images
admin@switchname's password:
Copying firmware image to staging area...
Firmware update started. Wait (up to 20 minutes), check status with "firmware update -s", and initiate a ‘reboot force’ when ready
Check the status of the update using firmware update -s.
admin@CNEdge -> firmware update -s
BMC:
Image 1: Booted and Active
Image 2: Currently updating
ASIC A:
Image 1: Booted and Active
Image 2: Currently updating
changes to
admin@CNEdge -> firmware update -s
BMC:
Image 1: Booted and Active
Image 2: Staged for update
ASIC A:
Image 1: Booted and Active
Image 2: Staged for update
When the firmware shows Staged for update, the switch is ready for reboot.
Reboot the switch.
admin@CNEdge -> reboot force
Rebooting in 1 second(s)
Lost Communication with server
Connection to <hostname> closed by remote host.
Connection to <hostname> closed.
Note
If you try to reboot before the firmware status is Staged for update, you will receive the following error:
Error during firmware update, rebooting now could be dangerous. Are you sure you wish to continue?
Type no and wait for the status to change.
After updating the switch firmware, check the versions.
admin@CNEdge -> firmware version
Firmware Versions:
Switch BMC Chip version: <current version>
ASIC Chip A version: <current version>
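The reboot decision in the procedure above depends on the text reported by firmware update -s. The sketch below shows one way to evaluate captured status text. The function name ready_for_reboot is hypothetical, and the check relies only on the two status phrases shown in the sample output, so it works however the text was captured (for example, pasted from the switch CLI session into a file).

```shell
#!/bin/sh
# Sketch: decide whether captured `firmware update -s` text indicates
# that the switch is ready for reboot. ready_for_reboot is hypothetical.

ready_for_reboot() {
    text=$(cat)   # read the captured status text from stdin
    case $text in
        *"Currently updating"*) return 1 ;;  # an image is still being written
        *"Staged for update"*)  return 0 ;;  # nothing in progress, update staged
        *)                      return 1 ;;  # nothing staged, nothing to apply
    esac
}
```

The first pattern is tested before the second so that a mixed state, with one component staged and another still updating, is reported as not ready.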
7.3.2.6. Remove OPX Software from a Host
If you need to remove the OPX Software from a host due to an issue with an OS upgrade, perform the following steps.
Uninstall the software using the command for your distribution.
Using RHEL:
sudo dnf remove <META_PACKAGE_NAME>
Using SLES:
sudo zypper remove <META_PACKAGE_NAME>
Using Ubuntu:
sudo apt remove <META_PACKAGE_NAME>
Remove kernel packages.
Using RHEL:
sudo dnf remove kmod-opxs-kernel-updates opxs-kernel-updates-devel
Using SLES:
sudo zypper remove kmod-opxs-kernel-updates opxs-kernel-updates-devel
Using Ubuntu:
sudo apt remove opxs-modules-dkms*
Remove residual packages.
Using RHEL:
sudo dnf remove opa-*
sudo dnf remove libfabric-*
sudo dnf remove libpsm2-*
Using SLES:
sudo zypper remove opa-*
sudo zypper remove libfabric-*
sudo zypper remove libpsm2-*
Using Ubuntu:
sudo apt remove opa-*
sudo apt remove libfabric-*
Reboot.
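The per-distribution commands above can be collected into a dry-run helper that prints, but does not run, the removal sequence for review. This is a sketch only: the function name removal_commands and the distribution keywords (rhel, sles, ubuntu) are hypothetical, and the package names are copied from the steps above. The patterns are printed in single quotes so that the shell passes them to the package manager instead of expanding them against files in the current directory.

```shell
#!/bin/sh
# Sketch: print the OPX removal commands for one distribution (dry run).
# removal_commands and its distribution keywords are hypothetical.

removal_commands() {
    distro=$1   # rhel, sles, or ubuntu
    meta=$2     # the meta package name for your installation

    # Select the package manager, then the kernel and residual packages.
    case $distro in
        rhel)
            pm=dnf
            set -- kmod-opxs-kernel-updates opxs-kernel-updates-devel \
                   'opa-*' 'libfabric-*' 'libpsm2-*'
            ;;
        sles)
            pm=zypper
            set -- kmod-opxs-kernel-updates opxs-kernel-updates-devel \
                   'opa-*' 'libfabric-*' 'libpsm2-*'
            ;;
        ubuntu)
            pm=apt
            set -- 'opxs-modules-dkms*' 'opa-*' 'libfabric-*'
            ;;
        *)
            echo "unsupported distribution: $distro" >&2
            return 1
            ;;
    esac

    # Meta package first, then kernel packages, then residual packages.
    printf "sudo %s remove '%s'\n" "$pm" "$meta"
    for pkg in "$@"; do
        printf "sudo %s remove '%s'\n" "$pm" "$pkg"
    done
}
```

Printing one command per package keeps the output easy to review; after checking it, the commands can be run manually, followed by the reboot.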