7.4.2. Link Troubleshooting
7.4.2.1. Debugging Physical Link Issues
Once you have identified link issues, the next step is to find their root cause. This section focuses on the CN5000 Omni-Path Fabric physical links and not PCIe bus link issues.
The Omni-Path reporting tools are robust, but new users may find it hard to distinguish error counters from actual failures.
From an installation perspective, it is important to watch for physical issues with cabling, both copper and optical. In general, a bend radius that is too tight, incomplete cable insertion, and physical compression or damage to cables can result in transmission issues. Omni-Path recovers from many of these issues transparently.
The following information can help you root-cause solid failures as well as marginal links. Most often the issue is resolved simply by re-installing a cable and verifying that it clicks into the connector socket on the SuperNIC or switch.
View the QSFP/cable details of a specific switch port using the command:
opasmaquery -o cableinfo -d 10 -l <lid> -m <switch portnumber>
To debug a particular switch, a useful technique is to get a snapshot of it using the command:
opareport -o snapshot -F portguid:value
Link issues may be the result of bad cables. Refer to the following sections for more information.
7.4.2.1.1. Omni-Path Link Transition Flow
To debug link issues, it is helpful to understand the four key link states, from Offline through to Active, the final state in which the link runs normally.
Note
The Fabric Manager, opafm, must be running to transition physical links from the Init state to the Active state. If you subsequently stop the Fabric Manager when a link is in the Active state, the link remains active. You can safely make changes to the opafm.xml file for the Fabric Manager and restart the service without dropping active links.
PortState:
Offline: Link down. QSFP not present or not visible to the SuperNIC driver.
Polling: Physical link training in progress. At this point you do not know whether the other end of the QSFP is connected to a working Omni-Path device.
Init: Link training has completed and both sides are present. The link is typically waiting for the Fabric Manager to enable it.
Active: Normal operating state of a fully functional link.
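As a quick reference while debugging, the four states can be mapped to a first thing to check. The following is a minimal sketch: the function name port_state_hint is hypothetical and not part of any Omni-Path tool, and the hints only paraphrase the state descriptions above.

```shell
# Map an Omni-Path PortState name to a suggested first check.
# port_state_hint is a hypothetical helper, not an Omni-Path command.
port_state_hint() {
    case "$1" in
        Offline) echo "Offline: check that the QSFP is seated and visible to the SuperNIC driver" ;;
        Polling) echo "Polling: link training in progress; check the far end of the cable" ;;
        Init)    echo "Init: link trained; check that the Fabric Manager (opafm) is running" ;;
        Active)  echo "Active: link is fully functional" ;;
        *)       echo "Unknown state: $1"; return 1 ;;
    esac
}
```

For example, port_state_hint Init points at the Fabric Manager, which matches the note above that opafm must be running for a link to move from Init to Active.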
7.4.2.1.2. Verify the Fabric Manager is Running
From the Management Node, run the following command to report all SuperNICs and Switches.
# opafabricinfo
If it fails, try the following steps:
Check the status of the Fabric Manager process using the command:
# systemctl status opafm
If the Fabric Manager is not running, start it using the command:
# systemctl start opafm
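The check-then-start sequence above can be wrapped in a small helper. This is a sketch under the assumption that opafabricinfo returns a non-zero exit status when it cannot query the fabric; the function name ensure_fm_running is hypothetical.

```shell
# Verify that fabric queries succeed; if not, show the opafm service
# status and try to start the service. Run as root, like the commands above.
# ensure_fm_running is a hypothetical helper name.
# Assumption: opafabricinfo exits non-zero when the query fails.
ensure_fm_running() {
    if opafabricinfo >/dev/null 2>&1; then
        echo "Fabric Manager is responding"
        return 0
    fi
    echo "opafabricinfo failed; checking the opafm service"
    # Status is informational only, so a non-zero result is ignored here.
    systemctl status opafm --no-pager || true
    # The return value of the function is the result of the start request.
    systemctl start opafm
}
```

After the service starts, allow the Fabric Manager some time before running opafabricinfo again.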
7.4.2.1.3. Check the State of SuperNIC Links from a Server
If you are debugging server link issues, the opainfo command gives a useful single-server view.
opainfo captures a variety of data useful for debugging server-related link issues. Other Omni-Path commands can extract the individual data elements, but opainfo is unique in the combination of data it provides.
PortState: See Omni-Path Link Transition Flow.
LinkWidth: A fully functional link should indicate Act:4 and En:4.
QSFP: Physical cable information for the QSFP, in the example below a 1 m passive copper cable from FCI Electronics.
Link Quality: Range = 0 - 5 where 5 is Excellent.
# opainfo
hfi1_0:1                           PortGID:0xfe80000000000000:001175010165b19c
   PortState:     Active
   LinkSpeed      Act: 25Gb         En: 25Gb
   LinkWidth      Act: 4            En: 4
   LinkWidthDnGrd ActTx: 4  Rx: 4   En: 3,4
   LCRC           Act: 14-bit       En: 14-bit,16-bit,48-bit       Mgmt: True
   LID:           0x00000001-0x00000001       SM LID: 0x00000002 SL: 0
   QSFP: PassiveCu, 1m  FCI Electronics   P/N 10131941-2010LF  Rev 5
   Xmit Data:              22581581 MB Pkts:           5100825193
   Recv Data:              18725619 MB Pkts:           4024569756
   Link Quality: 5 (Excellent)
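The fields called out above can be pulled from opainfo output with a short awk filter. This is a sketch only: check_opainfo is a hypothetical helper name, it assumes opainfo prints one field group per line with the labels shown in the sample, and it handles the output of a single port.

```shell
# Summarize link health from opainfo output read on stdin.
# check_opainfo is a hypothetical helper; field positions are assumed
# from the sample output and may differ between software releases.
check_opainfo() {
    awk '
        $1 == "PortState:"               { state = $2 }
        $1 == "LinkWidth"                { act = $3; en = $5 }
        $1 == "Link" && $2 == "Quality:" { quality = $3 }
        END {
            printf "state=%s width=%s/%s quality=%s\n", state, act, en, quality
            # Healthy per the criteria above: Active, Act 4 and En 4.
            ok = (state == "Active" && act == 4 && en == 4)
            exit (ok ? 0 : 1)
        }
    '
}
```

A possible use is opainfo | check_opainfo, which prints a one-line summary and returns a non-zero status when the port is not Active at x4 width. Link Quality is reported but not judged, because the text above gives only the range and not a pass threshold.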
7.4.2.1.4. Link Width, Downgrades, and opafm.xml
By default, Omni-Path links run in x4 link width mode. Omni-Path has a highly robust link mechanism, as compared to InfiniBand, and it allows links to run in reduced widths with no data loss.
Three things to know:
By default, the opafm.xml configuration file requires links to start up in x4 link width mode. This is configurable separately for SuperNIC and ISL links using the WidthPolicy parameter.
Link downgrade ranges are also configurable in the opafm.xml file, using the MaxDroppedLanes parameter.
Default configuration example: a link that successfully starts up in x4 width and subsequently downgrades to x3 width continues to operate. If the link is restarted (by a server reboot, for example) and attempts to run at less than x4 width, the Fabric Manager disables the link and it does not enter the Active state.
The opainfo command for SuperNICs is useful for checking the link width and link downgrade configuration on servers.
For a system view of all links that are running in less than x4 width mode, use the command:
# opareport -o errors -o slowlinks
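On a single server, a downgrade can also be spotted from the LinkWidthDnGrd line of the opainfo output. The sketch below assumes that line has the form "LinkWidthDnGrd ActTx: 4  Rx: 4   En: 3,4" and covers one port only; link_downgraded is a hypothetical helper name.

```shell
# Report whether a port is running below x4 width, based on the
# LinkWidthDnGrd line of opainfo output read on stdin.
# link_downgraded is a hypothetical helper; the line layout is assumed.
# Exit status: 0 = downgraded, 1 = full width, 2 = line not found.
link_downgraded() {
    awk '
        $1 == "LinkWidthDnGrd" { tx = $3; rx = $5; seen = 1 }
        END {
            if (!seen) { print "no LinkWidthDnGrd line found"; exit 2 }
            if (tx < 4 || rx < 4) {
                printf "downgraded: tx=x%s rx=x%s\n", tx, rx
                exit 0
            }
            print "full width: tx=x4 rx=x4"
            exit 1
        }
    '
}
```

A possible use is opainfo | link_downgraded && echo "inspect the cable". The exit status follows the grep convention, where 0 means the condition was found.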
7.4.2.1.5. How to Check Fabric Connectivity
For large fabrics, check the following flow in the topology spreadsheet:
All host nodes should be defined as Type = FI in column F of the spreadsheet. All Edge switches (CN5000 Switches located on the edge of the network) should be defined as Type = SW: in column L when the switch is the destination of a host-to-Edge link, and in column F when it is the source of a link from an Edge switch to a core switch that is itself an Edge switch. The following example shows a link between a host and an Edge switch.
R19 opahost1 1 FI R19 opaedge1 13 SW opahost1_opae1p13 1m Cable CU
All links from an Edge switch to a core switch that is itself an Edge switch should be defined as Type = SW, as shown in the following example:
row1 rack01 opaedge1 1 SW row1 rack04 opaedgecore1 2 SW opae1p1_opac1p2 5M Cable Fiber
All Director switches should be defined as Type = CL in column L (destination of a link from an Edge switch to a Director switch). Column J (Name-2) should have the destination leaf and column K should have the port number on that leaf. The following example shows a link from an Edge switch to a core that is a Director switch.
R19 opaedge1 5 SW R72 opadirector1 01 L105B 11 CL opae1p5opad1L105Bp11 30m Fiber
All Director Class Switches should be defined as shown in the following example:
Core Name:opadirector1 Core Group:row1 Core Rack:rack72 Core Size:1152 Core Full:0
Set Core Full to 0 if the Director switch is not fully populated with all the leafs and spines. If it is fully populated, set Core Full to 1.
7.4.2.1.6. Link Debug CLI Commands
| Task | CLI Command |
|---|---|
| Identify fabric errors. | |
| Identify slow links (< x4 width). | |
| Obtain LID of the switch. | Use |
| If a link is not coming up as Active, first bounce the link, then check the link state. | |
| Get detailed link info for all nodes connected to a Switch (edge) or leaf and their neighbor. | |
| Find links that are not plugged in or not seen by the interface. Find all links stuck in the Offline state. | |
| Find all links stuck in the Polling state. NOTE: A link stuck in Polling may indicate that the other end of the cable is not inserted correctly. In this case, typically, one end is Polling and the other end is Offline. | |
| To bounce a link, simulate a cable pull and re-insert on a server. NOTE: It may take up to 60 seconds for the port to re-enter the active state. | |
| Check status of local SuperNIC ports. | |
| | Run the commands with the |