Cornelis Technical Documentation

5.2.11. Multiple Virtual Fabrics

The Subnet Manager supports an environment in which the SM administrator can assign Virtual Fabrics and Device Groups to individual tenants of a fabric. For this feature, the SM supports up to 1000 Virtual Fabrics and 1000 Device Groups. However, each SL must map to a unique VL, so the number of VLs the hardware supports limits the number of QoS Virtual Fabrics that can have their own QoS settings. To set up larger numbers of QoS Virtual Fabrics, configure them to share VLs; that is, multiple Virtual Fabrics share the same QoS policies. To simplify configuring up to 1000 QoS Virtual Fabrics that share QoS settings, the configuration file supports defining a set of QoS Groups. Each QoS Group defines one of the levels of QoS that the fabric supports, including the SL and bandwidth shared by all Virtual Fabrics in that group. The user then associates each Virtual Fabric with one of the configured QoS Groups.

Note

For current Cornelis hardware, which supports up to eight VLs, there can be no more than eight QoS Groups.

If you want to secure your CN5000 Omni-Path Fabric, refer to Virtual Fabrics for Multi-Tenancy.

5.2.11.1. Adding and Removing vFabrics

All VirtualFabrics must specify the name of an enabled QOSGroup and a PKey. For example:

<QOSGroups>
    <QOSGroup>
       <Name>Storage</Name>
       <Enable>1</Enable>
       <Bandwidth>20%</Bandwidth>
       <PreemptRank>1</PreemptRank>
    </QOSGroup>
</QOSGroups>

<VirtualFabrics>
    <VirtualFabric>
       <Name>Storage</Name>
       <Enable>1</Enable>
       <Security>1</Security>
       <PKey>0x0004</PKey>
       <LimitedMember>All</LimitedMember>
       <Application>Storage</Application>
       <QOSGroup>Storage</QOSGroup>
    </VirtualFabric>
</VirtualFabrics>
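The requirement above can be checked before a reload. The following is a minimal Python sketch, not part of the Fabric Manager tooling: the function name check_virtual_fabrics, the wrapping Fragment root element, and the Scratch and Bulk names are hypothetical, and counting only enabled QOSGroups against the eight-group limit is an assumption.

```python
import xml.etree.ElementTree as ET

MAX_QOS_GROUPS = 8  # limit stated in the note for hardware with eight VLs


def check_virtual_fabrics(xml_text):
    """Return a list of problem descriptions for a configuration fragment."""
    root = ET.fromstring(xml_text)

    # Names of enabled QOSGroup definitions (children of <QOSGroups>).
    enabled_groups = set()
    for groups in root.iter("QOSGroups"):
        for group in groups.findall("QOSGroup"):
            if group.findtext("Enable", "0").strip() == "1":
                enabled_groups.add(group.findtext("Name", "").strip())

    problems = []
    # Assumption: only enabled groups count against the limit.
    if len(enabled_groups) > MAX_QOS_GROUPS:
        problems.append(
            f"{len(enabled_groups)} enabled QOSGroups exceed {MAX_QOS_GROUPS}"
        )

    for vf in root.iter("VirtualFabric"):
        if vf.findtext("Enable", "0").strip() != "1":
            continue  # only enabled vFabrics are checked in this sketch
        name = vf.findtext("Name", "?").strip()
        if not vf.findtext("PKey", "").strip():
            problems.append(f"{name}: missing PKey")
        group = vf.findtext("QOSGroup", "").strip()
        if group not in enabled_groups:
            problems.append(f"{name}: QOSGroup '{group}' is not an enabled QOSGroup")
    return problems


# Hypothetical fragment: "Fragment" is only a wrapper so the text parses
# as a single XML document.
EXAMPLE = """
<Fragment>
  <QOSGroups>
    <QOSGroup><Name>Storage</Name><Enable>1</Enable></QOSGroup>
  </QOSGroups>
  <VirtualFabrics>
    <VirtualFabric>
      <Name>Storage</Name><Enable>1</Enable>
      <PKey>0x0004</PKey><QOSGroup>Storage</QOSGroup>
    </VirtualFabric>
    <VirtualFabric>
      <Name>Scratch</Name><Enable>1</Enable>
      <QOSGroup>Bulk</QOSGroup>
    </VirtualFabric>
  </VirtualFabrics>
</Fragment>
"""
problems = check_virtual_fabrics(EXAMPLE)
print(problems)
```

In this example the Storage vFabric passes, while the hypothetical Scratch vFabric is reported twice: once for the missing PKey and once for naming a QOSGroup that is not defined and enabled.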

A VirtualFabric can be added to the configuration file dynamically without restarting the Subnet Manager. After adding a vFabric that joins an existing QOSGroup, reload the configuration with systemctl reload opafm. The new vFabric is then active. In the same way, a VirtualFabric definition can be disabled or deleted entirely from the configuration file without restarting the Subnet Manager. If no other VirtualFabric shares the removed vFabric's PKey, that PKey is removed from the PKey tables of all members of the removed vFabric.

If a QOSGroup is enabled but no currently active vFabrics are associated with it, the QOSGroup continues to exist, but its SLs receive 0% Bandwidth. This allows vFabrics to be added to the QOSGroup in the future without altering the fabric's QoS policies.

Like VirtualFabrics, DeviceGroups can be added to the configuration file dynamically. This allows each tenant to have access to a defined group of nodes. A DeviceGroup can be deleted only if it is not in use by any Enabled VirtualFabric, PmPortGroup, or routing routine. Attempting to remove a DeviceGroup that is in use results in a reconfiguration error, and the Subnet Manager continues operating with the last valid configuration.

In configurations using multiple virtual fabrics, it is recommended that no VirtualFabric use the AllOthers Application. That is, no Application of an active VirtualFabric should have the UnmatchedMGID or UnmatchedServiceID Select. These Select flags associate all MGIDs or ServiceIDs that do not match any other VirtualFabric with the one containing AllOthers. However, if a VirtualFabric is dynamically added to the configuration, any MGID or ServiceID matched by its Applications is removed from the VirtualFabric that contained AllOthers. This can disrupt an active VirtualFabric.
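The effect described above can be illustrated with a small Python model. This is a sketch of the matching behavior as described in this section, not the Subnet Manager's implementation. The function name owning_vfabric, the vFabric names, and the ServiceID values are all hypothetical.

```python
def owning_vfabric(service_id, applications, catch_all):
    """Return the vFabric that a ServiceID is associated with.

    applications maps a vFabric name to the ServiceIDs its Applications
    match explicitly; catch_all names the vFabric that uses AllOthers.
    """
    for vfabric, service_ids in applications.items():
        if service_id in service_ids:
            return vfabric
    # Nothing matched explicitly, so the AllOthers vFabric takes it.
    return catch_all


fabrics = {"Storage": {0x1000}}
before = owning_vfabric(0x2000, fabrics, catch_all="Default")

# Dynamically adding a vFabric whose Application matches 0x2000 moves that
# ServiceID away from the vFabric that uses AllOthers.
fabrics["TenantB"] = {0x2000}
after = owning_vfabric(0x2000, fabrics, catch_all="Default")
print(before, "->", after)
```

Traffic that was using ServiceID 0x2000 in the catch-all vFabric before the change no longer belongs to it afterward, which is the disruption the recommendation is meant to avoid.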

5.2.11.2. Modifying the vFabric Configuration File for Multiple vFabrics

All QOSGroups that will run on the fabric should be configured in the vFabric configuration file (opafm_pp.xml). For example, if opafm_pp.xml contains a QOSGroup named Compute, the following file may exist at /etc/opa-fm/vfs/compute:

<VirtualFabric>
   <Name>Compute</Name>
   <Enable>1</Enable>
   <Security>1</Security>
   <PKey>0x0003</PKey>
   <Member>All</Member>
   <Application>Compute</Application>
   <QOSGroup>Compute</QOSGroup>
</VirtualFabric>

To merge these individual vFabric and DeviceGroup files into the XML, use the opafmconfigpp tool. By default, it parses /etc/opa-fm/opafm_pp.xml. When it reaches the INCLUDE comment lines, it replaces those comments with the contents of all files found in the specified directories. It then runs the config_check tool against the newly created configuration. If the configuration is valid, the XML with the merged vFabrics and DeviceGroups is copied to the Fabric Manager's default location, /etc/opa-fm/opafm.xml.

To add or remove virtual fabrics on an active fabric, add the new file to, or remove the existing file from, /etc/opa-fm/vfs, then rerun opafmconfigpp. If the resulting configuration is valid, reload the Subnet Manager with systemctl reload opafm. If there are redundant Subnet Managers in the fabric, copy the merged /etc/opa-fm/opafm.xml to the Standby hosts before reloading the Primary Fabric Manager.
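The regenerate-then-reload sequence can be scripted. The Python sketch below is one possible wrapper; the function name apply_vfabric_change is hypothetical. It assumes that opafmconfigpp exits with a non-zero status when the merged configuration is invalid, which this section does not state and which should be verified on your installation.

```python
import subprocess


def apply_vfabric_change(run=subprocess.run):
    """Regenerate opafm.xml and reload the Subnet Manager if that succeeded.

    Assumption: opafmconfigpp exits with a non-zero status when the merged
    configuration fails its check. Verify this on your installation.
    The run parameter exists so the sequence can be exercised without
    touching a live fabric.
    """
    # With no arguments the tool parses /etc/opa-fm/opafm_pp.xml by default.
    result = run(["opafmconfigpp"])
    if result.returncode != 0:
        return False  # do not reload; the running configuration is kept

    # On fabrics with redundant Subnet Managers, copy /etc/opa-fm/opafm.xml
    # to the Standby hosts at this point (not shown).
    run(["systemctl", "reload", "opafm"], check=True)
    return True
```

Calling apply_vfabric_change() with no arguments runs the real commands, so it needs the privileges those commands require. Passing a stand-in for run makes it possible to dry-run the ordering first.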

5.2.11.3. Multiple vFabrics with the Same PKey

Multiple vFabrics can be specified with the same PKey. When this occurs, the security for those vFabrics is the logical “OR” of their individual security settings. If security is off in both, there are no limited members (only full members). If security is on for either (or both), security is imposed for both.

When two vFabrics share the same PKey, the list of members is the combined list from both vFabrics: Members is the union of the Members of both, and LimitedMembers is the union of the LimitedMembers of both.
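The merge rules for a shared PKey can be modeled in a few lines of Python. This is an illustrative sketch, not the Subnet Manager's code. The VFabric class, the merge_by_pkey function, and the node names are hypothetical. Folding limited members into full members when security is off is an interpretation of "there are no limited members (only full members)".

```python
from dataclasses import dataclass, field


@dataclass
class VFabric:
    name: str
    pkey: int
    security: bool
    members: set = field(default_factory=set)
    limited_members: set = field(default_factory=set)


def merge_by_pkey(vfabrics):
    """Combine security and membership for vFabrics that share a PKey."""
    merged = {}
    for vf in vfabrics:
        entry = merged.setdefault(
            vf.pkey,
            {"security": False, "members": set(), "limited_members": set()},
        )
        # Security is the logical OR across all vFabrics on this PKey.
        entry["security"] = entry["security"] or vf.security
        # Membership lists are combined.
        entry["members"] |= vf.members
        entry["limited_members"] |= vf.limited_members

    for entry in merged.values():
        if not entry["security"]:
            # Interpretation: with security off there are only full members.
            entry["members"] |= entry["limited_members"]
            entry["limited_members"] = set()
    return merged


a = VFabric("TenantA", 0x0004, security=True,
            members={"node1"}, limited_members={"node2"})
b = VFabric("TenantB", 0x0004, security=False, members={"node3"})
result = merge_by_pkey([a, b])
print(result[0x0004])
```

Here TenantB has security off on its own, but because it shares PKey 0x0004 with TenantA, security applies to the combined membership.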

5.2.11.4. Sharing SLs Between Multiple vFabrics

Multiple QoS vFabrics can be configured to share SLs. The vFabrics sharing SLs must be configured with the same QoS settings (HighPriority, PreemptRank, FlowControlDisable, HoqLife, PktLifeTimeMult). Multicast isolation can be achieved within a vFabric by specifying a MulticastSL that differs from its BaseSL. If an SL is used for multicast isolation by one vFabric, it may not be specified as the BaseSL of another vFabric. It may, however, be used as another vFabric’s MulticastSL. If the user does not specify SLs for a QoS vFabric, it will be assigned a unique BaseSL.
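The rule that a multicast-isolation SL may not be another vFabric's BaseSL can be checked with a short Python sketch. The function name base_sl_conflicts, the dictionary keys, and the sample SL values are hypothetical. The sketch covers only this one rule and does not check the shared QoS settings.

```python
def base_sl_conflicts(vfabrics):
    """Return (name, sl) pairs whose BaseSL is used elsewhere for isolation.

    Each vFabric is a dict with "name", "base_sl", and an optional
    "multicast_sl" entry.
    """
    # SLs used for multicast isolation: a MulticastSL that differs from
    # the same vFabric's BaseSL.
    isolation_sls = {
        vf["multicast_sl"]
        for vf in vfabrics
        if vf.get("multicast_sl") is not None
        and vf["multicast_sl"] != vf["base_sl"]
    }
    return [
        (vf["name"], vf["base_sl"])
        for vf in vfabrics
        if vf["base_sl"] in isolation_sls
    ]


sample = [
    {"name": "A", "base_sl": 1, "multicast_sl": 2},  # isolates multicast on SL 2
    {"name": "B", "base_sl": 2},                     # conflict: BaseSL 2
    {"name": "C", "base_sl": 3, "multicast_sl": 2},  # allowed: shares MulticastSL
]
conflicts = base_sl_conflicts(sample)
print(conflicts)
```

Only vFabric B is reported, since reusing SL 2 as a MulticastSL (vFabric C) is permitted while using it as a BaseSL is not.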

If a unique MulticastSL is specified for a vFabric, the bandwidth assigned to the vFabric will be split evenly between the BaseSL and the MulticastSL. A vFabric may be configured to have 0% bandwidth if it shares SLs with another vFabric whose bandwidth is not zero.
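As a worked example of the even split, the Python sketch below computes the per-SL bandwidth for one vFabric. The function name sl_bandwidth is hypothetical. Treating "unique" as "different from the vFabric's own BaseSL" is an assumption.

```python
def sl_bandwidth(vf_bandwidth_pct, base_sl, multicast_sl=None):
    """Return a mapping of SL to bandwidth percentage for one vFabric."""
    # Assumption: "unique" means the MulticastSL differs from the BaseSL.
    if multicast_sl is None or multicast_sl == base_sl:
        return {base_sl: vf_bandwidth_pct}
    half = vf_bandwidth_pct / 2
    return {base_sl: half, multicast_sl: half}


split = sl_bandwidth(20, base_sl=1, multicast_sl=2)
unsplit = sl_bandwidth(20, base_sl=1)
print(split, unsplit)
```

A vFabric with 20% bandwidth and a separate MulticastSL therefore gets 10% on each SL, while the same vFabric without a separate MulticastSL keeps all 20% on its BaseSL.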

vFabrics without QoS settings may not specify SLs or bandwidth. They will all share the same BaseSL and one share of the unallocated bandwidth.