Cornelis Technical Documentation

5.2.13. Fabric Manager Integrating Job Schedulers with Virtual Fabrics

Clusters deployed with multiple virtual fabrics may provide multiple QoS levels for compute jobs, security for separation of traffic, or QoS to separate compute from other fabric services such as storage and management. When a cluster is configured with multiple virtual fabrics, it is important to ensure that jobs are launched on the proper virtual fabric.

Many job schedulers provide integrated and automated mechanisms to assign, queue, and launch jobs within the proper virtual fabric. However, when launching jobs manually, or when using a job scheduler that requires the user to supply the final launch scripts (such as scripts that invoke mpirun), it may be necessary to identify the virtual fabric directly.

In such cases, it can generally be assumed that the whole job will run within a single virtual fabric. This implies:

  • A single PKey will be used for all communications within the job.

  • A single SL or set of SLs will be used for all communications within the job.

  • A single MTU may be used for all communications within the job.

As a shortcut, it can also be assumed in most cases that a single value for StaticRate, PktLifeTime, and other parameters is acceptable for the whole job. However, depending on the state of the fabric, this may not always be true.

Given these assumptions, the PKey, SL, and MTU for the desired virtual fabric can be determined and supplied to all ranks in the job. This is often done using parameters to mpirun.
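As a rough illustration of supplying the same values to every rank, the sketch below assembles an mpirun command line that exports one environment variable per setting. It assumes an Open MPI style launcher, where -x NAME=VALUE exports a variable to all ranks. The variable names (EXAMPLE_PKEY, EXAMPLE_SL, EXAMPLE_MTU) and the program name ./my_app are hypothetical placeholders; the names actually read by the MPI library or fabric provider in use must be taken from its documentation.

```python
import shlex


def mpirun_command(program, nranks, pkey, sl, mtu, env_names):
    """Build an mpirun argument list that exports PKey, SL, and MTU to all ranks.

    env_names maps "pkey", "sl", and "mtu" to the environment variable names
    that the MPI library or fabric provider reads. Those names are an input
    here; this sketch does not define them.
    """
    values = {"pkey": hex(pkey), "sl": str(sl), "mtu": str(mtu)}
    cmd = ["mpirun", "-np", str(nranks)]
    for key in ("pkey", "sl", "mtu"):
        # Open MPI style export of NAME=VALUE to every rank (assumed launcher).
        cmd += ["-x", f"{env_names[key]}={values[key]}"]
    cmd.append(program)
    return cmd


# Hypothetical placeholder names, for illustration only.
HYPOTHETICAL_ENV = {"pkey": "EXAMPLE_PKEY", "sl": "EXAMPLE_SL", "mtu": "EXAMPLE_MTU"}

cmd = mpirun_command("./my_app", 4, pkey=0x6, sl=1, mtu=8192, env_names=HYPOTHETICAL_ENV)
print(shlex.join(cmd))
```

Keeping the variable names as an input rather than hardcoding them makes the assumption explicit: only the single set of values per job comes from the text above, and the mechanism for passing them depends on the MPI in use.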

5.2.13.1. Determining PKey, SL, and MTU for a vFabric

The PKey, SL, and MTU for the desired virtual fabric can be determined in a few ways:

  • If the virtual fabric configuration explicitly specifies the PKey, SL, and MTU, these values can be hardcoded and supplied before the job is launched. If no MTU was specified, or an MTU of unlimited was specified, a value of 8192 can typically be used. Note that this approach is the most error-prone, but it is sometimes the easiest way to get started.

  • The parameters of the virtual fabric can be determined at job launch time using one of the Omni-Path tools. Various tools can provide the necessary information, such as:

    • opareport -o vfinfo: This command lists all active virtual fabrics. Its output can be scripted using the -x option and opaxmlextract. See opareport in the CN5000 Commands Guide for more information.

      opareport -o vfinfo
      Index: 0   Name: Networking  …..
      --------------------------------------------------------------------------
      Index: 1   Name: Default   …..
      --------------------------------------------------------------------------
      Index: 2   Name: Admin   …..
      --------------------------------------------------------------------------
      Index: 3   Name: Storage …..
      --------------------------------------------------------------------------
      Index: 4   Name: CheckPoint …..
      --------------------------------------------------------------------------
      Index: 5   Name: ComputeLow
      ServiceId: 0x0000000000000000  MGID: 0x0000000000000000:0x0000000000000000
      PKey: 0x3   SL: 0  Select: 0x3: PKEY SL   PktLifeTimeMult: 1
      MaxMtu:  8192  MaxRate: unlimited   Options: 0x03: Security QoS
      QOS: Bandwidth:  20%  PreemptionRank: 1  HoQLife:    8 ms
      --------------------------------------------------------------------------
      Index: 6   Name: ComputeHigh
      ServiceId: 0x0000000000000000  MGID: 0x0000000000000000:0x0000000000000000
      PKey: 0x6   SL: 1  Select: 0x3: PKEY SL   PktLifeTimeMult: 1
      MaxMtu:  8192  MaxRate: unlimited   Options: 0x03: Security QoS
      QOS: HighPriority  PreemptionRank: 2  HoQLife:    8 ms
      --------------------------------------------------------------------------
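The lookup described above can be sketched in a few lines of Python that read the plain-text report and pull out the PKey, SL, and MTU for a named virtual fabric. This is a minimal sketch, not a supported tool: it assumes the text layout matches the sample shown here, the function name parse_vfinfo is hypothetical, and treating a non-numeric MaxMtu (such as unlimited) as 8192 is an assumption based on the guidance in the first bullet. For production scripts, the -x option with opaxmlextract is likely to be more robust than matching on text.

```python
import re

# Two entries copied from the sample report above.
SAMPLE = """\
Index: 5   Name: ComputeLow
ServiceId: 0x0000000000000000  MGID: 0x0000000000000000:0x0000000000000000
PKey: 0x3   SL: 0  Select: 0x3: PKEY SL   PktLifeTimeMult: 1
MaxMtu:  8192  MaxRate: unlimited   Options: 0x03: Security QoS
QOS: Bandwidth:  20%  PreemptionRank: 1  HoQLife:    8 ms
--------------------------------------------------------------------------
Index: 6   Name: ComputeHigh
ServiceId: 0x0000000000000000  MGID: 0x0000000000000000:0x0000000000000000
PKey: 0x6   SL: 1  Select: 0x3: PKEY SL   PktLifeTimeMult: 1
MaxMtu:  8192  MaxRate: unlimited   Options: 0x03: Security QoS
QOS: HighPriority  PreemptionRank: 2  HoQLife:    8 ms
--------------------------------------------------------------------------
"""

DEFAULT_MTU = 8192  # fallback when MaxMtu is not a number (assumption)


def parse_vfinfo(text):
    """Return {name: {"pkey": int, "sl": int, "mtu": int}} from vfinfo text."""
    fabrics = {}
    current = None
    for line in text.splitlines():
        # An "Index: N   Name: X" line starts a new virtual fabric entry.
        m = re.match(r"\s*Index:\s*\d+\s+Name:\s*(\S+)", line)
        if m:
            current = fabrics.setdefault(m.group(1), {})
            continue
        if current is None:
            continue
        # Case-sensitive match, so the "PKEY SL" text after Select: is ignored.
        m = re.search(r"\bPKey:\s*(0x[0-9a-fA-F]+)\s+SL:\s*(\d+)", line)
        if m:
            current["pkey"] = int(m.group(1), 16)
            current["sl"] = int(m.group(2))
        m = re.search(r"\bMaxMtu:\s*(\S+)", line)
        if m:
            value = m.group(1)
            current["mtu"] = int(value) if value.isdigit() else DEFAULT_MTU
    return fabrics


vf = parse_vfinfo(SAMPLE)
print(vf["ComputeHigh"])
```

In practice the text would come from running opareport -o vfinfo at job launch time rather than from an embedded string, and the job script would look up the entry for the virtual fabric it intends to use.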