Cornelis Technical Documentation

6.1.1. CN5000 Fabric Performance Tuning Quick Start

The sections below summarize the main tunings for CN5000 performance, grouped by MPI/OPX, Verbs, and IPoIB. This is only a rough guide; individual clusters may require additional tunings, which are discussed in other sections of this guide.

6.1.1.1. Highest Priority Tunings

  • Review and apply BIOS Settings and Linux Settings.

  • Enable processor turbo mode, if possible.

  • Set the CPU frequency governor to “performance”. This applies with either the ACPI or the Intel P-State frequency driver:

    > cpupower -c all frequency-set -g performance
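To confirm that the governor change took effect on every core, the standard Linux cpufreq sysfs files can be read back. The following is a minimal sketch, not part of the Cornelis tooling: the function name `check_governors` and its optional directory argument are hypothetical, and the argument exists only so the check can be pointed at a different directory.

```shell
# Print every CPU whose scaling governor is not "performance".
# Returns 0 when all readable governors are "performance", 1 otherwise.
# $1 (hypothetical, optional): directory to scan instead of the sysfs default.
check_governors() {
    root="${1:-/sys/devices/system/cpu}"
    bad=0
    for f in "$root"/cpu[0-9]*/cpufreq/scaling_governor; do
        # Skip CPUs without a cpufreq directory (or an unmatched glob).
        [ -r "$f" ] || continue
        gov=$(cat "$f")
        if [ "$gov" != "performance" ]; then
            echo "$f: $gov"
            bad=1
        fi
    done
    return "$bad"
}
```

A typical use is `check_governors || echo "some CPUs are not using the performance governor"`, run after the `cpupower` command above.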

6.1.1.2. MPI Using the OPX Provider

  • Make sure the MPI is using libfabric (OFI) with the OPX Provider. See Intel MPI Library Settings.

  • Use the latest available Intel MPI Library for optimized application performance. Depending on the application, Open MPI may perform better in some cases.

  • Improved bandwidth is available by setting FI_OPX_HFISVC=1 in the MPI job.

    Note that this also requires loading the hfi1 driver with the module parameter use_bulksvc=Y. For details, see HFI1 Driver Parameters.
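The dependency between FI_OPX_HFISVC and the driver parameter can be sketched as a small wrapper that exports the variable only when the driver reports use_bulksvc=Y. This is an illustration under several assumptions: the helper name `enable_opx_hfisvc` is hypothetical, the parameter is assumed to be readable under /sys/module/hfi1/parameters/ (which depends on how the driver declares it), and FI_PROVIDER=opx is the generic libfabric way to select a provider rather than a setting taken from this guide.

```shell
# Export the OPX settings only if the hfi1 driver was loaded with use_bulksvc=Y.
# $1 (hypothetical, optional): path to read instead of the assumed sysfs file.
enable_opx_hfisvc() {
    param="${1:-/sys/module/hfi1/parameters/use_bulksvc}"
    if [ -r "$param" ] && [ "$(cat "$param")" = "Y" ]; then
        export FI_PROVIDER=opx     # generic libfabric provider selection (assumption)
        export FI_OPX_HFISVC=1     # setting named in this section
        return 0
    fi
    echo "hfi1 use_bulksvc is not Y; leaving FI_OPX_HFISVC unset" >&2
    return 1
}
```

In a job script this could precede the launch line, for example `enable_opx_hfisvc && mpirun -np 2 ./my_app`, where `./my_app` and the rank count are placeholders.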

6.1.1.3. Verbs

Improved bandwidth is available by loading the hfi1 driver with the module parameter use_bulksvc=Y. When using verbs, no additional flags are necessary. For details, see HFI1 Driver Parameters and Verbs.
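One common way to make a module parameter persistent on Linux is an `options` line in a modprobe.d file. The sketch below adds that line idempotently; the helper name `set_hfi1_bulksvc` and the file name /etc/modprobe.d/hfi1.conf are assumptions, so check HFI1 Driver Parameters for the location your installation uses.

```shell
# Append "options hfi1 use_bulksvc=Y" to a modprobe.d file unless already present.
# $1 (hypothetical, optional): config file to edit; the default name is an assumption.
set_hfi1_bulksvc() {
    conf="${1:-/etc/modprobe.d/hfi1.conf}"
    line="options hfi1 use_bulksvc=Y"
    # -x matches the whole line, -F treats the pattern as a fixed string.
    grep -qxF "$line" "$conf" 2>/dev/null || echo "$line" >> "$conf"
}
```

Run as root, `set_hfi1_bulksvc` takes effect the next time the hfi1 driver is loaded. If the driver is loaded from the initramfs, the initramfs may also need to be regenerated.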

6.1.1.4. IPoIB

  • Cornelis recommends using Connected Mode to achieve the highest single-threaded performance. The Connected Mode MTU size can be adjusted to 65520 bytes to achieve better bandwidth.

  • For the best multi-threaded bandwidth scaling, use Datagram Mode. For more information, see IPoIB Interface Configuration.
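The Connected Mode settings above can be applied at runtime through the standard IPoIB sysfs `mode` file and the `ip` tool. The following is a minimal sketch: the interface name `ib0` and both function names are assumptions, the changes require root, and they do not persist across reboots unless they are also placed in the network configuration.

```shell
# Switch an IPoIB interface to Connected Mode and raise its MTU to 65520 bytes.
# $1 (optional): interface name; "ib0" is an assumed default.
set_ipoib_connected() {
    ifname="${1:-ib0}"
    echo connected > "/sys/class/net/$ifname/mode" || return 1
    ip link set dev "$ifname" mtu 65520
}

# Report the current mode and MTU of an IPoIB interface.
# $2 (hypothetical, optional): directory to read instead of /sys/class/net.
ipoib_status() {
    ifname="${1:-ib0}"
    root="${2:-/sys/class/net}"
    mode=$(cat "$root/$ifname/mode") || return 1
    mtu=$(cat "$root/$ifname/mtu") || return 1
    echo "$ifname mode=$mode mtu=$mtu"
}
```

To switch to Datagram Mode instead, write `datagram` to the same `mode` file and set an MTU suitable for that mode.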