Cornelis Technical Documentation

7.2.5.2. Performance Manager Parameters

The following tables describe opafm.xml parameters that can be used in the Pm subsection of either the Common or Fm sections.

Note

Any parameter that can be used in the Common.Shared section can also be used in the Common.Pm or Fm.Pm sections.

7.2.5.2.1. PM Controls

The parameters shown in the following table set up the options for when and how the PM monitors the fabric.

Table 29. PM Parameters

Parameter

Default Value

Description

ServiceLease

60

ServiceRecord lease with SA in seconds.

SweepInterval

10

The PM periodically sweeps the fabric and computes fabric statistics. If SweepInterval is set to 0, the PM does not perform sweeps; instead, when queried, it performs an immediate PMA operation.

Tools such as opatop require PM SweepInterval be non-zero.

The default in the sample FM configuration file is ten seconds. However, when upgrading from a previous FM release, if the FM configuration file does not specify this value, a default of 0 is used. This lets upgrades operate in a mode comparable to the existing configuration, and requires specific user action to enable the PM features introduced in release 6.0 and later.

MaxClients

3

The maximum number of concurrent PA client applications (for example, opareport, opatop, oparfm) running against the same PM/PA.

TotalImages (default 10)

FreezeFrameImages (default 5)

The PM can retain recent fabric topology and performance data. Each such dataset is referred to as an Image. Images allow for access to recent history and/or Freeze Frame by clients. Each image consumes memory, so care must be taken not to take an excessive amount of memory, especially for larger fabrics.

TotalImages - total images for history and freeze

FreezeFrameImages - max unique frozen images

FreezeFrameLease

60

If a PA client application hangs or dies, all of its frozen images are released after this lease time expires.

Specified in seconds.



7.2.5.2.2. PA Category Parameters

PM Thresholds

PM Threshold parameters are set for each PM category. The parameters shown in the following table set the threshold for each category; exceeding a category's threshold results in a log warning. Setting a threshold to 0 causes the given class of errors to be ignored.

Table 30. Threshold Parameters

Parameter

Default Value

Description

Integrity

100

Threshold for logging a warning indicating a possible error condition.

Congestion

100

Threshold for logging a warning indicating a possible error condition.

SmaCongestion

100

Threshold for logging a warning indicating a possible error condition.

Bubble

100

Threshold for logging a warning indicating a possible error condition.

Security

10

Threshold for logging a warning indicating a possible error condition.

Routing

100

Threshold for logging a warning indicating a possible error condition.



  • TrapThreshold

    TrapThreshold is configured to monitor the number of traps per minute for a port. If a given port exceeds the threshold value, the port is disabled as unstable.

    You can set the TrapThreshold to 0 to disable this feature.

  • TrapThresholdMinCount

    TrapThresholdMinCount defines the minimum number of traps required to reach the TrapThreshold rate. For example, if TrapThreshold is set to 10 traps per minute and TrapThresholdMinCount is set to 5, the port is disabled after 5 traps are received at the rate of 10 traps per minute (that is, 5 traps in 30 seconds).

    The TrapThresholdMinCount parameter value must be greater than 2; the default value is 10.

    If you set the TrapThreshold to 0, the TrapThresholdMinCount parameter is ignored.

    Larger values of TrapThresholdMinCount increase the accuracy of detecting the trap rate, but also increase the time between a trap surge and the SM disabling a port.

    Very small values of TrapThresholdMinCount can result in a port being disabled after a few traps.
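The relationship between TrapThreshold and TrapThresholdMinCount in the example above can be checked with a few lines of arithmetic. The following is a minimal sketch, not FM code; the function name trap_window_seconds is hypothetical and only restates the rule described in the text.

```python
from typing import Optional


def trap_window_seconds(trap_threshold: int, min_count: int) -> Optional[float]:
    """Seconds within which min_count traps must arrive to match the rate.

    trap_threshold is in traps per minute. Returns None when the feature
    is disabled (TrapThreshold of 0), in which case min_count is ignored.
    """
    if trap_threshold == 0:
        return None
    if min_count <= 2:
        raise ValueError("TrapThresholdMinCount must be greater than 2")
    return min_count / trap_threshold * 60.0


# The documented example: 10 traps per minute with a minimum count of 5.
print(trap_window_seconds(10, 5))   # 30.0
# With the default minimum count of 10, a full minute of traps is needed.
print(trap_window_seconds(10, 10))  # 60.0
```

A larger minimum count widens the window, which is why it improves rate accuracy at the cost of a slower reaction to a trap surge.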

Threshold Exceeded Message Limit

PM Threshold Exceeded Message Limit parameters limit how many ports that exceed their PM thresholds are logged per sweep. This helps avoid excessive log messages when extreme fabric problems occur. The parameters shown in the following table can be used in the ThresholdsExceededMsgLimit section.

Table 31. PM ThresholdsExceededMsgLimit Parameters

Parameter

Default Value

Description

Integrity

10

Maximum number of ports to log per PM sweep for exceeding the configured threshold. A value of 0 suppresses logging of this class of threshold exceeded errors.

Congestion

0

Maximum number of ports to log per PM sweep for exceeding the configured threshold. A value of 0 suppresses logging of this class of threshold exceeded errors.

SmaCongestion

0

Maximum number of ports to log per PM sweep for exceeding the configured threshold. A value of 0 suppresses logging of this class of threshold exceeded errors.

Bubble

0

Maximum number of ports to log per PM sweep for exceeding the configured threshold. A value of 0 suppresses logging of this class of threshold exceeded errors.

Security

10

Maximum number of ports to log per PM sweep for exceeding the configured threshold. A value of 0 suppresses logging of this class of threshold exceeded errors.

Routing

10

Maximum number of ports to log per PM sweep for exceeding the configured threshold. A value of 0 suppresses logging of this class of threshold exceeded errors.



Integrity Weights

PM Integrity Weight parameters control the weights for the individual counters that are combined to form the Integrity count. Setting a weight to 0 causes the counter to be ignored. The parameters shown in the following table can be used in the IntegrityWeights section.

Table 32. PM IntegrityWeights Parameters

Parameter

Default Value

Description

LocalLinkIntegrityErrors

0

Weight for LocalLinkIntegrityErrors counter.

0 causes the counter to be ignored.

RcvErrors

100

Weight for RcvErrors counter.

0 causes the counter to be ignored.

ExcessiveBufferOverruns

100

Weight for ExcessiveBufferOverruns counter.

0 causes the counter to be ignored.

LinkErrorRecovery

0

Weight for LinkErrorRecovery counter.

0 causes the counter to be ignored.

LinkDowned

25

Weight for LinkDowned counter.

0 causes the counter to be ignored.

UncorrectableErrors

100

Weight for UncorrectableErrors counter.

0 causes the counter to be ignored.

FMConfigErrors

100

Weight for FMConfigErrors counter.

0 causes the counter to be ignored.

LinkQualityIndicator

40

Weight applied to the LQI normalization equation. Calculation is 2^(5-LQI)-1.

0 causes the counter to be ignored.

LinkWidthDowngrade

100

Weight applied to the LWD normalization equation. Calculation is equal to the number of active lanes down:

LinkWidth.Active - LinkWidthDowngrade.RxActive

0 causes the counter to be ignored.
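The two normalization calculations in the table (for LinkQualityIndicator and LinkWidthDowngrade) can be evaluated as follows. This is a minimal sketch; the function names are hypothetical, the LQI range of 0 to 5 is an assumption made so the result stays an integer, and the way the FM then applies the weight to these values is not described in this section and is therefore not modeled.

```python
def lqi_term(lqi: int) -> int:
    """Normalized LQI value: 2^(5 - LQI) - 1.

    Assumption: LQI is restricted to 0..5 here, where 5 is the best quality.
    """
    if not 0 <= lqi <= 5:
        raise ValueError("this sketch assumes an LQI between 0 and 5")
    return 2 ** (5 - lqi) - 1


def lwd_term(link_width_active: int, rx_active: int) -> int:
    """Number of active lanes down: LinkWidth.Active - LinkWidthDowngrade.RxActive."""
    return link_width_active - rx_active


for lqi in range(5, -1, -1):
    print(f"LQI {lqi} -> {lqi_term(lqi)}")

# A link with 4 active lanes of which 3 are receiving has 1 lane down.
print(lwd_term(4, 3))
```

The LQI term grows exponentially as link quality drops (0, 1, 3, 7, 15, 31 for LQI 5 down to 0), so a poor link contributes far more than a slightly degraded one.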



Congestion Weights

The parameters shown in the following table control the weight given to each individual counter when computing congestion. The weighted counters are combined to form the Congestion count. Integrity errors can also cause congestion.

Percentage (Pct) means the specified counter is divided by the appropriate data transfer counter, then normalized to a predetermined range before being weighted and summed into the category.

Table 33. PM Congestion Weights Parameters

Parameter

Default Value

Description

XmitWaitPct

0

Weights for XmitWait counter divided by an associated data transfer counter and normalized to a predetermined range of 0.01% to 1%.

0 causes the counter to be ignored.

CongDiscards

100

Weights for SwPortCongestion counter.

0 causes the counter to be ignored.

RcvFECNPct

5

Weights for RcvFECN counter divided by an associated data transfer counter and normalized to a predetermined range of 0.1% to 10%.

0 causes the counter to be ignored.

RcvBECNPct

1

Weights for RcvBECN counter divided by an associated data transfer counter and normalized to a predetermined range of 0.1% to 10%.

0 causes the counter to be ignored.

XmitTimeCongPct

25

Weights for XmitTimeCongestion counter divided by an associated data transfer counter and normalized to a predetermined range of 0.1% to 10%.

0 causes the counter to be ignored.

MarkFECNPct

25

Weights for MarkFECN counter divided by an associated data transfer counter and normalized to a predetermined range of 0.1% to 10%.

0 causes the counter to be ignored.



7.2.5.2.3. PM Sweep Operation Control

The parameters shown in the following table control the operation of the PM during each sweep.

Table 34. PM Sweep Parameters

Parameter

Default Value

Description

Resolution

Resolution determines the number of LocalLinkIntegrity or LinkErrorRecovery errors that must occur before the PMA includes them in the ErrorCounterSummary. Most counters are 64 bits wide and are not expected to saturate during the uptime of a port.

.LocalLinkIntegrity

8000000

.LinkErrorRecovery

100000

ErrorClear

7

This controls when the PM clears PMA Error counters:

0 = clear when non-zero

1 = clear when 1/8 of individual counters max

2 = clear when 2/8 of individual counters max

...

7 = clear when 7/8 of individual counters max

ClearDataXfer

0

Enable clearing of Data Transfer Counters.

Clear64bit

0

Enable clearing of 64-bit Error Counters.

Clear32bit

1

Enable clearing of 32-bit Error Counters.

Clear8bit

1

Enable clearing of 8-bit Error Counters.

ProcessHFICounters

1

Enable processing (sweeping) of SuperNIC Counters.

ProcessVLCounters

1

Enable processing (sweeping) of VL Counters.

PmaBatchSize

2

Maximum concurrent PMA requests the PM can have in flight while querying the PMAs in the fabric.

MaxParallelNodes

10

Maximum nodes to concurrently issue parallel requests to a given PMA.

MaxAttempts (default 3)

RespTimeout (default 250)

MinRespTimeout (default 35)

The PM spends up to RespTimeout * MaxAttempts per packet. These parameters allow two modes of operation.

When MinRespTimeout is non-zero, the PM starts with MinRespTimeout as the time-out value for requests and, if the previous attempt timed out, uses multiples of this value for subsequent attempts. The PM keeps retrying as long as the cumulative sum of time-outs is less than RespTimeout multiplied by MaxAttempts. This approach is recommended because it reacts quickly to lost packets while still allowing adequate time for slower PMAs to respond.

When MinRespTimeout is zero, upon a time-out, up to MaxAttempts are attempted with each attempt having a time-out of RespTimeout. This approach is provided for backward compatibility with previous PM versions.

SweepErrorsLogThreshold

10

Maximum number of PMA node or Port warning messages to output per sweep with regard to nodes that cannot be properly queried.
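Two rows of the table above, ErrorClear and the MaxAttempts/RespTimeout/MinRespTimeout group, describe simple arithmetic that is easier to follow with numbers. The following is a rough sketch only; the function names are hypothetical and several details are assumptions rather than documented behavior: a counter's maximum is taken to be 2^bits - 1, fractions are rounded down, "multiples" of MinRespTimeout is read as 1x, 2x, 3x and so on, and the final attempt is not shortened to fit the remaining budget.

```python
from typing import List


def clear_threshold(error_clear: int, counter_bits: int) -> int:
    """Counter value at which the PM would clear a PMA error counter.

    ErrorClear 0 means "clear when non-zero"; 1..7 means n/8 of the
    counter's maximum (assumed here to be 2**counter_bits - 1).
    """
    if not 0 <= error_clear <= 7:
        raise ValueError("ErrorClear must be between 0 and 7")
    if error_clear == 0:
        return 1
    counter_max = (1 << counter_bits) - 1
    return counter_max * error_clear // 8


def retry_timeouts(max_attempts: int, resp_timeout: int,
                   min_resp_timeout: int) -> List[int]:
    """Per-attempt time-outs under the two documented modes."""
    if min_resp_timeout == 0:
        # Backward-compatible mode: fixed time-out, fixed attempt count.
        return [resp_timeout] * max_attempts
    budget = resp_timeout * max_attempts
    timeouts: List[int] = []
    elapsed = 0
    attempt = 1
    # Retry while the cumulative time-out is still below the budget.
    while elapsed < budget:
        timeout = min_resp_timeout * attempt  # assumed 1x, 2x, 3x, ...
        timeouts.append(timeout)
        elapsed += timeout
        attempt += 1
    return timeouts


print(clear_threshold(7, 8))       # default ErrorClear on an 8-bit counter
print(retry_timeouts(3, 250, 35))  # defaults from the table
print(retry_timeouts(3, 250, 0))   # backward-compatible mode
```

Under these assumptions the default settings start with a short 35-unit time-out and grow from there, whereas the backward-compatible mode always waits the full RespTimeout before each retry.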



7.2.5.2.4. PM Overrides of the Common.Shared Parameters

The Common.Shared parameters can be overridden in the PM using the parameters described in the following table.

Table 35. Additional PM Parameters

Parameter

Default Value

Description

LogLevel

2

NOTE: Overrides the Common.Shared LogLevel Settings.

Sets log level option for PM:

  • 0 = disable the vast majority of logging output

  • 1 = fatal, error, warn (syslog CRIT, ERR, WARN)

  • 2 = +notice, INFIINFO (progress messages) (syslog NOTICE, INFO)

  • 3 = +INFO (syslog DEBUG)

  • 4 = +VERBOSE and some packet data (syslog DEBUG)

  • 5 = +debug trace info (syslog DEBUG)

This parameter is ignored for the Embedded Fabric Manager. For information on configuring chassis logging options, refer to the CN5000 Commands Guide, log.

LogFile

NOTE: Overrides the Common.Shared LogFile Settings.

Sets log output location for PM. By default (or if this parameter is empty) log output is accomplished using syslog. However, if a LogFile is specified, logging is done to the given file. LogMode further controls logging. This parameter is ignored for the Embedded Fabric Manager. For information on configuring chassis logging options, refer to the CN5000 Commands Guide, log.

SyslogFacility

Local6

NOTE: Overrides the Common.Shared SyslogFacility Settings.

For the Host Fabric Manager, controls what syslog facility code is used for log messages. Allowed values are: auth, authpriv, cron, daemon, ftp, kern, local0-local7, lpr, mail, news, syslog, user, or uucp. For the Embedded Fabric Manager, this parameter is ignored.

ConfigConsistencyCheckLevel

2

Controls the Configuration Consistency Check for PM. If specified for an individual instance of PM, it overrides the Shared settings. Checking can be completely disabled, or it can be set to take action by deactivating the Secondary PM if its configuration does not pass the consistency check criteria.

  • 0 = disable Configuration Consistency Checking

  • 1 = enable Configuration Consistency Checking without taking action (only log a message)

  • 2 = enable Configuration Consistency Checking and take action (log message and shutdown Secondary PM)

Priority

0

0 to 15, higher wins.

ElevatedPriority

0 to 15, higher wins.



7.2.5.2.5. PM Short-Term History Parameters

The following parameters are used to enable and customize the off-loading of RAM-resident images to disk in order to preserve counter history for a long period of time.

Table 36. Short-Term History PM Parameters

Parameter

Default Value

Description

Enable

1

Enable Short-Term History.

StorageLocation

/var/usr/lib/opa-fm/pm0_pahistory

The absolute path where the history files will be stored.

StorageLocation is a PM instance-specific parameter (similar to TcpPort for the FE).

The default value is /var/usr/lib/opa-fm/pm0_pahistory (where '0' is the number of the instance). If there are multiple instances of the PM, then StorageLocation must be unique for each of them.

TotalHistory

24

The total number of hours of history that will be stored. This is a limit on the total amount of data stored, and not a limit on a file's age. PM downtime does not count toward the total time.

ImagesPerComposite

3

Determines how many images will be compounded into a single image as part of writing to the file. A higher number will save disk space but will result in lower data granularity. Must not be 0.

MaxDiskSpace

1024

A cap on how much disk space (in MiB) the short-term history is allowed to use. If this size is exceeded, the oldest files will be deleted to save space.

CompressionDivisions

8

Determines how many divisions are used to concurrently compress or decompress data. The recommended value is less than or equal to the number of processing cores on the management node. The value must not exceed 32.
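To get a feel for how TotalHistory, ImagesPerComposite, and MaxDiskSpace interact, the number of composite images needed to cover the history window can be estimated. This is a back-of-the-envelope sketch with a hypothetical function name; it assumes one image is produced per PM sweep and that the PM runs without downtime, neither of which is stated in the table.

```python
import math


def composites_for_history(total_history_hours: int,
                           sweep_interval_s: int,
                           images_per_composite: int) -> int:
    """Estimate how many composite images span the TotalHistory window.

    Assumes one image per sweep (an assumption, not documented here).
    """
    if images_per_composite == 0:
        raise ValueError("ImagesPerComposite must not be 0")
    if sweep_interval_s <= 0:
        raise ValueError("this estimate needs a non-zero SweepInterval")
    images = total_history_hours * 3600 // sweep_interval_s
    return math.ceil(images / images_per_composite)


# Defaults: 24 hours of history, 10-second sweeps, 3 images per composite.
count = composites_for_history(24, 10, 3)
print(count)

# Average space per composite if the default 1024 MiB cap is to hold them all.
print(round(1024 / count, 3), "MiB")
```

If the composites for a given fabric are larger than this average, the MaxDiskSpace cap is reached first and the oldest files are deleted before TotalHistory hours have accumulated.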



7.2.5.2.6. PM/PA Fail-over Parameters

The following parameters are used to enable and control the PM Fail-over extension.

Table 37. PM/PA Fail-over Parameters

Parameter

Default Value

Description

ImageUpdateInterval

5

ImageUpdateInterval is defined as the interval at which the Primary PM updates Standby PMs with an image (when an image is available); otherwise, image updates occur as they are available. Note that if multiple Standby PMs exist, the Primary PM updates each Standby concurrently during the ImageUpdateInterval. In the event PA Short-Term History is enabled on all PMs, the Primary PM will update all Standby PMs with disk-resident images once all RAM-resident images have been updated.

ImageUpdateInterval must be less than the PM SweepInterval in order for the Primary PM to be able to send images fast enough to keep up with new images as well as catch up on older images. If ImageUpdateInterval is greater than SweepInterval, then ImageUpdateInterval is set equal to SweepInterval and a warning message is logged.

Setting ImageUpdateInterval to 0 turns off the transfer of images to Standby PMs.

The default value for ImageUpdateInterval is SweepInterval / 2, rounded down to the nearest integer, with a minimum of 1.
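The rules above for the default value and for a value larger than SweepInterval can be summarized in a small function. This is a minimal sketch of the described behavior, not the FM implementation; the function name and the use of None for "not configured" are assumptions made for illustration.

```python
from typing import Optional


def effective_image_update_interval(sweep_interval: int,
                                    configured: Optional[int] = None) -> int:
    """Return the ImageUpdateInterval the PM would end up using.

    configured=None stands for "not specified in the configuration file".
    """
    if configured is None:
        # Default: SweepInterval / 2 rounded down, but at least 1.
        return max(1, sweep_interval // 2)
    if configured > sweep_interval:
        # Documented behavior: set equal to SweepInterval (a warning is logged).
        return sweep_interval
    # Includes 0, which turns off image transfer to Standby PMs.
    return configured


print(effective_image_update_interval(10))      # default SweepInterval
print(effective_image_update_interval(10, 15))  # too large, reduced
print(effective_image_update_interval(1))       # minimum of 1 applies
```

With the default SweepInterval of 10, the first call yields 5, which matches the default shown in the table.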



7.2.5.2.7. Additional PM Parameters for Debug and Development

The PM supports the parameters in the following table to aid diagnosis and debugging. Only use these parameters under the direction of your support representative.

Table 38. PM Debug Parameters

Parameter

Default Value

Description

Debug

0

NOTE: Overrides Debug setting from Common.Shared.

Additional parameters for debug/development use. This enables debugging modes for PM.

RmppDebug

0

NOTE: Overrides RmppDebug setting from Common.Shared.

If 1, then log additional PM info with regard to RMPP (Reliable Message Passing Protocol).

CS_LogMask (default 0x00000000)

MAI_LogMask (default 0x000001ff)

CAL_LogMask (default 0x000001ff)

DVR_LogMask (default 0x000001ff)

IF3_LogMask (default 0x000001ff)

SM_LogMask (default 0x000001ff)

SA_LogMask (default 0x000001ff)

PM_LogMask (default 0x000001ff)

PA_LogMask (default 0x000001ff)

FE_LogMask (default 0x000001ff)

APP_LogMask (default 0x000001ff)

Alternative to use of LogLevel. For advanced users, these parameters can provide more precise control over per subsystem logging. For typical configurations, these should be omitted and the LogLevel parameter should be used instead.

For each subsystem, there can be a LogMask. The mask selects severities of log messages to enable and is a sum of the following values:

0x1=fatal

0x2=actionable error

0x4=actionable warning

0x8=actionable notice

0x10=actionable info

0x20=error

0x40=warn

0x80=notice

0x100=progress

0x200=info

0x400=verbose

0x800=data

0x1000=debug1

0x2000=debug2

0x4000=debug3

0x8000=debug4

0x10000=func call

0x20000=func args

0x40000=func exit

For embedded Fabric Manager, the corresponding Chassis Logging must also be enabled and SM configuration applies to all managers.

For Host Fabric Manager, the Linux syslog service will need to have an appropriate level of logging enabled.
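Because each LogMask is a sum of the severity values listed above, a mask can be built or decoded with simple bit operations. The sketch below uses only the values from that list; the helper names are hypothetical.

```python
from typing import Iterable, List

# Severity bits exactly as listed in the LogMask description.
SEVERITY_BITS = {
    0x1: "fatal",
    0x2: "actionable error",
    0x4: "actionable warning",
    0x8: "actionable notice",
    0x10: "actionable info",
    0x20: "error",
    0x40: "warn",
    0x80: "notice",
    0x100: "progress",
    0x200: "info",
    0x400: "verbose",
    0x800: "data",
    0x1000: "debug1",
    0x2000: "debug2",
    0x4000: "debug3",
    0x8000: "debug4",
    0x10000: "func call",
    0x20000: "func args",
    0x40000: "func exit",
}


def decode_log_mask(mask: int) -> List[str]:
    """List the severities enabled by a LogMask value, lowest bit first."""
    return [name for bit, name in SEVERITY_BITS.items() if mask & bit]


def encode_log_mask(names: Iterable[str]) -> int:
    """Sum the bits for the given severity names into a LogMask value."""
    by_name = {name: bit for bit, name in SEVERITY_BITS.items()}
    mask = 0
    for name in names:
        mask |= by_name[name]
    return mask


print(decode_log_mask(0x000001ff))
print(hex(encode_log_mask(["fatal", "error", "warn"])))
```

Decoding the default 0x000001ff shows that it enables the nine severities from fatal through progress and leaves info, verbose, data, debug, and function tracing disabled.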