6.9.2. NVIDIA GPU Tunings
For detailed information on NVIDIA GPU support, refer to the CN5000 Fabric Installation Guide, NVIDIA GPU Requirements.
6.9.2.1. IOMMU
For GPUDirect to function properly, NVIDIA recommends disabling PCIe Access Control Services (ACS), a setting associated with IO virtualization (also known as VT-d or IOMMU). If ACS is left enabled, unpredictable behavior such as application failures may occur. Refer to the NVIDIA documentation, PCI Access Control Services, for more information regarding this setting. Additionally, on AMD CPU platforms, you must set iommu=pt as a kernel boot parameter.
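The two settings above can be checked from a shell. The following is a minimal sketch, not a vendor-supplied tool: the helper names has_iommu_pt and count_acs_enabled are hypothetical, and the sketch assumes that lspci reports ACS state on lines containing "ACSCtl:" and marks enabled source validation as "SrcValid+".

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of any NVIDIA or CN5000 tool.

# Return 0 if the kernel command line passed as $1 contains iommu=pt
# as a whole, space-delimited parameter (so intel_iommu=pt does not match).
has_iommu_pt() {
    case " $1 " in
        *" iommu=pt "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Read `lspci -vvv` output on stdin and print how many ACSCtl lines
# report SrcValid+ (assumed here to indicate that ACS is enabled).
count_acs_enabled() {
    grep 'ACSCtl:' | grep -cF 'SrcValid+'
}

# Example usage on a live system (commented out so that sourcing this
# file has no side effects):
#   has_iommu_pt "$(cat /proc/cmdline)" && echo "iommu=pt is set"
#   sudo lspci -vvv | count_acs_enabled
```

A nonzero count from count_acs_enabled suggests that ACS is still enabled on at least one PCIe device and that the platform settings should be reviewed.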
6.9.2.2. Persistence Mode (Legacy)
Persistence mode is recommended for NVIDIA GPU‑accelerated applications, especially short‑lived jobs, because it keeps the GPU driver loaded even when no active CUDA contexts exist. This avoids repeated driver initialization, which reduces startup latency and makes performance more predictable across workloads. To query the current persistence mode setting, use:
nvidia-smi --query-gpu=persistence_mode --format=csv
If persistence mode is not already enabled, enable it with the following command (root privileges are required):
nvidia-smi -pm 1
To disable persistence mode and allow the driver to unload when idle, use:
nvidia-smi -pm 0
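The query and enable steps above can be combined so that persistence mode is only changed when needed. This is a minimal sketch under stated assumptions: all_persistence_enabled is a hypothetical helper name, and the query is assumed to be run with --format=csv,noheader so that each output line is "Enabled" or "Disabled" for one GPU.

```shell
#!/bin/sh
# Hypothetical helper for illustration; not part of nvidia-smi.

# Read per-GPU persistence mode values on stdin, one per line.
# Return 0 only if at least one GPU is listed and every GPU reports "Enabled".
all_persistence_enabled() {
    awk '
        NF == 0 { next }              # skip blank lines
        { seen = 1 }                  # at least one GPU was reported
        $1 != "Enabled" { bad = 1 }   # any other value counts as not enabled
        END { if (seen && !bad) exit 0; exit 1 }
    '
}

# Example usage on a live system (commented out so that sourcing this
# file has no side effects):
#   if ! nvidia-smi --query-gpu=persistence_mode --format=csv,noheader \
#           | all_persistence_enabled; then
#       sudo nvidia-smi -pm 1
#   fi
```

Treating empty output as "not enabled" is a deliberate choice in this sketch, so that a failed query does not silently pass the check.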