- The NUMA topology is shown using the command below:
[root@cld-dfa-gpu-04 ~]# lscpu | grep NUMA
NUMA node(s):        2
NUMA node0 CPU(s):   0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38,40,42,44,46,48,50,52,54,56,58,60,62,64,66,68,70,72,74,76,78,80,82,84,86,88,90,92,94,96,98,100,102,104,106,108,110
NUMA node1 CPU(s):   1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39,41,43,45,47,49,51,53,55,57,59,61,63,65,67,69,71,73,75,77,79,81,83,85,87,89,91,93,95,97,99,101,103,105,107,109,111
similar output is obtained with the command below (after yum install numactl):
[root@cld-dfa-gpu-04 ~]# numactl --hardware
available: 2 nodes (0-1)
node 0 cpus: 0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 52 54 56 58 60 62 64 66 68 70 72 74 76 78 80 82 84 86 88 90 92 94 96 98 100 102 104 106 108 110
node 0 size: 192368 MB
node 0 free: 182480 MB
node 1 cpus: 1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45 47 49 51 53 55 57 59 61 63 65 67 69 71 73 75 77 79 81 83 85 87 89 91 93 95 97 99 101 103 105 107 109 111
node 1 size: 193520 MB
node 1 free: 185797 MB
node distances:
node   0   1
  0:  10  20
  1:  20  10
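the same topology is also exposed under sysfs; a minimal sketch that prints the CPU list of each NUMA node from there:
# print the CPU list of every NUMA node known to the kernel,
# as exposed under /sys/devices/system/node:
for n in /sys/devices/system/node/node[0-9]*; do
    echo "${n##*/}: $(cat "${n}/cpulist")"
done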
- to check the CPU affinity of the GPU(s), issue the command below:
[root@cld-dfa-gpu-04 Samples]# nvidia-smi topo -m
GPU0 GPU1 CPU Affinity NUMA Affinity
GPU0 X SYS 0,2,4,6,8,10 0
GPU1 SYS X 1,3,5,7,9,11 1
Legend:
X = Self
SYS = Connection traversing PCIe as well as the SMP interconnect between NUMA nodes (e.g., QPI/UPI)
NODE = Connection traversing PCIe as well as the interconnect between PCIe Host Bridges within a NUMA node
PHB = Connection traversing PCIe as well as a PCIe Host Bridge (typically the CPU)
PXB = Connection traversing multiple PCIe bridges (without traversing the PCIe Host Bridge)
PIX = Connection traversing at most a single PCIe bridge
NV# = Connection traversing a bonded set of # NVLinks
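the affinity reported above can be used to keep a GPU job NUMA-local; a minimal sketch (./gpu_app is a placeholder for the actual workload) that selects GPU1 and binds the process to its NUMA node 1:
# expose only GPU1 to the CUDA application and bind both its CPUs and
# its memory allocations to NUMA node 1, the node reported by
# "nvidia-smi topo -m" for that GPU (./gpu_app is a placeholder):
CUDA_VISIBLE_DEVICES=1 numactl --cpunodebind=1 --membind=1 ./gpu_app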
the CPU affinity shown in the nvidia-smi output is truncated; another way to display it in full is to look at the sysfs file below:
[root@cld-dfa-gpu-04 ~]# cat /sys/bus/pci/devices/0000:17:00.0/local_cpulist
0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38,40,42,44,46,48,50,52,54,56,58,60,62,64,66,68,70,72,74,76,78,80,82,84,86,88,90,92,94,96,98,100,102,104,106,108,110
[root@cld-dfa-gpu-04 ~]# cat /sys/bus/pci/devices/0000:ca:00.0/local_cpulist
1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39,41,43,45,47,49,51,53,55,57,59,61,63,65,67,69,71,73,75,77,79,81,83,85,87,89,91,93,95,97,99,101,103,105,107,109,111
where 0000:17:00.0 and 0000:ca:00.0 identify the GPUs through their PCI bus IDs (found e.g. with the "lspci | grep NVI" command).
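this lookup can be scripted for all GPUs at once; a minimal sketch that walks every NVIDIA PCI function found by lspci and prints its NUMA node and local CPU list from sysfs:
# for every NVIDIA PCI function, print its bus ID, NUMA node and local
# CPU list; "lspci -D" includes the PCI domain in the identifier:
for bdf in $(lspci -D | grep -i nvidia | awk '{print $1}'); do
    echo "${bdf}: NUMA node $(cat "/sys/bus/pci/devices/${bdf}/numa_node")"
    echo "    local CPUs: $(cat "/sys/bus/pci/devices/${bdf}/local_cpulist")"
done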
- to verify which GPU is attached to a VM and which NUMA node it is associated with, execute the following commands from the hypervisor:
[root@cld-dfa-gpu-04 ~]# virsh dumpxml instance-001f81d1
...
<hostdev mode='subsystem' type='pci' managed='yes'>
  <driver name='vfio'/>
  <source>
    <address domain='0x0000' bus='0x17' slot='0x00' function='0x0'/>
  </source>
[root@cld-dfa-gpu-04 ~]# virsh nodedev-dumpxml pci_0000_17_00_0
...
<product id='0x25b6'>GA107GL [A2 / A16]</product>
<vendor id='0x10de'>NVIDIA Corporation</vendor>
<capability type='virt_functions' maxCount='16'/>
<iommuGroup number='20'>
  <address domain='0x0000' bus='0x17' slot='0x00' function='0x0'/>
</iommuGroup>
<numa node='0'/>
i.e. look at the hostdev element with the vfio driver to find the source PCI bus ID, and then look in more detail at the pci_0000_17_00_0 device to find the GPU model and the NUMA node associated with it.
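equivalently, once the PCI bus ID has been extracted from the hostdev element, the NUMA node can be read directly from sysfs, e.g. for the GPU above:
# NUMA node of the passed-through GPU, read straight from sysfs
# (prints 0 here, matching the <numa node='0'/> element above):
cat /sys/bus/pci/devices/0000:17:00.0/numa_node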