4.4. System Information

4.4.1. CPU Pinning for VMs

QEMU uses one pthread per virtual CPU to actually run the VM, plus additional pthreads for management. For best performance, make sure that the cores running the fast path dataplane are used only for that purpose.

To get the thread associated with each virtual CPU, use info cpus in the QEMU monitor console:

QEMU 2.3.0 monitor - type 'help' for more information
(qemu) info cpus
* CPU #0: pc=0xffffffff8104f596 (halted) thread_id=26773
  CPU #1: pc=0x00007faee19be9f9 thread_id=26774
  CPU #2: pc=0xffffffff8104f596 (halted) thread_id=26775
  CPU #3: pc=0x0000000000530233 thread_id=26776
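
If you need this mapping in a script, the output can be parsed with a few lines of Python. The following is a minimal sketch that assumes the line format shown above (QEMU 2.3.0; other versions may print it differently). The function name parse_info_cpus is hypothetical.

```python
import re

# Sample text copied from the monitor output above.
INFO_CPUS = """\
* CPU #0: pc=0xffffffff8104f596 (halted) thread_id=26773
  CPU #1: pc=0x00007faee19be9f9 thread_id=26774
  CPU #2: pc=0xffffffff8104f596 (halted) thread_id=26775
  CPU #3: pc=0x0000000000530233 thread_id=26776
"""

# Assumed line format: "CPU #<n>: ... thread_id=<tid>".
_CPU_LINE = re.compile(r"CPU #(\d+):.*\bthread_id=(\d+)")


def parse_info_cpus(text):
    """Return a dict mapping each vCPU index to its host thread ID."""
    mapping = {}
    for line in text.splitlines():
        match = _CPU_LINE.search(line)
        if match:
            mapping[int(match.group(1))] = int(match.group(2))
    return mapping


if __name__ == "__main__":
    for vcpu, tid in sorted(parse_info_cpus(INFO_CPUS).items()):
        print(f"vCPU {vcpu} -> thread {tid}")
```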

To list all the threads of your running VM (including management threads) and the CPUs they are currently allowed to run on, run:

# taskset -ap <qemu_pid>
pid 26770's current affinity mask: f
pid 26771's current affinity mask: f
pid 26773's current affinity mask: f
pid 26774's current affinity mask: f
pid 26775's current affinity mask: f
pid 26776's current affinity mask: f
pid 27053's current affinity mask: f
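
The affinity mask printed by taskset is a hexadecimal bitmask in which bit N stands for CPU N, so f means CPUs 0 to 3. The sketch below converts between masks and CPU lists; the helper names are hypothetical and only illustrate the encoding.

```python
def mask_to_cpus(mask_hex):
    """Convert a hexadecimal affinity mask to a sorted list of CPU indices."""
    mask = int(mask_hex, 16)
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


def cpus_to_mask(cpus):
    """Convert an iterable of CPU indices to a hexadecimal affinity mask."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return format(mask, "x")


if __name__ == "__main__":
    print(mask_to_cpus("f"))
    print(cpus_to_mask([0, 1]))
```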

Pinning the VM to a specific set of cores reduces contention with other workloads running on the host.

You can restrict QEMU to a specific set of cores when starting it:

# taskset -c 0-1 <qemu command>

You can also pin a VM after it has been started, using the IDs of its threads. For instance, to change the physical CPUs on which virtual CPU #0 is allowed to run, use:

# taskset -cp 0-1 26773
pid 26773's current affinity list: 0-3
pid 26773's new affinity list: 0,1
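
The same change can be made from a script. This is a minimal sketch using Python's os.sched_getaffinity and os.sched_setaffinity, which are available on Linux and accept a thread ID in place of a PID. The function name pin_thread is hypothetical, and the demonstration pins the script itself rather than a QEMU thread; pinning threads of another user's process requires suitable privileges.

```python
import os


def pin_thread(tid, cpus):
    """Pin thread `tid` to the CPU set `cpus`; return (old, new) affinity sets."""
    old = os.sched_getaffinity(tid)
    os.sched_setaffinity(tid, cpus)
    return old, os.sched_getaffinity(tid)


if __name__ == "__main__":
    # Demonstration on the calling process itself (0 means "the caller").
    # For a vCPU thread, pass its thread ID instead, for example 26773.
    allowed = os.sched_getaffinity(0)
    old, new = pin_thread(0, {min(allowed)})
    print(f"current affinity list: {sorted(old)}")
    print(f"new affinity list: {sorted(new)}")
    os.sched_setaffinity(0, old)  # restore the original affinity
```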

Note

Refer to the taskset manpage for specific options.

When using libvirt, you can use <cputune> with vcpupin to pin virtual CPUs to physical ones, for example:

<vcpu cpuset='7-8,27-28'>4</vcpu>
<cputune>
  <vcpupin vcpu="0" cpuset="7"/>
  <vcpupin vcpu="1" cpuset="8"/>
  <vcpupin vcpu="2" cpuset="27"/>
  <vcpupin vcpu="3" cpuset="28"/>
</cputune>
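
When the pinning is computed by a tool, the <cputune> element can be generated rather than written by hand. The sketch below builds it with Python's standard xml.etree.ElementTree module from a vCPU-to-host-CPU mapping. The function name build_cputune is hypothetical, and the mapping reuses the values from the example above.

```python
import xml.etree.ElementTree as ET


def build_cputune(pinning):
    """Build a <cputune> element from a {vcpu: host_cpu} mapping."""
    cputune = ET.Element("cputune")
    for vcpu, host_cpu in sorted(pinning.items()):
        ET.SubElement(cputune, "vcpupin", vcpu=str(vcpu), cpuset=str(host_cpu))
    return cputune


if __name__ == "__main__":
    element = build_cputune({0: 7, 1: 8, 2: 27, 3: 28})
    if hasattr(ET, "indent"):  # pretty-printing needs Python 3.9 or later
        ET.indent(element)
    print(ET.tostring(element, encoding="unicode"))
```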

Note

Refer to the libvirt Domain XML format documentation for further details.

You can look at the htop output (filtered for this QEMU instance) to confirm which threads are actually busy:

  PID USER       VIRT   RES   SHR S CPU% MEM%   TIME+  NLWP Command
26770 mazon     7032M 4067M  7092 S 200. 25.5  2h19:55    7 |- qemu-system-x86_64 -daemonize --enable-kvm -m 6G -cpu host -smp sockets=1,cores=4,threads=1 ...
27053 mazon     7032M 4067M  7092 S  0.0 25.5  0:01.13    7 |  |- qemu-system-x86_64 -daemonize --enable-kvm -m 6G -cpu host -smp sockets=1,cores=4,threads=1 ...
26776 mazon     7032M 4067M  7092 R 99.1 25.5  1h04:38    7 |  |- qemu-system-x86_64 -daemonize --enable-kvm -m 6G -cpu host -smp sockets=1,cores=4,threads=1 ...
26775 mazon     7032M 4067M  7092 S  0.9 25.5  2:48.21    7 |  |- qemu-system-x86_64 -daemonize --enable-kvm -m 6G -cpu host -smp sockets=1,cores=4,threads=1 ...
26774 mazon     7032M 4067M  7092 R 99.1 25.5  1h09:42    7 |  |- qemu-system-x86_64 -daemonize --enable-kvm -m 6G -cpu host -smp sockets=1,cores=4,threads=1 ...
26773 mazon     7032M 4067M  7092 S  0.0 25.5  2:23.03    7 |  |- qemu-system-x86_64 -daemonize --enable-kvm -m 6G -cpu host -smp sockets=1,cores=4,threads=1 ...
26771 mazon     7032M 4067M  7092 S  0.0 25.5  0:00.00    7 |  |- qemu-system-x86_64 -daemonize --enable-kvm -m 6G -cpu host -smp sockets=1,cores=4,threads=1 ...

You can also change the CPU affinity from htop by pressing a while the line of a specific PID is selected.

Similarly, you can get the thread IDs by listing /proc/<pid>/task/, for example:

# ls /proc/26773/task
26770/  26771/  26773/  26774/  26775/  26776/  27053/
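
Combining the /proc listing with the affinity query gives roughly the information that taskset -ap prints. The following is a sketch for Linux only; the function names are hypothetical, and the demonstration inspects the script's own process rather than a QEMU instance.

```python
import os


def list_thread_ids(pid):
    """Return the sorted thread IDs found in /proc/<pid>/task."""
    return sorted(int(name) for name in os.listdir(f"/proc/{pid}/task"))


def thread_affinities(pid):
    """Return {tid: sorted list of allowed CPUs} for every thread of `pid`."""
    result = {}
    for tid in list_thread_ids(pid):
        try:
            result[tid] = sorted(os.sched_getaffinity(tid))
        except ProcessLookupError:
            # The thread exited between the listing and the query.
            continue
    return result


if __name__ == "__main__":
    # Replace os.getpid() with the PID of the QEMU process to inspect a VM.
    for tid, cpus in thread_affinities(os.getpid()).items():
        print(f"tid {tid}: allowed CPUs {cpus}")
```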

4.4.2. Hardware statistics

The show interface hardware-statistics command displays the statistics reported by the network drivers (for those that support it).

To display the statistics for a given port, use show interface hardware-statistics:

# show interface hardware-statistics
...
------------- iface1 -------------
rx_good_packets: 11246553
tx_good_packets: 11272871
rx_good_bytes: 3925735284
tx_good_bytes: 3925615259
rx_missed_errors: 0
rx_errors: 6
tx_errors: 0
rx_mbuf_allocation_errors: 0
rx_q0_packets: 11246553
rx_q0_bytes: 3925735284
rx_q0_errors: 0
tx_q0_packets: 11272871
tx_q0_bytes: 3925615259
...
rx_total_packets: 192087588
rx_total_bytes: 14776198828
tx_total_packets: 11272871
...

The most important statistics are {r,t}x_good_{packets,bytes} and the error counters, such as {r,t}x_errors, rx_mbuf_allocation_errors and rx_missed_errors.

They give an overall indication of how well the port is handling packets.

There are also per-queue statistics, which are useful when multiqueue is in use. Ideally, packets are spread over as many queues as possible, but the distribution depends on various factors, such as the IP addresses and the UDP/TCP ports.
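
To judge how evenly traffic is spread, the per-queue packet counters can be turned into shares of the total. This is a minimal sketch that assumes counters named rx_q<N>_packets and tx_q<N>_packets, as in the output above. The function name queue_shares is hypothetical, and the two-queue data set in the demonstration is made up for illustration.

```python
import re

# Assumed counter naming, taken from the sample output: rx_q0_packets, tx_q0_packets, ...
_QUEUE_COUNTER = re.compile(r"^(rx|tx)_q(\d+)_packets$")


def queue_shares(stats, direction="rx"):
    """Return {queue index: fraction of packets} for one direction."""
    counts = {}
    for name, value in stats.items():
        match = _QUEUE_COUNTER.match(name)
        if match and match.group(1) == direction:
            counts[int(match.group(2))] = value
    total = sum(counts.values())
    if total == 0:
        return {}
    return {queue: count / total for queue, count in sorted(counts.items())}


if __name__ == "__main__":
    # Values from the sample output: a single RX queue carries everything.
    print(queue_shares({"rx_q0_packets": 11246553, "tx_q0_packets": 11272871}))
    # Made-up values showing an uneven spread over two queues.
    print(queue_shares({"rx_q0_packets": 300, "rx_q1_packets": 100}))
```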

The drop statistics provide useful information about why packets are dropped. For instance, the rx_missed_errors counter represents the number of packets dropped because the CPU was not fast enough to dequeue them. A non-zero value for rx_mbuf_allocation_errors shows that not enough mbuf structures are configured in the fast path.

Note

Statistics field names may vary depending on the driver.
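
When the statistics are collected periodically, a script can flag the error counters that are not zero. The sketch below parses "name: value" lines like those shown above and reports the error counters named in this section. The function names and the ERROR_COUNTERS tuple are hypothetical, and the counter names must be adapted to your driver.

```python
# Excerpt copied from the sample output above.
SAMPLE = """\
------------- iface1 -------------
rx_good_packets: 11246553
tx_good_packets: 11272871
rx_good_bytes: 3925735284
tx_good_bytes: 3925615259
rx_missed_errors: 0
rx_errors: 6
tx_errors: 0
rx_mbuf_allocation_errors: 0
...
"""

# Error counters discussed in this section; names depend on the driver.
ERROR_COUNTERS = (
    "rx_errors",
    "tx_errors",
    "rx_missed_errors",
    "rx_mbuf_allocation_errors",
)


def parse_stats(text):
    """Parse 'name: value' lines into a dict, ignoring everything else."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        if sep and value.isdigit():
            stats[name.strip()] = int(value)
    return stats


def nonzero_errors(stats):
    """Return the error counters whose value is greater than zero."""
    return {name: stats[name] for name in ERROR_COUNTERS if stats.get(name, 0) > 0}


if __name__ == "__main__":
    print(nonzero_errors(parse_stats(SAMPLE)))
```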

Use show interface hardware-features name <interface> to check which offloads are enabled:

vsr> show interface hardware-features name iface1
TX vlan insert off [fixed]
TX IPv4 checksum off [fixed]
TX TCP checksum off [fixed]
TX UDP checksum off [fixed]
TX SCTP checksum off [fixed]
TX TSO off [fixed]
TX UFO off [fixed]
RX vlan strip off
RX vlan filter off
RX IPv4 checksum on
RX TCP checksum on
RX UDP checksum on
RX MPLS IP off
RX LRO off
RX timestamp off
RX GRO off
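
To check offloads on many interfaces, this listing can be parsed into a dictionary. The following is a minimal sketch that assumes each line has the form "<feature> on|off" with an optional [fixed] suffix, as shown above. The function name parse_features is hypothetical.

```python
import re

# Output copied from the example above.
FEATURES = """\
TX vlan insert off [fixed]
TX IPv4 checksum off [fixed]
TX TCP checksum off [fixed]
TX UDP checksum off [fixed]
TX SCTP checksum off [fixed]
TX TSO off [fixed]
TX UFO off [fixed]
RX vlan strip off
RX vlan filter off
RX IPv4 checksum on
RX TCP checksum on
RX UDP checksum on
RX MPLS IP off
RX LRO off
RX timestamp off
RX GRO off
"""

# Assumed line format: "<feature name> on|off [fixed]".
_FEATURE_LINE = re.compile(
    r"^(?P<name>.+?)\s+(?P<state>on|off)(?P<fixed>\s+\[fixed\])?\s*$"
)


def parse_features(text):
    """Return {feature name: (enabled, fixed)} for each recognized line."""
    features = {}
    for line in text.splitlines():
        match = _FEATURE_LINE.match(line.strip())
        if match:
            enabled = match.group("state") == "on"
            fixed = match.group("fixed") is not None
            features[match.group("name")] = (enabled, fixed)
    return features


if __name__ == "__main__":
    for name, (enabled, fixed) in parse_features(FEATURES).items():
        if enabled:
            print(f"{name} is enabled")
```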

To list every feature of every interface, use show interface hardware-features.

To enable TSO (which should improve TCP performance, since the hardware then handles the segmentation), use:

# fp-cli dpdk-port-offload-set eth1 tso on

Note

You can get various error messages when trying to change hardware parameters. For instance, Cannot change tcp-segmentation-offload may appear if the driver does not support changing the TSO offload dynamically.

Note

The offload configuration is not synchronized between the tap interface in Linux and the fast path port. Therefore, use fp-cli rather than Linux tools (such as ethtool) to display and modify the hardware parameters.

4.4.3. Network interfaces

The show state network-port command displays information about the NICs.

vsr> show state network-port
network-port pci-b131s0
   bus-addr 0000:83:00.0
   vendor "Intel Corporation"
   model "82599ES 10-Gigabit SFI/SFP+ Network Connection"
   mac-address 90:e2:ba:19:47:f4
   interface iface1
   ..
network-port pci-b137s0
   bus-addr 0000:89:00.0
   vendor "Intel Corporation"
   model "82599ES 10-Gigabit SFI/SFP+ Network Connection"
   mac-address 90:e2:ba:68:85:dc
   interface iface2
   ..

4.4.4. Hardware Topology

The troubleshooting-report command retrieves the hardware topology of the system. It gives information about shared caches, CPUs, processor cores, threads and much more.

The hardware topology can be found in the archive created by this command, in two formats:

  • XML version in lstopo.xml (contains raw data in XML format).

  • SVG version in lstopo.svg (contains a schematic visual representation, can be opened with Firefox for example).

You can also use the show hardware command to display information about the CPUs, RAM, NICs and disks.

vsr> show hardware
System serial number: 186DWY3


CPUs:
Model                                      Vendor      Frequency Serial Cores Threads
=====                                      ======      ========= ====== ===== =======
Intel(R) Xeon(R) Silver 4310 CPU @ 2.10GHz Intel Corp. 2.10GHz   n/a       12      24
Intel(R) Xeon(R) Silver 4310 CPU @ 2.10GHz Intel Corp. 2.10GHz   n/a       12      24


Memory units:
Description                                                   Model              Vendor       Serial   Storage size
===========                                                   =====              ======       ======   ============
System Memory                                                 n/a                n/a          n/a      64GiB
DIMM DDR4 Synchronous Registered (Buffered) 3200 MHz (0.3 ns) 18ASF2G72PDZ-3G2R1 002C0632002C 3FEB05F4 16GiB
DIMM DDR4 Synchronous Registered (Buffered) 3200 MHz (0.3 ns) 18ASF2G72PDZ-3G2R1 002C0632002C 3FEB0815 16GiB
DIMM DDR4 Synchronous Registered (Buffered) 3200 MHz (0.3 ns) 18ASF2G72PDZ-3G2R1 002C0632002C 3FEB0638 16GiB
DIMM DDR4 Synchronous Registered (Buffered) 3200 MHz (0.3 ns) 18ASF2G72PDZ-3G2R1 002C0632002C 3FEB0864 16GiB


Network interfaces:
Description        Model                                   Vendor                         MAC address       Speed Driver Firmware                 Port
===========        =====                                   ======                         ===========       ===== ====== ========                 ====
Ethernet interface NetXtreme BCM5720 Gigabit Ethernet PCIe Broadcom Inc. and subsidiaries c4:cb:e1:a7:13:f6 1Gb/s tg3    FFV22.31.6 bc 5720-v1.39 pci-b4s0


Disk units:
Description Model            Vendor  Serial       Storage size
=========== =====            ======  ======       ============
ATA Disk    Micron_5200_MTFD n/a     190220540682 894GiB
NVMe disk   n/a              n/a     n/a          894GiB

4.4.5. Memory

The command show state system linux memory presents a short memory status summary.

On a physical machine with 64 GB of RAM:

vsr> show state system linux memory
memory
    available 64762789888
    total 67400708096
..

Note

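MemTotal

Total usable RAM in bytes, shown as total. This is slightly less than the installed physical memory, because the kernel reserves part of it.

MemAvailable

Estimate of how much memory is available (in bytes) for starting new applications without swapping, shown as available.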
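
Both values are raw byte counts, so it is often easier to read them as GiB and as a ratio. The sketch below does the arithmetic with the total and available values from the output above; the function names are hypothetical.

```python
def bytes_to_gib(value):
    """Convert a byte count to GiB (1 GiB = 2**30 bytes)."""
    return value / 2**30


def available_ratio(total, available):
    """Return the fraction of total memory that is reported as available."""
    if total <= 0:
        raise ValueError("total must be positive")
    return available / total


if __name__ == "__main__":
    total = 67400708096      # "total" from the output above
    available = 64762789888  # "available" from the output above
    print(f"total: {bytes_to_gib(total):.2f} GiB")
    print(f"available: {bytes_to_gib(available):.2f} GiB")
    print(f"available ratio: {available_ratio(total, available):.1%}")
```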

4.4.6. NUMA statistics

The show numa statistics command displays per-node NUMA hit and miss statistics from the kernel memory allocator.

vsr> show numa statistics
======= ======== ========= ============ ============== ========== ==========
Node id NUMA hit NUMA miss NUMA foreign Interleave hit Local node Other node
node0    1695092         0            0           2116    1695092          0
node1   58924643         0            0           7511   58924643          0
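
A quick way to read these counters is the hit ratio of each node, that is, NUMA hit divided by the sum of NUMA hit and NUMA miss. The following is a minimal sketch that assumes the seven-column layout shown above. The function names and the lowercase column keys are hypothetical.

```python
# Output copied from the example above.
NUMA_STATS = """\
======= ======== ========= ============ ============== ========== ==========
Node id NUMA hit NUMA miss NUMA foreign Interleave hit Local node Other node
node0    1695092         0            0           2116    1695092          0
node1   58924643         0            0           7511   58924643          0
"""

# Keys chosen for this sketch, in the column order of the table.
COLUMNS = (
    "numa_hit",
    "numa_miss",
    "numa_foreign",
    "interleave_hit",
    "local_node",
    "other_node",
)


def parse_numa_rows(text):
    """Return {node name: {column: value}} for each data row."""
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        is_data_row = (
            len(fields) == len(COLUMNS) + 1
            and fields[0].startswith("node")
            and all(field.isdigit() for field in fields[1:])
        )
        if is_data_row:
            rows[fields[0]] = dict(zip(COLUMNS, map(int, fields[1:])))
    return rows


def hit_ratio(row):
    """Return numa_hit / (numa_hit + numa_miss), or None without allocations."""
    attempts = row["numa_hit"] + row["numa_miss"]
    return row["numa_hit"] / attempts if attempts else None


if __name__ == "__main__":
    for node, row in parse_numa_rows(NUMA_STATS).items():
        print(f"{node}: hit ratio {hit_ratio(row):.2%}")
```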