3.2.3. Performance Tuning
Slow packet processing
- Symptoms
Packet processing performance is not as high as expected.
- Hints
Follow the advice provided in fast-path.sh config -i when using the advanced configuration.
If running in a VM, check that the qemu instance handling your VM is pinned on specific cores. See the CPU Pinning for VMs section for details.
Check the output from fast-path.sh config --dump, optionally with the --long option, looking for the number of enabled cores (FP_MASK). You might need to add more, especially if fp-cpu-usage shows that enabled cores are all at 100%. See the fp-cpu-usage section for details. Refer to the Fast Path Baseline for further details on the wizard (fast-path.sh config).
Check the output from fast-path.sh config --dump, optionally with the --long option, making sure it is consistent with the available resources in the system. Typically, use memory local to the appropriate socket for your system.
Check that your fast path configuration (memory, sockets, hugepages, …) fits your current system. Look at lstopo output to see if your current configuration is consistent.
Check whether offload is supported and enabled on your port, using fp-cli dpdk-port-offload. See the fp-cli dpdk-port-stats section for details.
Check that the exception stats in the fast path statistics are not too high. Basic exceptions indicate how many packets could not be processed by the fast path and have thus been injected into the Linux stack for slow path processing. A high value is a good indicator that IP addresses/routes/tunnels in the fast path are badly configured. See the Fast Path statistics section for an example of stats. Refer to the Fast Path Baseline documentation for further details on fp-cli commands.
Check the dynamic core/port mapping (which core polls which port) using fp-cli dpdk-core-port-mapping (DPDK only), and ensure that all cores get work from NICs.
Check what functions the fast path is spending most of its time in, using perf top. See the perf section for details.
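For the VM pinning hint, the following is a minimal bash sketch that compares the CPU affinity of each qemu thread with the number of CPUs in the system. It assumes the util-linux taskset tool, pgrep, and that the qemu processes match the pattern qemu. The helper names count_cpus and report_qemu_pinning are hypothetical and not part of the product.

```shell
#!/bin/bash
# Count the CPUs in a Linux CPU list such as "0-3,8".
# Assumption: plain lists only (no "first-last:stride" entries).
count_cpus() {
    local list=$1 total=0 part first last
    local IFS=,
    for part in $list; do
        first=${part%-*}   # "0-3" -> 0, "8" -> 8
        last=${part#*-}    # "0-3" -> 3, "8" -> 8
        total=$((total + last - first + 1))
    done
    echo "$total"
}

# Print the affinity of every qemu thread and flag the threads that are
# allowed to run on as many CPUs as the system has (probably not pinned).
report_qemu_pinning() {
    local all pid line list
    all=$(nproc --all)
    for pid in $(pgrep qemu); do
        # Each line looks like: "pid 1234's current affinity list: 0-7"
        taskset -acp "$pid" | while read -r line; do
            list=${line##*: }
            if [ "$(count_cpus "$list")" -ge "$all" ]; then
                echo "not pinned: $line"
            else
                echo "pinned:     $line"
            fi
        done
    done
}
```

Running report_qemu_pinning on the hypervisor only reads the current affinities; it does not change them.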
Netfilter interferences
- Symptoms
Packet processing performance is not as high as expected.
Netfilter or Ebtables are enabled:
# fp-cli nf4
IPv4 netfilter is on
# fp-cli filter-bridge
filter bridge is on
Libvirt automatically created netfilter rules along with its virbr0 interface. For instance:
root@dut-vm:~# iptables-save
# Generated by iptables-save v1.4.21 on Mon Apr 4 14:28:56 2016
*mangle
:PREROUTING ACCEPT [8:974]
:INPUT ACCEPT [8:974]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [6:432]
:POSTROUTING ACCEPT [6:432]
-A POSTROUTING -o virbr0 -p udp -m udp --dport 68 -j CHECKSUM --checksum-fill
COMMIT
# Completed on Mon Apr 4 14:28:56 2016
# Generated by iptables-save v1.4.21 on Mon Apr 4 14:28:56 2016
*nat
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:OUTPUT ACCEPT [6:432]
:POSTROUTING ACCEPT [6:432]
-A POSTROUTING -s 192.168.122.0/24 -d 224.0.0.0/24 -j RETURN
-A POSTROUTING -s 192.168.122.0/24 -d 255.255.255.255/32 -j RETURN
-A POSTROUTING -s 192.168.122.0/24 ! -d 192.168.122.0/24 -p tcp -j MASQUERADE --to-ports 1024-65535
-A POSTROUTING -s 192.168.122.0/24 ! -d 192.168.122.0/24 -p udp -j MASQUERADE --to-ports 1024-65535
-A POSTROUTING -s 192.168.122.0/24 ! -d 192.168.122.0/24 -j MASQUERADE
COMMIT
# Completed on Mon Apr 4 14:28:56 2016
# Generated by iptables-save v1.4.21 on Mon Apr 4 14:28:56 2016
*filter
:INPUT ACCEPT [8:974]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [6:432]
-A INPUT -i virbr0 -p udp -m udp --dport 53 -j ACCEPT
-A INPUT -i virbr0 -p tcp -m tcp --dport 53 -j ACCEPT
-A INPUT -i virbr0 -p udp -m udp --dport 67 -j ACCEPT
-A INPUT -i virbr0 -p tcp -m tcp --dport 67 -j ACCEPT
-A FORWARD -d 192.168.122.0/24 -o virbr0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -s 192.168.122.0/24 -i virbr0 -j ACCEPT
-A FORWARD -i virbr0 -o virbr0 -j ACCEPT
-A FORWARD -o virbr0 -j REJECT --reject-with icmp-port-unreachable
-A FORWARD -i virbr0 -j REJECT --reject-with icmp-port-unreachable
-A OUTPUT -o virbr0 -p udp -m udp --dport 68 -j ACCEPT
COMMIT
# Completed on Mon Apr 4 14:28:56 2016
- Hints
Netfilter rules impact performance. If your rules only affect traffic going to Linux and not dataplane traffic, you may want to disable Netfilter synchronization using the following commands:
# fp-cli nf4-set off
IPv4 netfilter is off
# fp-cli filter-bridge-set off
filter bridge is off
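Before disabling anything, it can be useful to record the current state of both filters. The sketch below is one possible way to do that in bash. It assumes that fp-cli nf4 and fp-cli filter-bridge print a status line containing "is on" or "is off", as in the examples of this section. The helper names feature_is_on and report_filter_sync are hypothetical.

```shell
#!/bin/bash
# Succeed when a status line printed by fp-cli reports "is on".
# Assumption: the wording matches the outputs shown in this section.
feature_is_on() {
    case "$1" in
        *" is on"*) return 0 ;;
        *)          return 1 ;;
    esac
}

# Report the state of IPv4 netfilter and bridge filtering in the fast path.
# This only reads the state; it never changes it.
report_filter_sync() {
    local cmd
    for cmd in nf4 filter-bridge; do
        if feature_is_on "$(fp-cli "$cmd")"; then
            echo "$cmd: enabled"
        else
            echo "$cmd: disabled"
        fi
    done
}
```

The decision to switch a filter off is deliberately left to the operator, since it is only appropriate when the rules do not affect dataplane traffic.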
Performance drop with Mellanox ConnectX-3 devices
- Symptoms:
Packet processing is slower than expected.
- Hints:
On Dell and SuperMicro servers, the PCI read buffer may be misconfigured for ConnectX-3/ConnectX-3-Pro NICs. Check the output of setpci -s <NIC_PCI_address> 68.w. For instance:
# lspci | grep Mellanox
04:00.0 Ethernet controller: Mellanox Technologies MT27520 Family [ConnectX-3 Pro]
# setpci -s 04:00.0 68.w
202e
Warning
Be careful with the following command: it is known to cause spontaneous reboots on some systems.
If the value is below 0x5020 (here that’s the case), set it to 0x5020:
# setpci -s 04:00.0 68.w=5020
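The comparison against 0x5020 can be sketched in bash as follows. This is an illustration only: it assumes that setpci prints the register as a bare hexadecimal word (as in the example above), and the helper names read_buffer_too_small and check_connectx3 are hypothetical. The sketch reads the register and reports; it never writes to it, because of the reboot risk mentioned in the warning.

```shell
#!/bin/bash
# Succeed when a register value printed by setpci (hex, no 0x prefix)
# is below the 0x5020 threshold given in this section.
read_buffer_too_small() {
    [ "$((16#$1))" -lt "$((0x5020))" ]
}

# Read register 68.w of the given PCI device and report whether the
# value is below the threshold. Needs root privileges for setpci.
check_connectx3() {
    local addr=$1 value
    value=$(setpci -s "$addr" 68.w) || return
    if read_buffer_too_small "$value"; then
        echo "$addr: 68.w=$value is below 5020"
    else
        echo "$addr: 68.w=$value needs no change"
    fi
}
```

For the device in the example, check_connectx3 04:00.0 would read the value 202e and report it as below the threshold.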
Performance optimization for the NFV use case
- Symptoms:
More performance is expected in the NFV use case.
- Hints:
The nfv profile can be used by setting it in the /etc/fp-vdev.ini file or by passing --profile=nfv to fp-vdev in order to increase the performance of the packet processing, especially when:
the VM runs a fast path or a DPDK application.
the VM does not terminate the traffic and does not require the offload.
the number of cores in the host is higher than the number of cores in the guest.
This option lowers the lock contention on Tx rings, at the price of a less even spreading of packets among guest cores. It also disables the mergeable buffer Virtio feature.
To check whether the nfv profile is set, inspect the configuration used to create the virtual interface with the fp-shmem-ports command.
# fp-shmem-ports -d
[snip]
port 2: tap0-vrf0 [snip] driver net_6wind_vhost (args sockname=/tmp/tap0.sock,sockmode=client,profile=endpoint) [snip]
[snip]
port 3: tap1-vrf0 [snip] driver net_6wind_vhost (args sockname=/tmp/tap1.sock,sockmode=client,profile=nfv) [snip]
In the above example, the tap0 interface created by fp-vdev does not have the nfv profile, whereas tap1 has it.
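When many virtual ports exist, the profile of each one can be extracted from the fp-shmem-ports -d output with a small filter. The sketch below assumes the layout shown in the example (a "port N: name" entry followed, on the same or a later line, by driver arguments containing profile=...). The port_profiles helper name is hypothetical.

```shell
#!/bin/bash
# Read "fp-shmem-ports -d" output on stdin and print one
# "<port name> <profile>" line per port that has a profile argument.
port_profiles() {
    awk '
        # Remember the name of the port currently being described.
        /^[ \t]*port [0-9]+:/ { name = $3 }
        # Extract the value that follows "profile=" (8 characters long).
        match($0, /profile=[A-Za-z0-9_-]+/) {
            print name, substr($0, RSTART + 8, RLENGTH - 8)
        }
    '
}
```

A typical invocation would be fp-shmem-ports -d | port_profiles, which for the example above prints tap0-vrf0 endpoint and tap1-vrf0 nfv.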
See also
The 6WINDGate Fast Path Managing virtual devices documentation for more information about the fp-vdev command.