Linux kernel netdevice in the guest
This is another example where packets are transmitted between a host running the Virtio Host PMD and a guest running Linux with the virtio-net kernel module.
In this setup:

- A 6WINDGate DPDK application runs on one core on the host.
- The guest runs the virtio-net kernel module on 4 cores.
- Packets are sent/received using 4 rings from host to guest and 4 rings from guest to host.
- Notifications are used when the host sends packets to the guest.
On the host, reserve huge pages and mount a hugetlbfs file system:

# echo 4096 > /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages
# mkdir -p /mnt/huge-2M && mount -t hugetlbfs none /mnt/huge-2M
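To make sure the pages were actually reserved and the mount point is in place before starting the application, a quick sanity check (reusing the node1 path and mount point from the commands above):

# cat /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages
# grep huge-2M /proc/mounts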
Start the 6WINDGate DPDK application on the host.
# QUEUE_CONFIG="rxqmap=manual:rr/1:3:5:7,txqmap=manual:rr/0:2:4:6"
# cd /path/to/dpdk/x86_64-native-linuxapp-gcc
# ./app/testpmd -c 0x3000 -n 3 --socket-mem=0,512 --huge-dir=/mnt/huge-2M \
    -d /path/to/librte_pmd_vhost.so \
    --vdev pmd-vhost0,${QUEUE_CONFIG},sockname=/tmp/vhost_sock0 \
    -- --socket-num=1 --port-numa-config=0,1 --port-topology=chained -i
We use testpmd, provided in the 6WINDGate DPDK package. In this example, it uses 2 cores (mask is 0x3000), but only one is used for the data plane; the other is for the command line.

Note
The argument sockname=/tmp/vhost_sock0 specifies the path of the vhost-user Unix socket to create. QEMU will connect to this socket to negotiate features and configure the Virtio Host PMD.

On the host, configure testpmd to answer all ICMP echo requests:

testpmd> set fwd icmpecho
testpmd> start
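Before starting QEMU, it can be useful to check from the testpmd command line that the vhost port was created and that the forwarding mode is the expected one (a quick sanity check using standard testpmd commands):

testpmd> show port info all
testpmd> show config fwd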
Start QEMU, simulating a 4-core machine with a vhost-user virtio device providing 4 pairs of queues:

# numactl --cpunodebind=1 --membind=1 \
    qemu-system-x86_64 --enable-kvm -k fr -m 2G \
    -cpu host -smp cores=4,threads=1,sockets=1 \
    -serial telnet::4445,server,nowait -monitor telnet::5556,server,nowait \
    -hda vm.qcow2 \
    -object memory-backend-file,id=mem,size=2G,mem-path=/mnt/huge-2M,share=on \
    -numa node,memdev=mem \
    -chardev socket,path=/tmp/vhost_sock0,id=chr0,server \
    -netdev type=vhost-user,id=net0,chardev=chr0,queues=4 \
    -device virtio-net-pci,netdev=net0,vectors=9,mq=on,ioeventfd=on
Note
To manage more than one pair of queues, you must patch QEMU.
For maximum performance, every pthread of the QEMU process (except management pthreads) has to be pinned to a CPU and must be the only application running on that CPU. This can be done with several taskset commands. The pthread IDs associated with the vCPUs can be retrieved from the QEMU console using the info cpus command.

QEMU must support vhost-user, which is the case starting from QEMU 2.1.
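A possible way to apply this pinning, sketched below with example values: the ${VCPUx_TID} placeholders stand for the thread IDs reported by info cpus for each vCPU, and CPUs 24 to 27 are assumed to be free CPUs on NUMA node 1 (both depend on your system):

(qemu) info cpus
# taskset -pc 24 ${VCPU0_TID}
# taskset -pc 25 ${VCPU1_TID}
# taskset -pc 26 ${VCPU2_TID}
# taskset -pc 27 ${VCPU3_TID}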
In the guest, the virtual virtio PCI device should be listed:

# lspci -nn
[...]
00:03.0 Ethernet controller [0200]: Red Hat, Inc Virtio network device [1af4:1000]
The virtio-net kernel module should be automatically loaded. Configure the network on this interface with 4 queues and an IP address:

# ETH=eth0
# ls /sys/class/net/${ETH}/queues
# ethtool -L ${ETH} combined 4
# ethtool -l ${ETH}
# ip a a 1.1.1.1/24 dev ${ETH}
# ip l set ${ETH} up
# arp -s 1.1.1.2 00:00:00:00:00:01
# ping 1.1.1.2
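To confirm that the virtio-net module is indeed loaded and that the ping traffic goes through the virtio interface, the loaded modules and the interface counters can be inspected (a sanity check reusing the ${ETH} variable defined above):

# lsmod | grep virtio
# ip -s link show dev ${ETH}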
Configure the affinity of the MSI-X interrupts to increase performance, spreading virtio0-input.X on all cores:

# service irqbalance stop
# grep virtio0-input /proc/interrupts
# echo 01 > /proc/irq/41/smp_affinity
# echo 02 > /proc/irq/43/smp_affinity
# echo 04 > /proc/irq/45/smp_affinity
# echo 08 > /proc/irq/47/smp_affinity
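The IRQ numbers (41, 43, 45 and 47 above) depend on the system, so they usually have to be read from /proc/interrupts first. A small sketch that derives them automatically and assigns one guest core per input queue, using the same one-bit-per-core masks as above:

# i=0
# for irq in $(awk '/virtio0-input/ { sub(":", "", $1); print $1 }' /proc/interrupts); do
>     printf '%x' $((1 << i)) > /proc/irq/${irq}/smp_affinity
>     i=$((i + 1))
> done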
In this example, the packets sent by the host will be distributed over the 4 Linux cores in round-robin mode.
Note
To spread TX packets over multiple queues in the guest, use the Linux XPS
feature (available in Linux version 2.6.38 and higher). You must enable the
Linux kernel configuration item CONFIG_XPS. This item selects the transmission
queue based on the CPU that transmits the packet. Without XPS, only one queue
can be used when transmitting over a virtio interface.
To configure XPS, specify the CPU/queue association via the sysfs
interface:
# echo ${COREMASK} > /sys/class/net/${DEV}/queues/tx-${QNUM}/xps_cpus
In the example above, to associate cpu0 with queue0, cpu1 with queue1, and so on, enter:
# echo 1 > /sys/class/net/eth0/queues/tx-0/xps_cpus
# echo 2 > /sys/class/net/eth0/queues/tx-1/xps_cpus
# echo 4 > /sys/class/net/eth0/queues/tx-2/xps_cpus
# echo 8 > /sys/class/net/eth0/queues/tx-3/xps_cpus
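The resulting mapping can be read back from the same sysfs files to check that it was applied (using the eth0 name from the example):

# grep . /sys/class/net/eth0/queues/tx-*/xps_cpus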
See also
The 6WINDGate DPDK documentation