Usage¶
FPN-SDK Add-on for DPDK automatically starts when you start the fast path with the following script:
# fast-path.sh start
See also
For more information on how to start the fast path, see the Fast Path Baseline documentation.
Providing options¶
The FPN-SDK can take several arguments on the fast path command line.
Most of the usual options are automatically added by the fast path start script
while parsing the fast-path.env
configuration file.
The configuration variables EAL_OPTIONS
(for DPDK EAL options) and
FPNSDK_OPTIONS
(for FPN-SDK options) of the fast path configuration file may
contain additional options that are described here.
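For instance, the fast-path.env file could contain entries like these (illustrative values, not defaults):
: ${EAL_OPTIONS:=--log-level=4}
: ${FPNSDK_OPTIONS:=--logmode=syslog}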
Useful DPDK EAL options¶
Here are the most useful EAL options to set in the EAL_OPTIONS
variable:
- --log-level=<LOGLEVEL>¶
Available since dpdk-1.8.0. Optional. Set the DPDK log level. When set to n, all DPDK driver messages with a log level equal to or below n are printed. LOGLEVEL can have one of the following values:
1: EMERG     System is unusable.
2: ALERT     Action must be taken immediately.
3: CRIT      Critical conditions.
4: ERR       Error conditions.
5: WARNING   Warning conditions.
6: NOTICE    Normal but significant condition.
7: INFO      Informational.
8: DEBUG     Debug-level messages.
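For example, to have DPDK print all messages up to INFO level, the variable could be set as follows (illustrative value):
: ${EAL_OPTIONS:=--log-level=7}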
Any other EAL option can be passed by modifying the EAL_OPTIONS
variable.
See also
For the full list of options, see the DPDK documentation.
FPN-SDK options¶
Here are the most common FPN-SDK options to set in the FPNSDK_OPTIONS
variable:
Log options¶
Parameters
- --logmode=[console|syslog]¶
Optional. Specify where the fast path logs should be displayed: console or syslog. By default, logs are displayed with syslog.
- --debug-rpc¶
If specified, enable fpn-rpc debug logs.
- --no-shm-hugetlb¶
If specified, do not attempt to map shared memory in hugepages.
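For example, to log to the console with fpn-rpc debugging enabled (illustrative combination):
: ${FPNSDK_OPTIONS:=--logmode=console --debug-rpc}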
VLAN stripping option¶
Parameters
- --vlan-strip¶
Optional. Enable the VLAN header stripping feature on incoming frames if supported by the hardware. By default, the VLAN stripping feature is disabled.
Alternatively, you can specify
: ${VLAN_STRIP:=on}
in the configuration file.
Port options¶
Parameters
- --nb-rxq=[all|phys|virt]:[number of rx queues]¶
Optional. Specify the number of queues per port for each device type. Useful when using automatic mapping of cores to ports. By default, the maximum number of queues is allocated to each port. Cannot be used together with -t.
The different types of device that can be configured are:
- phys: any physical network device, like Intel or Mellanox NICs
- virt: any virtual network device, like vhost-user ports
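For example, to limit physical ports to 4 RX queues each (illustrative value):
: ${FPNSDK_OPTIONS:=--nb-rxq=phys:4}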
Core / Port binding¶
- -t [<core number>=<port number>/<core number>=<port number>:<port number>]¶
- core number
Fast path logical core number, preceded by c.
- port number
Number of the port to poll. For each occurrence, one RX queue is created.
Specify which RX queues logical cores poll on which ports. Unspecified logical cores and ports are idle.
It is preferred to set the core / port binding in the fast path configuration file using the option : ${CORE_PORT_MAPPING:=<value>}. An illustrative mapping is shown after this section.
If this parameter is not set, each core polls all the ports on the same NUMA socket. If there are not enough RX queues, only a subset of these cores polls the port. If the maximal number of TX queues is smaller than the number of fast path cores, the port is configured with only one shared TX queue. The total number of queues per port must not exceed the NIC's hardware limit, otherwise only a shared queue is used.
The hardware is responsible for distributing flows over queues (for instance, via RSS on Intel or Arm platforms).
This parameter overrides the
--nb-rxq
options described above.
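For example, the following hypothetical mapping makes core 1 poll port 0, while core 2 polls ports 0 and 1 (one RX queue is created per occurrence):
: ${CORE_PORT_MAPPING:=c1=0/c2=0:1}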
Offloads¶
Description
To support offload features such as TSO or L4 checksum offloading to the NIC, or forwarding offload information from a guest to the NIC through a virtual interface, you must enable offloading in the fast path. You can then tune the offload features more precisely using ethtool -K <iface> <feature> on|off.
- --offload¶
Enable the offload feature in the fast path.
Example
###########################
##### FPN-SDK OPTIONS #####
###########################
: ${FPNSDK_OPTIONS:=--offload}
TX scatter¶
Parameters
- --tx-scatter¶
Optional. Enable the TX scatter feature if it is supported by the hardware. By default, TX scatter is disabled. If the offload feature is enabled, TX scatter is enabled as well.
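For example, to enable TX scatter without enabling all offloads (illustrative):
: ${FPNSDK_OPTIONS:=--tx-scatter}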
Cryptography options¶
Description
You can tune the maximum available cryptographic sessions and the number of buffers allocated for cryptography.
- --crypto-max-sessions¶
Tune the maximum number of available cryptographic sessions. The cryptography code reserves space to store sessions in pools. For IPsec, one session is needed per SA.
- --crypto-buffers¶
Tune the number of buffers allocated for cryptography. The crypto library allocates buffers from pools to store per-buffer crypto parameters. This parameter is the size of the buffer pool.
Example
###########################
##### FPN-SDK OPTIONS #####
###########################
: ${FPNSDK_OPTIONS:=--crypto-max-sessions=8192 --crypto-buffers=32768}
Cryptographic offloading mask¶
Description
Specify which fast path cores are able to perform cryptographic operations for other cores. Cryptographic offloading is done only between cores on the same NUMA node. By default, crypto offloading is done only for decrypt operations, and only to cores that are not receiving traffic at that moment.
- -i <crypto_offloading mask>¶
Specify the fast path cores available for crypto offloading. By default, all cores specified in
fp_mask
are used. The mask can be a list of CPUs or a mask starting with 0x. Two specific strings are also available: none to disable the crypto offloading feature, and all (the default value). It is preferred to set the cryptographic offloading mask in the fast path configuration file using the option : ${CRYPTO_OFFLOAD_MASK:=<value>}.
Example
: ${CRYPTO_OFFLOAD_MASK:=none}
Statistics of offloaded cryptographic operations can be retrieved with:
crypto-offload-stats
It is possible to enable/disable the cryptographic offload for packet encryption. Be aware that enabling encryption offload for an IPsec tunnel that aggregates many flows managed by several fast path cores can cause IPsec errors due to the anti-replay window.
crypto-offload-encrypt-set on|off
It is possible to disable the cryptographic offload for small packets by setting a minimal data size threshold:
crypto-offload-threshold-set <min>
- <min>
Minimal size of data to offload cryptographic operations. By default, any packet can be offloaded (size set to 0).
To avoid sending cryptographic offload requests to a core that receives high traffic, it is possible to disable the cryptographic offload to overloaded cores during a configurable time period.
crypto-offload-timer-set <time>
- <time>
Time, in microseconds, during which an overloaded core is ignored for cryptographic offloading. The default value is 10 milliseconds.
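For instance, the following illustrative sequence disables encryption offload, offloads only packets of at least 128 bytes, and ignores overloaded cores for 20 milliseconds:
crypto-offload-encrypt-set off
crypto-offload-threshold-set 128
crypto-offload-timer-set 20000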
Intercore options¶
Description
You can tune the size of the intercore rings, used for pipeline implementation.
- --intercore-ring-size¶
One ring is allocated for each core, to store messages sent from other cores. Use this option to change the size of the rings. The default value is
CONFIG_MCORE_INTERCORE_RING_SIZE
, as specified in /etc/6WINDGate/fpnsdk.config.
Alternatively, you can specify
: ${INTERCORE_RING_SIZE:=<value>}
in the configuration file.
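For example (illustrative value, to be sized for your pipeline's message rate):
: ${INTERCORE_RING_SIZE:=1024}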
Intercore implementation¶
The following functions have been added to the DPDK mbuf API:
m_set_process_fct()
m_call_process_fct()
These functions use the mbuf headroom to store an fpn_callback structure.
To avoid updating mbuf internal pointers, the m_prepend/m_adj functions are not used, yet the checks on available headroom length are the same.
Since m_prepend/m_adj are not called, once m_set_process_fct() has been called, no operation can be done on the mbuf until m_call_process_fct() is called.
Software scheduling implementation¶
The software scheduling API is implemented on top of the DPDK
librte_sched
library.
Linux / fast path communication¶
Parameters
- -l <exception mask>¶
Specify the fast path cores involved in polling packets coming from Linux over fpvi interfaces. By default, all cores specified in
fp_mask
are used. All packets locally sent by the Linux stack are forwarded to the fast path and processed by the selected cores.
- --nb-fpvi-queue=[all|phys|virt|excp]:<number of queues>¶
Specify the number of queues per fpvi port. Each port in the fast path has a tuntap netdevice used to send/receive packets between the fast path and the Linux kernel. The number of queues of the tuntap device can be configured with a different value depending on the associated fast path port devtype. The different devtypes are:
- phys: any physical network device, like Intel or Mellanox NICs
- virt: any virtual network device, like vhost-user ports
- excp: for fpn0
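For example, to create 2 queues on the tuntap devices of physical ports (illustrative value):
: ${FPNSDK_OPTIONS:=--nb-fpvi-queue=phys:2}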
RX/TX descriptors and thresholds¶
Parameters
- --max-gro-mbufs=<max-phys-mbuf-number>,<max-virt-mbuf-number>¶
Specify the maximum number of mbufs that can be stored in the GRO reassembly queues for physical and virtual ports. This is a per-port and per-core value: the total number of mbufs required for GRO reassembly is (the total number of physical ports * the first value + the total number of virtual ports * the second value) * the total number of cores.
This parameter helps allocate precisely the amount of mbufs required for the fast path to work properly.
The value must contain 2 integers separated by a comma. The first integer is for physical ports and the second one is for virtual ports.
Each integer must be between 0 and 65535.
- --max-vports=<max number of dynamic vdev>¶
Specify the maximum number of vports that can be created using the
fp-vdev
command. Increasing this value increases the number of preallocated mbufs. This value cannot exceed 300, and the total number of ports (physical / virtual) cannot be more than
CONFIG_MCORE_FPN_MAX_PORTS
- --nb-mbuf=<total mbuf number>|<sock0-mbuf-number,sock1-mbufs-number,...>¶
Specify the number of mbufs to add in the pool (default is 16384).
It is preferred to set the number of mbufs in the fast path configuration file using the variable : ${NB_MBUF:=<value>}, as it allows setting the value to auto, which lets the wizard calculate how many mbufs are required. Indeed, the number of needed mbufs cannot easily be computed manually, as some features like TCP use additional mbufs.
Optimal performance is reached with as few mbufs as possible. However, mbuf allocation failure can lead to unexpected behavior.
See also
If vhost ports are created, additional mbufs should be preallocated. This
is automatically done by the wizard when setting : ${MAX_VPORT:=<value>}
in fast-path.env.
Each software queue can hold at most SOFT_TXQ_SIZE mbufs, as defined in
fast-path.env with : ${SOFT_TXQ_SIZE:=<value>}.
Therefore, NB_MBUF has to be increased by SOFT_TXQ_SIZE * (MAX_VPORT + PHYS_PORTS), where PHYS_PORTS is the number of physical ports.
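As an illustration, with SOFT_TXQ_SIZE=512, MAX_VPORT=4 and 2 physical ports (all values hypothetical), NB_MBUF should be increased by 512 * (4 + 2) = 3072 mbufs.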
- --mbuf-rx-size=<size>¶
Specify the size of Rx data (excluding headroom) inside each mbuf. The default value is 2176. Changing this value may affect performance.
The equivalent option in the fast path configuration file is
: ${MBUF_RX_SIZE:=<value>}
.
- --nb-rxd=[RX descriptor number]¶
Optional. Specify the number of RX descriptors allocated to the NIC. It is highly recommended to use a power of 2 to be compliant with any PMD. The minimal value is 64. Default is 128.
Important
For Intel Ethernet Controller XL710, specify 1024 RX descriptors.
- --nb-txd=[TX descriptor number]¶
Optional. Specify the number of TX descriptors allocated to the NIC. It is highly recommended to use a power of 2 to be compliant with any PMD. The minimal value is 64. Default is 512.
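For example, to allocate 1024 RX and TX descriptors per port (illustrative values):
: ${FPNSDK_OPTIONS:=--nb-rxd=1024 --nb-txd=1024}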
- --nb-rxdtxd-fpvi=[all|phys|virt|excp]:<FPVI RXTX descriptor number>¶
Optional. Specify the number of descriptors in each RX/TX fpvi queue. It is highly recommended to use a power of 2. The minimal value is 64. The default value is 512.
Each port in the fast path has a tuntap netdevice used to send/receive packets between the fast path and the Linux kernel. The number of descriptors of the tuntap device can be configured with a different value depending on the associated fast path port devtype. The different devtypes are:
- phys: any physical network device, like Intel or Mellanox NICs
- virt: any virtual network device, like vhost-user ports
- excp: internal exception port (fpn0)
- --soft-queue=[<port number>=<additional TX desc number>/default=<additional TX desc number>]¶
Optional. Add a software queue at FPN-SDK level. This is particularly useful for NICs where it is not possible to configure the number of TX descriptors. For performance purposes, this value must be a power of 2. The maximal value is 32768. Default is 0 (i.e. no software queue).
- port number
Fast path port number, preceded by p, or default to apply the configuration to all ports.
- additional TX desc number
Number of additional TX descriptors allocated at FPN-SDK level.
Alternatively, you can specify
: ${SOFT_TXQ_SIZE:=<value>}
in the configuration file. Only the default value can be modified this way.
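For example, to add a 1024-descriptor software queue on port 0 and a 512-descriptor queue on all other ports (illustrative values):
: ${FPNSDK_OPTIONS:=--soft-queue=p0=1024/default=512}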
Control Plane protection options¶
mode¶
Parameters
- --rx-cp-filter-mode=[all|phys|virt|excp]:[none|software-filter|hardware-filter|dedicated-queue]¶
- --tx-cp-filter-mode=[all|phys|virt|excp]:[none|software-filter]¶
Optional. Choose control plane protection mode on RX or TX for each device type.
The different types of device that can be configured are:
- phys: any physical network device, like Intel or Mellanox NICs
- virt: any virtual network device, like vhost-user ports
- excp: any device in charge of sending exception packets, like fpvi using vhost
For each device type, the following values are possible:
- phys: hardware-filter, dedicated-queue (only with Mellanox NICs, falls back to software-filter for other NICs), software-filter or none (default value is none)
- virt: software-filter or none (default value is none)
- excp: software-filter or none, in both RX and TX (default value is none in RX and software-filter in TX)
Example
###########################
##### FPN-SDK OPTIONS #####
###########################
: ${FPNSDK_OPTIONS:=--rx-cp-filter-mode=phys:hardware-filter,virt:software-filter --tx-cp-filter-mode=all:none}
threshold¶
Parameters
- --rx-cp-filter-threshold=[all|phys|virt|excp]:[threshold number][%]¶
- --tx-cp-filter-threshold=[all|phys|virt|excp]:[threshold number][%]¶
Optional. Configure the control plane protection threshold on RX or TX for software-filter or hardware-filter mode. The value can be provided as a fixed value or as a percentage using the % suffix. By default, the control plane protection threshold is 50% in RX and TX.
Example
###########################
##### FPN-SDK OPTIONS #####
###########################
: ${FPNSDK_OPTIONS:=--rx-cp-filter-threshold=phys:2048,virt:10% --tx-cp-filter-threshold=all:50%}
Managing RETA entries¶
RETAs are per-port configurable tables used by the NIC controllers’ RSS filtering feature to select the RX queue into which to store an IP input packet. When receiving an IPv4 or an IPv6 packet, the controller computes a 32-bit hash based on:
- the source address and the destination address in the packet's IP header,
- the source port and the destination port in the UDP/TCP header, if any.
The controller then uses the RSS hash value to compute an index into the RETA table, which gives the number of the RX queue where the packet is stored.
The DPDK API includes a function that is exported by PMDs to configure a port’s RETA entries.
Note
The number of RETA entries depends on your NIC:

NIC            RETA entries
Intel 1GbE     128
Intel 10GbE    128
Intel 40GbE    512
For test purposes, the
testpmd
application includes the following command to configure RETA entries:
port config X rss reta (hash_index,queue_number)[,(hash_index,queue_number)]
- X
Port for which to configure RETA entries.
- hash_index
Index of a RETA entry.
- queue_number
RX queue number to be stored in the RETA entry.
Configure a port’s RETA entries:
dpdk-rss-reta-set <Pi> index <i>[ <j>[ ...]] queue <Qj>
- <Pi>
Port number
- index <i>[ <j>[ …]]
Space-separated or tab-separated list of RETA entry indices.
- queue <Qj>
RX queue index to write in RETA entries.
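For instance, to direct RETA entries 0 to 3 of port 0 to RX queue 1 (illustrative values):
dpdk-rss-reta-set 0 index 0 1 2 3 queue 1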
Display a port’s RETA entries:
dpdk-rss-reta <Pi> [index <i> [<j>]]
- <Pi>
Port number.
- index <i>
Optional. RETA entry index.
- index <i> [<j>]
Optional. Range of RETA entry indices. By default, all RETA entries are displayed.
Select the method used to compute the IP input packets RSS hash value:
dpdk-rss-hash-func-set <Pi> [<HF>[ <HF>[ ...]]]
HF := { ipv4 | ipv4-frag | ipv4-tcp | ipv4-udp | ipv4-sctp | ipv4-other |
        ipv6 | ipv6-frag | ipv6-tcp | ipv6-udp | ipv6-sctp | ipv6-other |
        l2-payload | ipv6-ex | ipv6-tcp-ex | ipv6-udp-ex |
        port | vxlan | geneve | nvgre }
- <Pi>
Port number.
- [<HF>[ <HF>[ …]]]
Space-separated or tab-separated list of RSS hash functions among the following: ipv4, ipv4-frag, ipv4-tcp, ipv4-udp, ipv4-sctp, ipv4-other, ipv6, ipv6-frag, ipv6-tcp, ipv6-udp, ipv6-sctp, ipv6-other, l2-payload, ipv6-ex, ipv6-tcp-ex, ipv6-udp-ex, port, vxlan, geneve, nvgre. RSS filtering is disabled if no hash function is supplied.
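For instance, to hash only TCP flows on port 0 (illustrative values):
dpdk-rss-hash-func-set 0 ipv4-tcp ipv6-tcp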
Display the set of methods currently used to compute the IP input packets RSS hash value:
dpdk-rss-hash-func <Pi>
- <Pi>
Port number.
Set the 40-byte RSS hash key used to compute the IP input packets RSS hash value:
dpdk-rss-hash-key-set <Pi> <key>
- <Pi>
Port number.
- <key>
40-byte key as a contiguous set of 80 hexadecimal digits (2 hexadecimal digits per byte).
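For instance, on port 0, using a widely used default RSS key as an illustration:
dpdk-rss-hash-key-set 0 6d5a56da255b0ec24167253d43a38fb0d0ca2bcbae7b30b477cb2da38030f20c6a42b73bbeac01fa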
Display the current 40-byte RSS hash key used to compute the IP input packets RSS hash value:
dpdk-rss-hash-key <Pi>
- <Pi>
Port number.