Usage¶
FPN-SDK Add-on for DPDK automatically starts when you start the fast path with the following script:
# fast-path.sh start
See also
For more information on how to start the fast path, see the Fast Path Baseline documentation.
Providing options¶
The FPN-SDK can take several arguments on the fast path command line.
Most of the usual options are automatically added by the fast path start script while parsing the fast-path.env configuration file.
The configuration variables EAL_OPTIONS (for DPDK EAL options) and FPNSDK_OPTIONS (for FPN-SDK options) of the fast path configuration file may contain additional options that are described here.
Useful DPDK EAL options¶
Here are the most useful EAL options to set in the EAL_OPTIONS variable:
--log-level=<LOGLEVEL>
Available since dpdk-1.8.0. Optional. Set the DPDK log level. When set to n, all DPDK driver messages with a log level less than or equal to n are printed. LOGLEVEL can have one of the following values:
- 1 (EMERG): System is unusable.
- 2 (ALERT): Action must be taken immediately.
- 3 (CRIT): Critical conditions.
- 4 (ERR): Error conditions.
- 5 (WARNING): Warning conditions.
- 6 (NOTICE): Normal but significant condition.
- 7 (INFO): Informational.
- 8 (DEBUG): Debug-level messages.
Any other EAL option can be passed by modifying the EAL_OPTIONS variable.
See also
For the full list of options, see the DPDK documentation.
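As a minimal sketch of how such a setting might look in fast-path.env, the fragment below uses the same `: ${VAR:=value}` form that appears elsewhere in this document. The level 7 (INFO) is only an illustration, and the `unset` line exists solely to make the demonstration repeatable. The idiom assigns the value only when the variable is unset or empty, so a value already present is kept.

```shell
# Demonstration only: start from a clean state.
unset EAL_OPTIONS

# Hypothetical fast-path.env line: set the DPDK log level to INFO (7).
# ": ${VAR:=value}" assigns the value only if VAR is unset or empty.
: ${EAL_OPTIONS:=--log-level=7}

# A second default has no effect because EAL_OPTIONS is already set.
: ${EAL_OPTIONS:=--log-level=8}

echo "EAL_OPTIONS=${EAL_OPTIONS}"
```

Because of this behavior, a value assigned before the configuration file is parsed takes precedence over the default written in the file.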
FPN-SDK options¶
Here are the most common FPN-SDK options to set in the FPNSDK_OPTIONS variable:
Logmode option¶
Parameters
--logmode=[console|syslog]
Optional. Specify where the fast path logs are displayed: console or syslog. By default, logs are sent to syslog.
Vlan stripping option¶
Parameters
--vlan-strip
Optional. Enable the VLAN header stripping feature on incoming frames if supported by the hardware. By default, the VLAN stripping feature is disabled.
Alternately, you can specify : ${VLAN_STRIP:=on} in the configuration file.
Port options¶
Parameters
--rxq-per-port=[queues per port]
Optional. Specify the number of RX queues per port. Useful when using automatic mapping of cores to ports. By default, the maximum number of queues is allocated to a port. Cannot be used together with -t.
Core / Port binding¶
-t [<core number>=<port number>/<core number>=<port number>:<port number>]
Specify how many RX queues logical cores poll on ports. Unspecified logical cores and ports are idle.
- core number
Fast path logical core number preceded by c.
- port number
Number of the port to poll. For each occurrence, one RX queue is created.
It is preferred to set the core / port binding in the fast path configuration file using the option : ${CORE_PORT_MAPPING:=<value>}.
If this parameter is not set, each core polls all the ports on the same NUMA socket. If there are not enough RX queues, only a subset of these cores polls the port. If the maximal number of TX queues is smaller than the number of fast path cores, the port is configured with only one shared TX queue. The total number of queues per port must not exceed the NIC's hardware limit; otherwise, only a shared queue is used.
The hardware is responsible for distributing flows over queues (for instance, via RSS on Intel or Arm platforms).
This parameter overrides the --rxq-per-port option described above.
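The rule "one RX queue per occurrence of a port number" can be checked with a small sketch before writing a mapping. The function name `count_rxq_per_port` and the sample mapping `c1=0:1/c2=0` are hypothetical; the mapping is only built from the syntax described above.

```shell
# Count how many RX queues a -t mapping would create on each port,
# following the rule "one RX queue per occurrence of a port number".
count_rxq_per_port() {
    # $1: mapping such as "c1=0:1/c2=0"
    printf '%s\n' "$1" |
        tr '/' '\n' |          # one "cN=ports" entry per line
        cut -d= -f2 |          # keep only the port list of each core
        tr ':' '\n' |          # one port number per line
        sort -n | uniq -c |    # count occurrences of each port
        awk '{ printf "port %s: %s RX queue(s)\n", $2, $1 }'
}

# Core 1 polls ports 0 and 1, core 2 polls port 0.
count_rxq_per_port "c1=0:1/c2=0"
# prints:
# port 0: 2 RX queue(s)
# port 1: 1 RX queue(s)
```

Comparing these counts with the number of RX queues the NIC supports helps spot a mapping that the hardware cannot satisfy.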
Offloads¶
Description
To support offload features such as TSO or L4 checksum offloading to the NIC, or forwarding offload information from a guest to the NIC through a virtual interface, you must enable offloading in the fast path. You can then tune the offload features more precisely using ethtool -K <iface> <feature> on|off.
--offload
Enable the offload feature in the fast path.
Example
###########################
##### FPN-SDK OPTIONS #####
###########################
: ${FPNSDK_OPTIONS:=--offload}
TX scatter¶
Parameters
--tx-scatter
Optional. Enable the TX scatter feature if it is supported by the hardware. By default, TX scatter is disabled. If the offload feature is enabled, TX scatter is enabled as well.
Cryptography options¶
Description
You can tune the maximum available cryptographic sessions and the number of buffers allocated for cryptography.
--crypto-max-sessions
Tune the maximum number of available cryptographic sessions. The cryptography code reserves space to store sessions in pools. With IPsec, one session is needed per SA.
--crypto-buffers
Tune the number of buffers allocated for cryptography. The crypto library allocates buffers from pools to store per-buffer crypto parameters. This parameter is the size of the buffer pool.
Example
###########################
##### FPN-SDK OPTIONS #####
###########################
: ${FPNSDK_OPTIONS:=--crypto-max-sessions=8192 --crypto-buffers=32768}
Cryptographic offloading mask¶
Description
Specify which fast path cores are able to do cryptographic operations for other cores. This cryptographic offloading is done only between cores on the same NUMA node. By default, crypto offloading is done only for decrypt operations and only to cores that are not receiving traffic at that moment.
-i <crypto_offloading mask>
Specify the fast path cores available for crypto offloading. By default, all cores specified in fp_mask are used. The mask can be a list of CPUs or a mask starting with 0x. Two specific strings are also available: none to disable the crypto offloading feature, and all (the default value). It is preferred to set the cryptographic offloading mask in the fast path configuration file using the option : ${CRYPTO_OFFLOAD_MASK:=<value>}.
Example
: ${CRYPTO_OFFLOAD_MASK:=none}
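To relate the two mask formats, the sketch below converts a list of CPU numbers into the equivalent 0x mask. The helper name `cpus_to_mask` is hypothetical, and the list syntax is an assumption: the sketch only handles plain comma-separated numbers, since the exact list format accepted by the option is not detailed here.

```shell
# Convert a comma-separated list of CPU numbers into a 0x-prefixed mask.
# Assumption: the list contains plain numbers only (no ranges).
cpus_to_mask() {
    mask=0
    old_ifs=$IFS
    IFS=,
    for cpu in $1; do
        mask=$(( mask | (1 << cpu) ))   # set the bit of each listed CPU
    done
    IFS=$old_ifs
    printf '0x%x\n' "$mask"
}

cpus_to_mask "1,2,3"   # prints 0xe (bits 1, 2 and 3 set)
```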
Statistics of offloaded cryptographic operations can be retrieved with:
crypto-offload-stats
It is possible to enable or disable cryptographic offload for packet encryption. Be aware that enabling encryption offload for an IPsec tunnel that aggregates many flows managed by several fast path cores can cause IPsec errors due to the anti-replay window.
crypto-offload-encrypt-set on|off
It is possible to disable cryptographic offload for small packets by setting a minimum data size threshold:
crypto-offload-threshold-set <min>
- <min>
Minimal size of data to offload cryptographic operations. By default, any packet can be offloaded (size set to 0).
To avoid offloading cryptographic operations to a core that receives heavy traffic, it is possible to exclude overloaded cores from cryptographic offloading for a configurable time period.
crypto-offload-timer-set <time>
- <time>
Time, in microseconds, during which an overloaded core is ignored for cryptographic offloading. Default value is 10 milliseconds.
Intercore options¶
Description
You can tune the size of the intercore rings, used for pipeline implementation.
--intercore-ring-size
One ring is allocated for each core, to store messages sent from other cores. Use this option to change the size of the rings. Default value is CONFIG_MCORE_INTERCORE_RING_SIZE, as specified in /etc/6WINDGate/fpnsdk.config.
Alternately, you can specify : ${INTERCORE_RING_SIZE:=<value>} in the configuration file.
Intercore implementation¶
The following functions have been added to the DPDK mbuf API:
m_set_process_fct()
m_call_process_fct()
These functions use the mbuf headroom to store a fpn_callback structure.
To avoid updating mbuf internal pointers, the m_prepend/m_adj functions are not used, yet the checks on available lengths in headroom are the same.
Since m_prepend/m_adj are not called, once m_set_process_fct() has been called, no operation can be done on the mbuf until m_call_process_fct() is called.
Software scheduling implementation¶
The software scheduling API is implemented on top of the DPDK librte_sched library.
Linux / fast path communication¶
Parameters
-l <exception mask>
Specify the fast path cores involved in polling packets coming from Linux over DPVI interfaces. By default, all cores specified in fp_mask are used. All packets locally sent by the Linux stack are forwarded to the fast path and processed by the selected cores.
-e <DPVI_MASK>
Specify the control plane cores selected to process exception packets issued by the fast path. By default, one core is reserved in Linux to process the exception packets. The fast path relies on RSS or on the flow director tag value to select the DPVI core.
Alternately, you can specify : ${DPVI_MASK:=<value>} in the configuration file.
RX/TX descriptors and thresholds¶
Parameters
--nb-mbuf=<total mbuf number>|<sock0-mbuf-number,sock1-mbufs-number,...>
Specify the number of mbufs to add in the pool (default is 16384).
It is preferred to set the number of mbufs in the fast path configuration file using the variable : ${NB_MBUF:=<value>}, as it allows setting the value to auto, which lets the wizard calculate how many mbufs are required. The number of mbufs needed cannot easily be computed manually, as some features like TCP use additional mbufs.
Optimal performance is reached when there are as few mbufs as possible. However, mbuf allocation failure can lead to unexpected behavior.
--nb-rxd=[RX descriptor number]
Optional. Specify the number of RX descriptors allocated to the NIC. It is highly recommended to use a power of 2 to be compliant with any PMD. The minimal value is 64. Default is 128.
Important
For Intel Ethernet Controller XL710, specify 1024 RX descriptors.
--nb-txd=[TX descriptor number]
Optional. Specify the number of TX descriptors allocated to the NIC. It is highly recommended to use a power of 2 to be compliant with any PMD. The minimal value is 64. Default is 512.
--soft-queue=[<port number>=<additional TX desc number>/default=<additional TX desc number>]
Optional. Add a software queue at FPN-SDK level. This is particularly useful for NICs where it is not possible to configure the number of TX descriptors. For performance reasons, this value must be a power of 2. The maximal value is 32768. Default is 0 (i.e. no software queue).
- port number
Fast path port number preceded by p, or default to apply the configuration to all ports.
- additional TX desc number
Number of additional TX descriptors allocated at FPN-SDK level.
Alternately, you can specify : ${SOFT_TXQ_SIZE:=<value>} in the configuration file. Only the default value can be modified this way.
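The constraints stated for --nb-rxd and --nb-txd (a minimum of 64, preferably a power of 2) can be checked before editing the configuration. The following is a minimal sketch; the helper name `check_nb_desc` is hypothetical and is not part of the fast path scripts.

```shell
# Return success if the value is at least 64 and a power of 2.
check_nb_desc() {
    n=$1
    # A power of 2 has a single bit set, so n & (n - 1) is 0.
    [ "$n" -ge 64 ] && [ $(( n & (n - 1) )) -eq 0 ]
}

for value in 128 512 1000; do
    if check_nb_desc "$value"; then
        echo "$value: ok"
    else
        echo "$value: not a power of 2 >= 64"
    fi
done
```

The same bit test applies to the --soft-queue size, which must also be a power of 2, although its lower and upper bounds differ.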
Managing RETA entries¶
RETAs are per-port configurable tables used by the NIC controllers’ RSS filtering feature to select the RX queue into which to store an IP input packet. When receiving an IPv4 or an IPv6 packet, the controller computes a 32-bit hash based on:
- the source address and the destination address in the packet’s IP header,
- the source port and the destination port in the UDP/TCP header, if any.
The controller then uses the RSS hash value to compute a RETA index; the RETA entry at that index gives the number of the RX queue in which the packet is stored.
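The index computation can be illustrated with a short sketch. It assumes that the index is taken from the low-order bits of the hash and that the table size is a power of 2; the exact method depends on the NIC, so treat this as an illustration only. The function name `reta_index` and the sample hash value are hypothetical.

```shell
# Illustrative only: assumes the RETA index is the low-order bits of the
# RSS hash and that the number of RETA entries is a power of 2.
reta_index() {
    # $1: 32-bit RSS hash value, $2: number of RETA entries
    echo $(( $1 & ($2 - 1) ))
}

reta_index 0x12345678 128   # prints 120 (0x78, the 7 low-order bits)
```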
The DPDK API includes a function that is exported by PMDs to configure a port’s RETA entries.
Note
The number of RETA entries depends on your NIC:
NIC          RETA entries
Intel 1GbE   128
Intel 10GbE  128
Intel 40GbE  512
For test purposes, the testpmd application includes the following command to configure RETA entries:
port config X rss reta (hash_index,queue_number)[,(hash_index,queue_number)]
- X
Port for which to configure RETA entries.
- hash_index
Index of a RETA entry.
- queue_number
RX queue number to be stored in the RETA entry.
Configure a port’s RETA entries:
dpdk-rss-reta-set <Pi> index <i>[ <j>[ ...]] queue <Qj>
- <Pi>
Port number
- index <i>[ <j>[ …]]
Space-separated or tab-separated list of RETA entries.
- queue <Qj>
RX queue index to write in RETA entries.
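Since one command writes a single queue number into a list of entries, spreading a table evenly over several queues takes one command per queue. The sketch below only prints such command lines, following the syntax above; the function name `reta_round_robin` is hypothetical, and the table size of 8 is chosen to keep the output short (real tables are larger).

```shell
# Print one dpdk-rss-reta-set command per queue, assigning RETA entries
# round-robin: entry i goes to queue (i mod nb_queues).
reta_round_robin() {
    port=$1; reta_size=$2; nb_queues=$3
    q=0
    while [ "$q" -lt "$nb_queues" ]; do
        indexes=""
        i=$q
        while [ "$i" -lt "$reta_size" ]; do
            indexes="$indexes $i"          # entries q, q+n, q+2n, ...
            i=$(( i + nb_queues ))
        done
        echo "dpdk-rss-reta-set $port index$indexes queue $q"
        q=$(( q + 1 ))
    done
}

reta_round_robin 0 8 2
# prints:
# dpdk-rss-reta-set 0 index 0 2 4 6 queue 0
# dpdk-rss-reta-set 0 index 1 3 5 7 queue 1
```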
Display a port’s RETA entries:
dpdk-rss-reta <Pi> [index <i> [<j>]]
- <Pi>
Port number.
- index <i>
Optional. RETA entry index.
- index <i> [<j>]
Optional. RETA entries range indices. By default, all RETA entries are displayed.
Select the method used to compute the IP input packets RSS hash value:
dpdk-rss-hash-func-set <Pi> [<HF>[ <HF>[ ...]]]
HF := { ipv4|ipv4-frag|ipv4-tcp|ipv4-udp|ipv4-sctp|ipv4-other|
        ipv6|ipv6-frag|ipv6-tcp|ipv6-udp|ipv6-sctp|ipv6-other|
        l2-payload|ipv6-ex|ipv6-tcp-ex|ipv6-udp-ex|
        port|vxlan|geneve|nvgre }
- <Pi>
Port number.
- [<HF>[ <HF>[ …]]]
Space-separated or tab-separated list of RSS hash functions among the following: ipv4, ipv4-frag, ipv4-tcp, ipv4-udp, ipv4-sctp, ipv4-other, ipv6, ipv6-frag, ipv6-tcp, ipv6-udp, ipv6-sctp, ipv6-other, l2-payload, ipv6-ex, ipv6-tcp-ex, ipv6-udp-ex, port, vxlan, geneve, nvgre. RSS filtering is disabled if no hash function is supplied.
Display the set of methods currently used to compute the IP input packets RSS hash value:
dpdk-rss-hash-func <Pi>
- <Pi>
Port number.
Set the 40-byte RSS hash key used to compute the IP input packets RSS hash value:
dpdk-rss-hash-key-set <Pi> <key>
- <Pi>
Port number.
- <key>
40-byte key as a contiguous set of 80 hexadecimal digits (2 hexadecimal digits per byte).
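A key in this format can be produced with standard tools. The sketch below generates a random 40-byte key and prints the corresponding command line; port 0 is a hypothetical example, and whether a random key suits your traffic distribution is left to your own testing.

```shell
# Generate a random 40-byte key as 80 contiguous hexadecimal digits.
# od -v prints every byte (no "*" line folding); tr strips the
# spaces and newlines that od inserts between bytes.
key=$(head -c 40 /dev/urandom | od -An -v -tx1 | tr -d ' \n')

echo "dpdk-rss-hash-key-set 0 $key"
```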
Display the current 40-byte RSS hash key used to compute the IP input packets RSS hash value:
dpdk-rss-hash-key <Pi>
- <Pi>
Port number.