Usage

FPN-SDK Add-on for DPDK automatically starts when you start the fast path with the following script:

# fast-path.sh start

See also

For more information on how to start the fast path, see the Fast Path Baseline documentation.

Providing options

The FPN-SDK can take several arguments on the fast path command line. Most of the usual options are automatically added by the fast path start script while parsing the fast-path.env configuration file.

The configuration variables EAL_OPTIONS (for DPDK EAL options) and FPNSDK_OPTIONS (for FPN-SDK options) of the fast path configuration file may contain additional options that are described here.

Useful DPDK EAL options

Here are the most useful EAL options to set in the EAL_OPTIONS variable:

--log-level=<LOGLEVEL>

Available since dpdk-1.8.0. Optional. Set the DPDK log level. When set to n, all DPDK driver messages with a log level lower than or equal to n are printed. LOGLEVEL can have one of the following values:

  • 1: EMERG System is unusable.

  • 2: ALERT Action must be taken immediately.

  • 3: CRIT Critical conditions.

  • 4: ERR Error conditions.

  • 5: WARNING Warning conditions.

  • 6: NOTICE Normal but significant condition.

  • 7: INFO Informational.

  • 8: DEBUG Debug-level messages.

Any other EAL option can be passed by modifying the EAL_OPTIONS variable.
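For example, to raise the DPDK log level to INFO in fast-path.env (an illustrative value; keep any EAL options you already set):

: ${EAL_OPTIONS:=--log-level=7}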

See also

For the full list of options, see the DPDK documentation.

FPN-SDK options

Here are the most useful FPN-SDK options to set in the FPNSDK_OPTIONS variable:

Log options

Parameters

--logmode=[console|syslog]

Optional. Specify where fast path logs should be displayed: console or syslog. By default, logs are sent to syslog.

--debug-rpc

If specified, enable fpn-rpc debug logs.

--no-shm-hugetlb

If specified, do not attempt to map shared memory in hugepages.
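For example, to display logs on the console and enable fpn-rpc debug logs (an illustrative combination):

: ${FPNSDK_OPTIONS:=--logmode=console --debug-rpc}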

Vlan stripping option

Parameters

--vlan-strip

Optional. Enable the VLAN header stripping feature on incoming frames, if supported by the hardware. By default, the VLAN stripping feature is disabled.

Alternatively, you can specify : ${VLAN_STRIP:=on} in the configuration file.

Port options

Parameters

--nb-rxq=[all|phys|virt]:[number of rx queues]

Optional. Specify the number of queues per port for each device type. Useful when using automatic mapping of cores to ports. By default, the maximum number of queues is allocated to each port. Cannot be used together with -t.

The device types that can be configured are phys (i.e. any physical network device, like Intel or Mellanox NICs) and virt (i.e. any virtual network device, like vhost-user ports).
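For example, to limit each physical port to four RX queues (an illustrative value):

: ${FPNSDK_OPTIONS:=--nb-rxq=phys:4}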

Core / Port binding

-t [<core number>=<port number>/<core number>=<port number>:<port number>]

core number

Fast path logical core number preceded by c.

port number

Number of the port to poll. For each occurrence, one RX queue is created.

Specify which ports each logical core polls, and with how many RX queues. Unspecified logical cores and ports are idle.

It is preferred to set the core / port binding in the fast path configuration file using the option : ${CORE_PORT_MAPPING:=<value>}.

If this parameter is not set, each core polls all the ports on the same NUMA socket. If there are not enough RX queues, only a subset of these cores polls the port. If the maximum number of TX queues is smaller than the number of fast path cores, the port is configured with a single shared TX queue. The total number of queues per port must not exceed the NIC’s hardware limit, otherwise only a shared queue is used.

The hardware is responsible for distributing flows over queues (for instance, via RSS on Intel or Arm platforms).

This parameter overrides the --nb-rxq option described above.
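For example, assuming the syntax described above (illustrative values), the following mapping makes core 1 poll port 0, and core 2 poll ports 0 and 1:

: ${CORE_PORT_MAPPING:=c1=0/c2=0:1}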

Offloads

Description

To support offload features such as TSO or L4 checksum offloading to the NIC, or forwarding offload information from a guest to the NIC through a virtual interface, you must enable offloading in the fast path. You can then tune the offload features more precisely using ethtool -K <iface> <feature> on|off.

--offload

Enable the offload feature in the fast path.

Example

###########################
##### FPN-SDK OPTIONS #####
###########################

: ${FPNSDK_OPTIONS:=--offload}

TX scatter

Parameters

--tx-scatter

Optional. Enable the TX scatter feature, if supported by the hardware. By default, TX scatter is disabled. If the offload feature is enabled, the TX scatter feature is enabled as well.
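For example (illustrative), to enable TX scatter alone:

: ${FPNSDK_OPTIONS:=--tx-scatter}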

Cryptography options

Description

You can tune the maximum available cryptographic sessions and the number of buffers allocated for cryptography.

--crypto-max-sessions

Tune the maximum number of available cryptographic sessions. The cryptography code reserves space in pools to store sessions. When using IPsec, one session is needed per SA.

--crypto-buffers

Tune the number of buffers allocated for cryptography. The crypto library allocates buffers from pools to store per-buffer crypto parameters. This parameter is the size of the buffer pool.

Example

###########################
##### FPN-SDK OPTIONS #####
###########################

: ${FPNSDK_OPTIONS:=--crypto-max-sessions=8192 --crypto-buffers=32768}

Cryptographic offloading mask

Description

Specify which fast path cores may perform cryptographic operations on behalf of other cores. Cryptographic offloading is only done between cores on the same NUMA node. By default, crypto offloading is done only for decrypt operations, and only to cores that are not currently receiving traffic.

-i <crypto_offloading mask>

Specify the fast path cores available for crypto offloading. By default, all cores specified in fp_mask are used. The mask can be a list of CPUs, or a mask starting with 0x. Two specific strings are also available: none, to disable the crypto offloading feature, and all (the default value). It is preferred to set the cryptographic offloading mask in the fast path configuration file using the option : ${CRYPTO_OFFLOAD_MASK:=<value>}.

Example

: ${CRYPTO_OFFLOAD_MASK:=none}

  • Statistics of offloaded cryptographic operations can be retrieved with:

    crypto-offload-stats
    
  • It is possible to enable/disable cryptographic offload for packet encryption. Be aware that enabling encryption offload for an IPsec tunnel that aggregates many flows managed by several fast path cores can cause IPsec errors due to the anti-replay window.

    crypto-offload-encrypt-set on|off
    
  • It is possible to disable cryptographic offload for small packets by setting a minimum data size threshold:

    crypto-offload-threshold-set <min>
    
    <min>

    Minimum size of data to offload cryptographic operations. By default, any packet can be offloaded (size set to 0).

  • To avoid offloading cryptographic operations to cores that receive high traffic, it is possible to disable cryptographic offload to overloaded cores for a configurable time period.

    crypto-offload-timer-set <time>
    
    <time>

    Time, in microseconds, during which an overloaded core is ignored for cryptographic offloading. The default value is 10 milliseconds.

Intercore options

Description

You can tune the size of the intercore rings, used for pipeline implementation.

--intercore-ring-size

One ring is allocated for each core, to store messages sent from other cores. Use this option to change the size of the rings. Default value is CONFIG_MCORE_INTERCORE_RING_SIZE, as specified in /etc/6WINDGate/fpnsdk.config.

Alternatively, you can specify : ${INTERCORE_RING_SIZE:=<value>} in the configuration file.
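For example (illustrative value):

: ${INTERCORE_RING_SIZE:=1024}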

Intercore implementation

The following functions have been added to the DPDK mbuf API:

  • m_set_process_fct()

  • m_call_process_fct()

These functions use mbuf headroom to store a fpn_callback structure.

To avoid updating the mbuf internal pointers, the m_prepend/m_adj functions are not used; the checks on the length available in the headroom are the same, however.

Since m_prepend/m_adj are not called, once m_set_process_fct() has been called, no operation can be performed on the mbuf until m_call_process_fct() is called.

Software scheduling implementation

The software scheduling API is implemented on top of the DPDK librte_sched library.

Linux / fast path communication

Parameters

-l <exception mask>

Specify the fast path cores involved in polling packets coming from Linux over fpvi interfaces. By default, all cores specified in fp_mask are used. All packets locally sent by the Linux stack are forwarded to the fast path and processed by the selected cores.

--nb-fpvi-queue=[all|phys|virt|excp]:<number of queues>

Specify the number of queues per fpvi port. Each fast path port has a tuntap netdevice used to send/receive packets between the fast path and the Linux kernel. The number of queues of the tuntap device can be configured with a different value depending on the devtype of the associated fast path port. The different devtypes are: phys (i.e. any physical network device, like Intel or Mellanox NICs), virt (i.e. any virtual network device, like vhost-user ports) and excp (i.e. fpn0).
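For example (illustrative values, assuming -l accepts the same mask format as -i), to restrict exception-packet polling to cores 0 and 1 and create two tuntap queues per physical port:

: ${FPNSDK_OPTIONS:=-l 0x3 --nb-fpvi-queue=phys:2}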

RX/TX descriptors and thresholds

Parameters

--max-gro-mbufs=<max-phys-mbuf-number,max-virt-mbuf-number>

Specify the maximum number of mbufs that can be stored in the GRO reassembly queues for physical and virtual ports. This value is per port and per core: the total number of mbufs required for GRO reassembly is (total number of physical ports * first value + total number of virtual ports * second value) * total number of cores.

This parameter helps allocate more precisely the amount of mbufs required for the fast path to work properly.

The value must contain two integers separated by a comma. The first integer is for physical ports, the second one for virtual ports.

Each integer must be between 0 and 65535.
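For example (illustrative values), with --max-gro-mbufs=512,256, 2 physical ports, 1 virtual port and 4 fast path cores, GRO reassembly requires up to (2 * 512 + 1 * 256) * 4 = 5120 mbufs:

: ${FPNSDK_OPTIONS:=--max-gro-mbufs=512,256}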

--max-vports=<max number of dynamic vdev>

Specify the maximum number of vports that can be created using the fp-vdev command. Increasing this value increases the number of preallocated mbufs.

This value cannot exceed 300, and the total number of ports (physical and virtual) cannot exceed CONFIG_MCORE_FPN_MAX_PORTS.

--nb-mbuf=<total mbuf number>|<sock0-mbuf-number,sock1-mbufs-number,...>

Specify the number of mbufs to add in the pool (default is 16384).

It is preferred to set the number of mbufs in the fast path configuration file using the variable : ${NB_MBUF:=<value>}, as it allows setting the value to auto, which lets the wizard calculate how many mbufs are required. Indeed, the number of mbufs needed cannot easily be computed manually, as some features, like TCP, use additional mbufs.

Optimal performance is reached when there are as few mbufs as possible. However, mbuf allocation failure can lead to unexpected behavior.

See also

If vhost ports are created, additional mbufs should be pre-allocated. This is automatically done by the wizard when setting : ${MAX_VPORT:=<value>} in fast-path.env.

Each software queue can hold at most SOFT_TXQ_SIZE mbufs; this size is defined in fast-path.env with : ${SOFT_TXQ_SIZE:=<value>}. Therefore, NB_MBUF has to be increased by SOFT_TXQ_SIZE * (MAX_VPORT + PHYS_PORTS), where PHYS_PORTS is the number of physical ports.
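For example, letting the wizard compute the required pool size:

: ${NB_MBUF:=auto}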

--mbuf-rx-size=<size>

Specify the size of Rx data (excluding headroom) inside each mbuf. The default value is 2176. Changing this value may affect performance.

The equivalent option in the fast path configuration file is : ${MBUF_RX_SIZE:=<value>}.

--nb-rxd=[RX descriptor number]

Optional. Specify the number of RX descriptors allocated to the NIC. It is highly recommended to use a power of 2 to be compliant with any PMD. The minimal value is 64. Default is 128.

Important

For Intel Ethernet Controller XL710, specify 1024 RX descriptors.

--nb-txd=[TX descriptor number]

Optional. Specify the number of TX descriptors allocated to the NIC. It is highly recommended to use a power of 2 to be compliant with any PMD. The minimal value is 64. Default is 512.

--nb-rxdtxd-fpvi=[all|phys|virt|excp]:<FPVI RXTX descriptor number>

Optional. Specify the number of RX/TX descriptors allocated to the fpvi NIC. It is highly recommended to use a power of 2. The minimal value is 64. Default value is 512.

Specify the number of descriptors in each RX/TX fpvi queue. Each fast path port has a tuntap netdevice used to send/receive packets between the fast path and the Linux kernel. The number of descriptors of the tuntap device can be configured with a different value depending on the devtype of the associated fast path port. The different devtypes are: phys (i.e. any physical network device, like Intel or Mellanox NICs), virt (i.e. any virtual network device, like vhost-user ports) and excp (i.e. the internal exception port, fpn0).

--soft-queue=[<port number>=<additional TX desc number>/default=<additional TX desc number>]

Optional. Add a software queue at FPN-SDK level. This is particularly useful for NICs that do not allow configuring the number of TX descriptors. For performance reasons, this value must be a power of 2. The maximum value is 32768. Default is 0 (i.e. no software queue).

port number

Fast path port number preceded by p, or default to apply the configuration to all ports.

additional TX desc number

Number of additional TX descriptors allocated at FPN-SDK level.

Alternatively, you can specify : ${SOFT_TXQ_SIZE:=<value>} in the configuration file. Only the default value can be modified this way.
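For example (illustrative values), to add a 2048-descriptor software queue on port 0 and 1024-descriptor queues on all other ports:

: ${FPNSDK_OPTIONS:=--soft-queue=p0=2048/default=1024}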

Control Plane protection options

mode

Parameters

--rx-cp-filter-mode=[all|phys|virt|excp]:[none|software-filter|hardware-filter|dedicated-queue]
--tx-cp-filter-mode=[all|phys|virt|excp]:[none|software-filter]

Optional. Choose control plane protection mode on RX or TX for each device type.

The device types that can be configured are phys (i.e. any physical network device, like Intel or Mellanox NICs), virt (i.e. any virtual network device, like vhost-user ports) and excp (i.e. any device in charge of sending exception packets, like fpvi using vhost).

For each device type, the following values are possible:

  • phys: hardware-filter or dedicated-queue (only with Mellanox NICs; falls back to software-filter for other NICs), software-filter or none (default value is none)

  • virt: software-filter or none (default value is none)

  • excp: software-filter or none, in both RX and TX (default value is none in RX and software-filter in TX)

Example

###########################
##### FPN-SDK OPTIONS #####
###########################

: ${FPNSDK_OPTIONS:=--rx-cp-filter-mode=phys:hardware-filter,virt:software-filter --tx-cp-filter-mode=all:none}

threshold

Parameters

--rx-cp-filter-threshold=[all|phys|virt|excp]:[threshold number][%]
--tx-cp-filter-threshold=[all|phys|virt|excp]:[threshold number][%]

Optional. Configure the control plane protection threshold on RX or TX, for the software-filter or hardware-filter mode. The value can be provided as a fixed number, or as a percentage using the % suffix. By default, the control plane protection threshold is 50% in both RX and TX.

Example

###########################
##### FPN-SDK OPTIONS #####
###########################

: ${FPNSDK_OPTIONS:=--rx-cp-filter-threshold=phys:2048,virt:10% --tx-cp-filter-threshold=all:50%}

fast path port name

Parameters

--netdev-name "0:port1,1:port2[,...]"

Override the name of the FPVI kernel netdevice that is created for each fast path port. Alternatively, you can specify : ${NETDEV_NAME:='<port>:<value>[,...]'} in the configuration file.

Example

: ${NETDEV_NAME:=0:fp1,1:fp2}

Managing RETA entries

RETAs are per-port configurable tables used by the NIC controllers’ RSS filtering feature to select the RX queue into which to store an IP input packet. When receiving an IPv4 or an IPv6 packet, the controller computes a 32-bit hash based on:

  • the source address and the destination address in the packet’s IP header,

  • the source port and the destination port in the UDP/TCP header, if any.

The controller then uses the RSS hash value to compute an index into the RETA, which gives the number of the RX queue in which to store the packet.

The DPDK API includes a function that is exported by PMDs to configure a port’s RETA entries.

Note

  • The number of RETA entries depends on your NIC:

    NIC            RETA entries
    Intel 1GbE     128
    Intel 10GbE    128
    Intel 40GbE    512

  • For test purposes, the testpmd application includes the following command to configure RETA entries:

    port config X rss reta (hash_index,queue_number)[,(hash_index,queue_number)]

    X

    Port for which to configure RETA entries.

    hash_index

    Index of a RETA entry.

    queue_number

    RX queue number to be stored in the RETA entry.

  • Configure a port’s RETA entries:

    dpdk-rss-reta-set <Pi> index <i>[ <j>[ ...]] queue <Qj>
    
    <Pi>

    Port number

    index <i>[ <j>[ …]]

    Space-separated or tab-separated list of RETA entry indices.

    queue <Qj>

    RX queue index to write in RETA entries.
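    For example (illustrative values), to map RETA entries 0 to 3 of port 0 to RX queue 1:

    dpdk-rss-reta-set 0 index 0 1 2 3 queue 1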

  • Display a port’s RETA entries:

    dpdk-rss-reta <Pi> [index <i> [<j>]]
    
    <Pi>

    Port number.

    index <i>

    Optional. RETA entry index.

    index <i> [<j>]

    Optional. Start and end indices of a range of RETA entries. By default, all RETA entries are displayed.

  • Select the method used to compute the IP input packets RSS hash value:

    dpdk-rss-hash-func-set <Pi> [<HF>[ <HF>[ ...]]]
          HF := { ipv4|ipv4-frag|ipv4-tcp|ipv4-udp|ipv4-sctp|ipv4-other|
                  ipv6|ipv6-frag|ipv6-tcp|ipv6-udp|ipv6-sctp|ipv6-other|
                  l2-payload|ipv6-ex|ipv6-tcp-ex|ipv6-udp-ex|
                  port|vxlan|geneve|nvgre }
    
    <Pi>

    Port number.

    [<HF>[ <HF>[ …]]]

    Space-separated or tab-separated list of RSS hash functions among the following: ipv4, ipv4-frag, ipv4-tcp, ipv4-udp, ipv4-sctp, ipv4-other, ipv6, ipv6-frag, ipv6-tcp, ipv6-udp, ipv6-sctp, ipv6-other, l2-payload, ipv6-ex, ipv6-tcp-ex, ipv6-udp-ex, port, vxlan, geneve, nvgre. RSS filtering is disabled if no hash function is supplied.
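    For example (illustrative), to restrict the RSS hash computation of port 0 to TCP flows:

    dpdk-rss-hash-func-set 0 ipv4-tcp ipv6-tcp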

  • Display the set of methods currently used to compute the IP input packets RSS hash value:

    dpdk-rss-hash-func <Pi>
    
    <Pi>

    Port number.

  • Set the 40-byte RSS hash key used to compute the IP input packets RSS hash value:

    dpdk-rss-hash-key-set <Pi> <key>
    
    <Pi>

    Port number.

    <key>

    40-byte key, as a contiguous string of 80 hexadecimal digits (2 hexadecimal digits per byte).
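    For example, setting an arbitrary, illustrative 40-byte key on port 0:

    dpdk-rss-hash-key-set 0 6d5a56da255b0ec24167253d43a38fb0d0ca2bcbae7b30b477cb2da38030f20c6a42b73bbeac01fa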

  • Display the current 40-byte RSS hash key used to compute the IP input packets RSS hash value:

    dpdk-rss-hash-key <Pi>
    
    <Pi>

    Port number.