Usage

Tools

fp-cpu-usage

Description

Display the percentage of CPU usage spent processing packets.

Packets can come from a NIC or from intercore rings (mainly due to offload of cryptographic operations). The average number of cycles per packet is provided for each of these two kinds of packets.

Synopsis

# fp-cpu-usage [-q|--quiet] [-d|--delay <delay>] [-j|--json] [-h|--help]

Parameters

-q, --quiet

Display fast path logical core usage in quiet mode: only report whether each core is alive.

-d, --delay

Duration of the CPU usage polling, in microseconds (default: 200000).

-j, --json

Display information in JSON format.

-h, --help

Display help.

Example

# fp-cpu-usage
Fast path CPU usage:
cpu: %busy     cycles   cycles/packet  cycles/ic pkt
  2:   99%  697179716             829              0
  4:   51%  363169408               0           1729
  6:   54%  383451844               0           1825
 16:   51%  362776960               0           1727
 18:   54%  382313120               0           1820
average cycles/packets received from NIC: 2626 (2206989016/840180)
# fp-cpu-usage -q -d 100000
Fast path CPU usage (quiet):
cpu: status
 19: alive
 20: alive
 27: alive
 34: alive

 100% CPUs alive
# fp-cpu-usage -j
{
  "average_cycles_per_packet": 2626,
  "total_cycles": 2206989016,
  "cpus": [
    {
      "busy": 99,
      "cpu": 2,
      "cycles": 697179716,
      "cycles_per_packet": 829,
      "cycles_per_ic_pkt": 0
    },
    {
      "busy": 51,
      "cpu": 4,
      "cycles": 363169408,
      "cycles_per_packet": 0,
      "cycles_per_ic_pkt": 1729
    },
    {
      "busy": 54,
      "cpu": 6,
      "cycles": 383451844,
      "cycles_per_packet": 0,
      "cycles_per_ic_pkt": 1825
    },
    {
      "busy": 51,
      "cpu": 16,
      "cycles": 362776960,
      "cycles_per_packet": 0,
      "cycles_per_ic_pkt": 1727
    },
    {
      "busy": 54,
      "cpu": 18,
      "cycles": 382313120,
      "cycles_per_packet": 0,
      "cycles_per_ic_pkt": 1820
    }
  ],
  "total_packets": 840180
}
# fp-cpu-usage -h
Fastpath CPUs usage:
fp-cpu-usage [-q|--quiet] [-d|--delay] [-h|--help]
  -q, --quiet               Display fastpath CPUs usage in quiet mode.
  -d, --delay               Duration of CPUs usage dump in microsecond
                            (default 200000us)
  -j, --json                Display fast path logical cores usage in json format.
  -h, --help                Display this help.
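
The JSON output is convenient for scripting. As a minimal sketch (assuming the jq utility is installed), the average cycles per packet, that is total_cycles divided by total_packets (2206989016 / 840180 ≈ 2626 in the example above), can be extracted with:

# fp-cpu-usage -j | jq '.average_cycles_per_packet'
2626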

fp-shmem-ports

Description

Display and configure the parameters of detected ports at FPN-SDK level.

Synopsis

# fp-shmem-ports <action> <options>

Parameters

-d, --dump

Display FPN-SDK port information. The dump contains the following information:

  • The core frequency

  • The TX offload feature status

  • The list of UDP ports considered as VXLAN ports by the reassembly features

  • One block per managed port, displaying the following information:

    port <port_number>: <port_name> numa <port_numa> bus_info <bus> mac <port_mac> driver <pmd_driver> GRO <timeout>us
       speed <speed> duplex half|full autoneg on|off rx_pause on|off tx_pause on|off autoneg_pause on|off
       <qdir> queues: <n> (max: <m>)
       <feature> on|off
    
    • <port_number>

      Port number

    • <port_name>

      Port name.

    • <port_numa>

      NUMA node of the port (set to ‘no numa’ for architectures without NUMA or with a NUMA-independent PCI bus).

    • <bus>

      The bus information for this port (typically, the PCI address).

    • <port_mac>

      Port’s MAC address.

    • <pmd_driver>

      Driver that manages the port in the fast path.

    • <timeout>

      GRO timeout in us.

    • <speed>

      The link speed in Mb/s.

    • <qdir>

      RX or TX.

    • <n>, <m>

      Number and maximum number of RX or TX queues.

    • <feature>

      Supported offload feature, one of:

      • RX vlan strip

      • RX IPv4 checksum

      • RX TCP checksum

      • RX UDP checksum

      • GRO

      • LRO

      • TX vlan insert

      • TX IPv4 checksum

      • TX TCP checksum

      • TX UDP checksum

      • TX SCTP checksum

      • TSO

-g <timeout>, --gro-timeout=<timeout>

Set the software GRO timeout. timeout is the maximum lapse of time between two coalesced packets. For ACK-only packets in TCP reassembly, the timer is not reloaded to timeout each time an ACK is received; instead, timeout designates the maximum lapse of time during which ACK-only packets are coalesced. A timeout of 10 microseconds gives good reassembly results on 10 Gb links. To be effective, this option must be combined with -K gro on.

-e <eth_port>|all|enabled|disabled, --eth_port=<eth_port>|all|enabled|disabled

Select a given FPN-SDK port. all means all ports, enabled means all enabled ports, and disabled means all disabled ports.

--driver <driver_name>

Select FPN-SDK ports managed by a specific driver.

-K, --features, --offload <feature> on|off

Set or unset an offload feature. Supported features:

  • rx: rx checksum offloads

  • tx: tx checksum offloads

  • tso: TCP segmentation offload

  • gro: Generic receive offload

  • lro: TCP large receive offload

  • mpls-ip: GRO reassembly of MPLS IP flows that do not follow RFC3032.

-k, --show-features, --show-offload

Display the offload feature status.

Examples

  • Display FPN-SDK port information:

    # fp-shmem-ports --dump
    core freq : 2693482113
    offload : enabled
    vxlan ports :
       port 4789 (set by user)
       port 8472 (set by user)
    port 0: ens1f0-vrf0 numa 0 bus_info 0000:00:03.0 mac 90:e2:00:12:34:56 driver rte_ixgbe_pmd GRO timeout 10us
       RX queues: 2 (max: 128)
       TX queues: 2 (max: 64)
       RX vlan strip off
       RX IPv4 checksum on
       RX TCP checksum on
       RX UDP checksum on
       GRO on
       LRO off
       TX vlan insert on
       TX IPv4 checksum on
       TX TCP checksum on
       TX UDP checksum on
       TX SCTP checksum on
       TSO on
    
  • Enable Generic Receive Offload on all enabled ports (reassembly timeout of 10 us):

    # fp-shmem-ports --gro-timeout=10 --eth_port=enabled --offload gro on
    
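  • Display the offload feature status (a minimal sketch; combining --eth_port with --show-offload is an assumption, and no sample output is shown):

    # fp-shmem-ports --eth_port=all --show-offload
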

fp-shmem-ready

Description

Display the name of the shared memory if it is ready for mapping, or Not found if it is not available.

The tool can be used in a script as a sentinel to synchronize multiple applications, because adding a new, very large shared memory instance may take a long time.

Synopsis

# fp-shmem-ready <shmem_name>

Example

# fp-shmem-ready fp-shared
fp-shared
# fp-shmem-ready unknown-name
Not found
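
As an illustration of the sentinel usage, a startup script can poll fp-shmem-ready until the shared memory is available. The following is a minimal sketch (the polling interval is arbitrary):

while [ "$(fp-shmem-ready fp-shared)" != "fp-shared" ]; do
    sleep 1
done
echo "fp-shared is ready for mapping"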

fp-track-dump

Description

Display the per-core history of function names recorded in your application by the FPN_RECORD_TRACK() macro. It can help detect infinite loops.

Synopsis

# fp-track-dump

Example

myfunction()
{
    while (...) {
        FPN_RECORD_TRACK();
        ...
    }
}

fp-track-dump
Core 1
       [23] PC=0x4ec59b RA=0x4e793e Func=myfunction:133 cycles=5383286
       [22] PC=0x4ec341 RA=0x4e793e Func=myfunction:133 cycles=1430467202
       [21] PC=0x4ec59b RA=0x4e793e Func=myfunction:133 cycles=5381148
       [20] PC=0x4ec341 RA=0x4e793e Func=myfunction:133 cycles=715474104
...
Core 2
       [31] PC=0x4ec59b RA=0x4e793e Func=myfunction:133 cycles=5383286
       [30] PC=0x4ec341 RA=0x4e793e Func=myfunction:133 cycles=1430467202
...

fp-intercore-stats

Description

Display the state of intercore structures.

By default, only cores belonging to the intercore mask are displayed; each bit set in the mask corresponds to one core (for instance, mask 0x4004 selects cores 2 and 14, as in the examples below). To display all cores, use the --all parameter.

With the --cpu parameter, fp-intercore-stats can also display the number of cycles spent on packets that went through the pipeline.

Synopsis

# fp-intercore-stats

Examples

# fp-intercore-stats
Intercore information
      mask 0x4004
Core 2
ring <fpn_intercore_2>
  size=512
  ct=0
  ch=0
  pt=0
  ph=0
  used=0
  avail=511
  watermark=0
  bulk_default=1
  no statistics available
Core 14
ring <fpn_intercore_14>
  size=512
  ct=0
  ch=0
  pt=0
  ph=0
  used=0
  avail=511
  watermark=0
  bulk_default=1
  no statistics available
# fp-intercore-stats --all
Intercore information
     mask 0x4004
Core 0 (NOT IN MASK)
ring <fpn_intercore_0>
 size=512
 ct=0
 ch=0
 pt=0
 ph=0
 used=0
 avail=511
 watermark=0
 bulk_default=1
 no statistics available
Core 1 (NOT IN MASK)
ring <fpn_intercore_1>
 size=512
 ct=0
 ch=0
 pt=0
 ph=0
 used=0
 avail=511
 watermark=0
 bulk_default=1
 no statistics available
Core 2
ring <fpn_intercore_2>
 size=512
 ct=0
 ch=0
 pt=0
 ph=0
 used=0
 avail=511
 watermark=0
 bulk_default=1
 no statistics available
...
# fp-intercore-stats --cpu
Fast path CPU usage:
cpu: %busy     cycles   cycles/pkt  cycles/ic pkt
  2:   99%  697179716          829              0
  4:   51%  363169408            0           1729
  6:   54%  383451844            0           1825
  8:   <1%    6180544            0              0
 14:   <1%    5683196            0              0
 16:   51%  362776960            0           1727
 18:   54%  382313120            0           1820
 20:   <1%    6234228            0              0
average cycles/packets received from NIC: 2626 (2206989016/840180)
ic pkt: packets that went intercore

Port control

Statistics

You can display the usual FPN-SDK port statistics via fp-shmem-ports -s.

The extended port statistics are displayed via fp-cli dpdk-port-stats <portid>.
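
A minimal sketch showing both views (the port ID 0 is an assumption, and fpcmd is assumed to run the fp-cli command from the shell, as in the LLDP examples below):

# fp-shmem-ports -s
# fpcmd dpdk-port-stats 0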

The following statistics are available on all interfaces managed by the fast path:

fpn.rxqX_pkts

Number of packets received on the driver Rx queue.

fpn.rxqX_bulks

Number of packet bulks received on the driver Rx queue.

fpn.rxqX_bulks_full

Number of full packet bulks received on this Rx queue. A bulk is full when the hardware Rx queue contains enough packets at the time the driver Rx function is invoked, so this counter can give an indication of the core load.

fpn.rxqX_bulks_qthres

Number of packet bulks received on this Rx queue while the hardware queue length is above a threshold. This counter gives an indication about the load of the core, and about the latency of packet reception. This statistic is disabled by default because it slightly impacts performance. It has to be enabled in fp-cli with dpdk-qthres-stats-set. It is not available on all drivers.

fpn.txqX_pkts

Number of packets successfully transmitted on the driver Tx queue.

fpn.txqX_bulks

Number of packet bulks transmitted on the driver Tx queue.

fpn.txqX_bulks_qthres

Number of packet bulks transmitted on this Tx queue while the hardware queue length is above a threshold. This counter gives an indication about the load of the network link, and about the latency of packet transmission. This statistic is disabled by default because it slightly impacts performance. It has to be enabled in fp-cli with dpdk-qthres-stats-set. It is not available on all drivers.

fpn.txqX_queue_full

This counter is incremented for each packet that is dropped because the Tx function of the driver cannot transmit it. This usually means that the hardware Tx queue is full.

fpn.txqX_queue_disabled

This counter is incremented for each packet transmitted on a disabled Tx queue.

fpn.txqX_offload_failed

This counter is incremented for each packet whose offloaded transmission to hardware fails.

fpn.rx_cp_passthrough

When Control Plane Protection is enabled, this statistic is incremented for each packet received while the machine is not overloaded. These packets are processed normally.

fpn.rx_cp_kept

When the Rx ring filling reaches a threshold, packets are inspected by the Control Plane Protection mechanism. This statistic is incremented for packets recognized as CP packets, which are kept and processed by the stack.

fpn.rx_dp_drop

When the Rx ring filling reaches a threshold, packets are inspected by the Control Plane Protection mechanism. This statistic is incremented for packets recognized as DP packets, which are dropped.

fpn.rx_cp_overrun

When CPU consumption used by Control Plane Protection exceeds a threshold, this system is disabled. This statistic is incremented for each Rx packet not analyzed by Control Plane Protection due to CPU overload.

fpn.tx_cp_passthrough

When Control Plane Protection is enabled, this statistic is incremented for each packet transmitted on a link that is not overloaded. These packets are sent normally.

fpn.tx_cp_kept

When the Tx ring filling reaches a threshold, packets are inspected by the Control Plane Protection mechanism. This statistic is incremented for packets recognized as CP packets, which are sent on the wire.

fpn.tx_dp_drop

When the Tx ring filling reaches a threshold, packets are inspected by the Control Plane Protection mechanism. This statistic is incremented for packets recognized as DP packets, which are dropped.

fpn.tx_cp_overrun

When CPU consumption used by Control Plane Protection exceeds a threshold, this system is disabled. This statistic is incremented for each Tx packet not analyzed by Control Plane Protection due to CPU overload.

The following statistics are available on interfaces where a software queue at FPN-SDK level is defined:

fpn.tx_used_squeue

Number of Tx packets that have been placed in the software queue because the NIC ring was full.

The following additional statistics are available on interfaces where GRO is enabled:

gro.in: Number of packets entering the GRO module
gro.out: Number of packets exiting the GRO module
gro.done: Number of packets merged by the GRO module
gro.per_reass: Mean number of packets per GRO reassembly
gro.ctx_timeout: Number of GRO contexts flushed on timeout
gro.ctx_flush: Number of GRO contexts flushed before timeout
gro.ctx_curr: Current number of GRO contexts being processed

Control the i40e LLDP agent

Description

The following commands are supported to control the i40e LLDP agent:

  • start/stop the LLDP agent

  • get the local/remote LLDP MIB (Management Information Base)

The i40e LLDP agent can be controlled via the fp-cli command.

Examples

# fpcmd dpdk-i40e-debug-lldp-cmd eth1 lldp stop
# fpcmd dpdk-i40e-debug-lldp-cmd eth1 lldp start
# fpcmd dpdk-i40e-debug-lldp-cmd eth1 lldp get local
LLDP MIB (local)
0000   01 80 c2 00 00 0e 68 05 ca 38 6d e8 88 cc 02 07  ......h..8m.....
0010   04 68 05 ca 38 6d e8 04 07 03 68 05 ca 38 6d e8  .h..8m....h..8m.
...
# fpcmd dpdk-i40e-debug-lldp-cmd eth1 lldp get remote