4.2.2. Networking Issues

Port synchronization problems

Symptoms
  • No ports are displayed when calling fp-cli iface.

Hints

  • If you are dealing with a physical NIC, check that it is detected by Linux using lspci (see the example below and the lspci section for details).

  • Check the output from fast-path.sh config --display and make sure your NIC is among the selected ethernet cards.
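
  • For example, to check that a PCI NIC is visible to Linux and then compare it with the Ethernet cards selected for the fast path (the grep filter assumes an Ethernet-class device):

    # lspci | grep -i ethernet
    # fast-path.sh config --display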

No packets are forwarded

Symptoms
  • No packets are forwarded.

  • fp-cli stats non-zero shows no (or low) IpForwDatagrams stats.

  • fp-cli dpdk-port-stats <port> shows no (or low) rx/tx packets stats.

  • ip -s link show <interface> shows no (or low) rx/tx packets stats.

  • kill -USR1 $(pidof fp-rte) (Intel and Arm only) shows no (or low) rx/tx packets stats.

Hints
  • Check whether the configuration in Linux and in the fast path is consistent (see the example below):

    • Check IP addresses and routes configured in the kernel, using ip address show and ip route show. Check whether the interfaces and bridges are up and running using ip link show and brctl show <bridge_name>.

  • Check IP addresses and routes known to the fast path, using fp-cli route4 type all.

  • If you are using bridges, check whether your bridges have correct states, using fp-cli bridge.

  • Check that the fp_dropped fast path statistics are not too high, using fp-cli stats percore non-zero. A high fp_dropped value indicates that many packets were rejected by the fast path. Ideally, forwarding statistics are evenly spread across cores, that is, each core forwards roughly as many packets as the others. See the Fast Path statistics section for an example of stats.

  • Check that the exception fast path statistics are not too high. Basic exceptions count the packets that could not be processed by the fast path and were therefore injected into the Linux stack for slow path processing. A high value is a good indicator that IP addresses, routes or tunnels are badly configured in the fast path. See the Fast Path statistics section for an example of stats.

  • Check whether it works correctly when the fast path is turned off. See Turn Fast Path off section for details.
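
  • For example, the following commands compare the kernel view with the fast path view of addresses, routes, link state and bridges (eth1 is an illustrative interface name):

    # ip address show eth1
    # ip route show
    # ip link show eth1
    # brctl show
    # fp-cli route4 type all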

Netfilter synchronization problems

Symptoms
  • Packets are not filtered according to your iptables rules.

Hints
  • Check whether the filtering rules in Linux and in the fast path are consistent (see the example below):

    • Check filtering rules in the kernel, using ip[6]tables -S. Refer to the ip[6]tables manpage for details on this command.

    • Check filtering rules known to the fast path, using fp-cli nf[4|6]-table <filter|mangle|nat> [all|nonzero]. Check also whether the filtering module is enabled, using fp-cli nf[4|6]. Some targets and rules are not supported in the fast path: check that you are using only documented supported options.
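
    • For example, to compare the IPv4 filter rules on both sides, following the command forms above (IPv6 follows the same pattern with ip6tables and nf6):

      # iptables -S
      # fp-cli nf4
      # fp-cli nf4-table filter nonzero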

Connectivity problems

Symptoms
  • I can no longer connect (via the network) to my machine.

  • The VM was configured to redirect connections to the guest (using something like -netdev user,id=user.0,hostfwd=tcp::35047-:22).

Hints
  • When Virtual Service Router starts, the NIC kernel drivers are unloaded, so all IP configuration on those interfaces is lost.

Network configuration lost after restart

Symptoms
  • My network configuration no longer works after a reboot or a Virtual Service Router restart. For example:

    1. My Linux bridge is empty after stopping (or restarting) the fast path:

      # brctl show
      bridge name     bridge id               STP enabled     interfaces
      
Hints
  • The fast path may replace, change, delete and create netdevices. Any tool (brctl, iproute2, etc.) that uses “old” references to netdevices must have its configuration refreshed when the fast path is stopped.

    • For a Linux bridge, recreate the bridge and re-add the ports if need be, e.g.:

      # brctl addbr br0
      # brctl addif br0 eth1
      # brctl addif br0 tap0
      

DKMS takes too long

Symptoms
  • Modules recompilation/removal with DKMS takes too long.

Hints
  • Edit the DKMS configuration in /etc/dkms/framework.conf, to prevent it from running some long operations:

    # mkdir -p /etc/dkms
    # echo 'no_initrd="y"' >> /etc/dkms/framework.conf
    # echo 'no_depmod="y"' >> /etc/dkms/framework.conf
    
  • Disable weak-modules:

    # chmod a-x /sbin/weak-modules
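
    To revert this change later, restore the execute permission on the script:

    # chmod a+x /sbin/weak-modules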
    

VRRP is unable to work on VMware virtual machines

Symptoms
  • VRRP reports the master state on all members, but no member receives packets intended for the VRRP virtual IP.

Cause
The VMware VSwitch drops frames sent to MAC addresses that do not appear in the network card properties of the VM.
  • VRRP gives the ability to define a virtual IP that can move between machines. By design, the virtual IP is associated with a virtual MAC address, different from the real network card’s MAC address. Using a virtual MAC address instead of a real one makes the switchover quicker, as no ARP table update is needed. However, a virtual MAC address is not supported by the VMware VSwitch. Unlike physical switches, the VSwitch has no MAC learning capability: the network section of the VM properties defines virtual network cards and one MAC address per card, and the VSwitch only knows those addresses when deciding on which port to send a frame. Frames to any unknown address, including VRRP virtual MAC addresses, are dropped.

  • VRRP uses multicast packets to exchange VRRP protocol messages between its members. These packets use multicast MAC addresses, which are also unknown to the VSwitch.

Hints
One of the following solutions should be applied:
  • Enable promiscuous mode on all VSwitches associated with the VLAN you want VRRP to run on. Warning: this solution may impact performance on the VMware hypervisor, since the VM will receive all traffic within the VSwitch VLAN. Refer to https://kb.vmware.com/s/article/1002934 for more information.

  • Set up the VRRP instance to disable the usage of a virtual MAC address ("vmac") and to use manual unicast peers to exchange VRRP protocol data units instead of multicast (a configuration sketch is given below). This solution is only applicable to VMware and should not be applied in any other context without an explicit request from support. In this mode, the virtual IP address is associated with the real NIC MAC address of the active member. Upon member switchover, a gratuitous ARP is sent so that other machines update their ARP tables with the new MAC address. You must ensure and test that gratuitous ARPs are handled correctly by all machines; otherwise, some machines will lose connectivity until their ARP cache entries expire.
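
  • As a sketch only, assuming a keepalived-based VRRP daemon (the Virtual Service Router VRRP configuration syntax may differ; interface names, addresses and IDs are illustrative), such an instance, with no virtual MAC and with a unicast peer, could look like:

    vrrp_instance VI_1 {
        state BACKUP
        interface eth0
        virtual_router_id 51
        priority 100
        # no use_vmac: the virtual IP stays on the real NIC MAC address
        unicast_src_ip 192.0.2.1      # this member's address
        unicast_peer {
            192.0.2.2                 # the other member's address
        }
        virtual_ipaddress {
            192.0.2.10/24
        }
    }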

Conflict between i40e FW-LLDP and software LLDP agents

Symptoms
  • LLDPDUs are not received even though the source sends them correctly and the link between both machines works.

Cause
LLDPDUs may be consumed by the LLDP engine integrated in the network card firmware:
  • Some Intel network adapters (like the Ethernet 700 series) have a built-in hardware LLDP engine, which the i40e driver enables by default. The LLDP engine receives and consumes LLDP frames, and replies to the LLDP frames it receives; it does not forward them to the network stack of the operating system.
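
  • One way to confirm that LLDP frames never reach the Linux network stack is to capture on the receiving interface (this assumes tcpdump is available and that the port is currently handled by the Linux driver; 0x88cc is the LLDP EtherType):

    # tcpdump -e -i <ethX> ether proto 0x88cc

    If the peer keeps sending LLDPDUs, the link is up and nothing is captured here, the firmware engine is likely consuming the frames.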

Hints
The firmware LLDP agent must be disabled on i40e ports:
  • For fast path ports:

    # fp-cli dpdk-i40e-debug-lldp-cmd on|off
    
  • For Linux ports: To disable the FW-LLDP:

    # ethtool --set-priv-flags <ethX> disable-fw-lldp on
    

    To check the FW-LLDP setting:

    # ethtool --show-priv-flags <ethX>
    

    See also

    The FW-LLDP (Firmware Link Layer Discovery Protocol) chapter of the i40e Linux Base Driver documentation for Intel controllers.