3.2.2. Networking Issues

Ports synchronization problems

Symptoms
  • No ports are displayed when calling fp-cli iface.

Hints
  • Check the status of the fast path and its daemons, using fast-path.sh status. Refer to the Fast Path Baseline documentation for further details on fast-path.sh commands. You should have at least fpmd running, and fp-rte on Intel and Arm.

  • Check which interfaces are known to the cmgrd, using daemonctl cmgrd show interfaces. Refer to the Linux - Fast Path Synchronization documentation for further details on the cache manager.

  • If you are dealing with a physical NIC, check that it is detected by Linux, using lspci. See the lspci section for details.

  • Check the output of fast-path.sh config --display and make sure your NIC is among the selected Ethernet cards.

    Refer to the Fast Path Baseline for further details on the wizard (fast-path.sh config).

  • Check the output from fast-path.sh config --dump --full, optionally with the --long option. Refer to the Fast Path Baseline for further details on the wizard (fast-path.sh config).

  • Check the status of the Linux-FP sync daemons, using linux-fp-sync.sh status. Refer to the Linux - Fast Path Synchronization documentation for further details on linux-fp-sync.sh commands. You should have at least cmgrd running.
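The checks above can be chained in a quick wrapper. This is an illustrative sketch only: the command names come from this section, but the wrapper itself has not been validated on a live system and must be run as root on the target.

```shell
#!/bin/sh
# Illustrative wrapper around the port-synchronization checks above.
# Each command name is as documented in this section.
run_checks() {
    echo "== fast path daemons (expect at least fpmd) =="
    fast-path.sh status
    echo "== interfaces known to cmgrd =="
    daemonctl cmgrd show interfaces
    echo "== selected NICs =="
    fast-path.sh config --display
    echo "== Linux-FP sync daemons (expect at least cmgrd) =="
    linux-fp-sync.sh status
}
run_checks
```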

No packets are forwarded

Symptoms
  • No packets are forwarded.

  • fp-cli stats non-zero shows no (or low) IpForwDatagrams stats.

  • fp-cli dpdk-port-stats <port> shows no (or low) rx/tx packets stats.

  • ip -s link show <interface> shows no (or low) rx/tx packets stats.

  • kill -USR1 $(pidof fp-rte) (Intel and Arm only) shows no (or low) rx/tx packets stats.

Hints
  • Check whether configurations between Linux and the fast path are consistent:

    • Check the status of the Linux-FP sync daemons, using linux-fp-sync.sh status. Refer to the Linux - Fast Path Synchronization documentation for further details on linux-fp-sync.sh commands.

    • Check IP addresses and routes configured in the kernel, using ip address show and ip route show. Check whether the interfaces and bridges are up and running using ip link show and brctl show <bridge_name>.

  • Check IP addresses and routes known to the fast path, using fp-cli route4 type all.

    Refer to the Fast Path Baseline documentation for further details on fp-cli commands.

  • If you are using bridges, check whether your bridges have correct states, using fp-cli bridge.

    Refer to the Fast Path Baseline documentation for further details on fp-cli commands.

  • Check that the fp_dropped fast path statistics are not too high, using fp-cli stats percore non-zero. A high fp_dropped stat suggests that packets are somehow not acceptable to the fast path. Ideally, forwarding stats are evenly spread across cores, that is, each core forwards roughly as many packets as the others. See the Fast Path statistics section for an example of stats.

    Refer to the Fast Path Baseline documentation for further details on fp-cli commands.

  • Check that the fast path exception statistics are not too high. Basic exceptions indicate how many packets could not be processed by the fast path and have thus been injected into the Linux stack for slow path processing. A high value is a good indicator that IP addresses, routes or tunnels in the fast path are misconfigured. See the Fast Path statistics section for an example of stats.

    Refer to the Fast Path Baseline documentation for further details on fp-cli commands.

  • Check whether forwarding works correctly when the fast path is turned off. See the Turn Fast Path off section for details.
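Several of the checks above lend themselves to scripting. The sketch below works on sample data; the fp-cli output layouts shown are assumptions, not the documented formats, so adapt the parsing to the real output on your system.

```shell
#!/bin/sh
# Sketch of the consistency checks above, driven by sample data.
# On a live system, replace the samples with real output:
#   kernel_routes=$(ip route show)
#   fp_routes=$(fp-cli route4 type all)      # layout below is an assumption
#   stats=$(fp-cli stats percore non-zero)   # layout below is an assumption

kernel_routes="default via 192.168.0.1 dev eth1
10.0.0.0/24 dev eth1 proto kernel scope link src 10.0.0.2
10.1.0.0/24 dev eth2 proto kernel scope link src 10.1.0.2"
fp_routes="0.0.0.0/0 via 192.168.0.1
10.0.0.0/24 dev eth1"
stats="core 1: IpForwDatagrams:102400
core 2: IpForwDatagrams:98304
core 3: IpForwDatagrams:101376"

# Print kernel routes missing from the fast path table
missing_routes() {
    echo "$kernel_routes" | awk '{print $1}' | sed 's|^default$|0.0.0.0/0|' |
    while read -r prefix; do
        echo "$fp_routes" | grep -qF "$prefix" || echo "$prefix"
    done
}

# Min/max IpForwDatagrams across cores: a large gap means forwarding
# is not evenly spread
spread() {
    echo "$stats" | awk -F: '/IpForwDatagrams/ { v = $NF + 0
        if (min == "" || v < min) min = v; if (v > max) max = v }
        END { printf "min=%d max=%d\n", min, max }'
}

missing_routes   # -> 10.1.0.0/24
spread           # -> min=98304 max=102400
```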

Netfilter synchronization problems

Symptoms
  • Packets are not filtered according to your iptables rules.

Hints
  • Check whether filtering rules between Linux and the fast path are consistent:

    • Check filtering rules in the kernel, using ip[6]tables -S. Refer to the ip[6]tables manpage for details on this command.

    • Check filtering rules known to the fast path, using fp-cli nf[4|6]-table <filter|mangle|nat> [all|nonzero]. Check also whether the filtering module is enabled, using fp-cli nf[4|6]. Some targets and rules are not supported in the fast path: check that you are using only documented supported options.

      Refer to the Fast Path Baseline documentation for details on the filtering module.
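As a quick cross-check, you can count the rules on each side. The sketch below counts active kernel rules from sample iptables -S output; the fast path side would be parsed from fp-cli nf4-table filter all, whose exact layout is not reproduced here.

```shell
#!/bin/sh
# Sketch: count active filter rules in the kernel from "iptables -S" output.
# Sample data stands in for: kernel_rules=$(iptables -S)
kernel_rules="-P INPUT ACCEPT
-A INPUT -p tcp --dport 22 -j ACCEPT
-A INPUT -j DROP"

count_rules() {
    # "-A" lines are appended rules; "-P" lines are chain policies
    echo "$kernel_rules" | grep -c '^-A'
}
count_rules   # -> 2
```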

Connectivity problems

Symptoms
  • I can no longer connect (via the network) to my machine.

  • The VM was configured to redirect connections to the guest (using something like -netdev user,id=user.0,hostfwd=tcp::35047-:22).

Hints
  • When Virtual Accelerator starts, the NIC kernel drivers are unloaded, and thus all IP configuration on those interfaces is lost.

Network configuration lost after restart

Symptoms
  • My network configuration no longer works after reboot or Virtual Accelerator restart. For example:

    1. My OVS bridge complains “No such device” for a port. The port name changed from what was initially configured at boot, and the bridge now shows a wrong device:

      # ovs-vsctl show
      d4bf3dfd-1f25-4316-a01b-0bedb17470ab
      Bridge "br0"
          Port "br0"
              Interface "br0"
                  type: internal
          Port "eth1"
              Interface "eth1"
          Port "tap0"
              Interface "tap0"
                  error: "could not open network device tap0 (No such device)"
      ovs_version: "2.4.0-4312f7"
      

      Note

      Last known OVS configuration is automatically applied at startup.

    2. My linux bridge is empty after stopping (or restarting) the fast path:

      # brctl show
      bridge name     bridge id               STP enabled     interfaces
      
Hints
  • The fast path may replace, change, delete and create netdevices. Any tool (brctl, iproute2, etc.) that uses “old” references to netdevices must have its configuration refreshed when the fast path is stopped.

    • For OVS, it may be necessary to manually delete and re-add the offending port. For example:

      # ovs-vsctl del-port br0 tap0
      # ovs-vsctl add-port br0 tap0
      # ovs-vsctl show
      d4bf3dfd-1f25-4316-a01b-0bedb17470ab
      Bridge "br0"
          Port "br0"
              Interface "br0"
                  type: internal
          Port "tap0"
              Interface "tap0"
          Port "eth1"
              Interface "eth1"
      ovs_version: "2.4.0-4312f7"
      
    • For a Linux bridge, recreate the bridge and re-add the ports if need be. For example:

      # brctl addbr br0
      # brctl addif br0 eth1
      # brctl addif br0 tap0
      
  • To ensure your network configuration will get restored at reboot, consider using /etc/network/interfaces with appropriate options.

    See also

    • Refer to Debian’s documentation on network configuration for persistent configuration.

    • Refer to Open vSwitch’s documentation for details on persistent OVS ports and bridge configuration.
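As an illustration, a minimal /etc/network/interfaces sketch for a persistent Linux bridge might look like the following. The interface names and addresses are placeholders; adapt them to your setup, and note that the bridge_* options require the bridge-utils ifupdown integration.

```
# /etc/network/interfaces (hypothetical example)
auto br0
iface br0 inet static
    address 10.0.0.2
    netmask 255.255.255.0
    bridge_ports eth1
    bridge_stp off
```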

Hotplug ports lost after reboot

Symptoms
  • My hotplug port (initially created with fp-vdev and a VM instantiation) disappeared after a reboot or fast path restart.

  • My bridge configuration that used it now references a nonexistent netdevice.

Hints
  • Hotplug ports are not persistent. Any tool (ovs-vsctl, brctl, iproute2, etc.) that uses “old” references to netdevices must have its configuration refreshed when the fast path is stopped.

  • Remove any references to your hotplug ports from your existing Open vSwitch / linux bridge configuration after a reboot or a fast path restart.

DKMS takes too long

Symptoms
  • Module recompilation or removal with DKMS takes too long.

Hints
  • Edit the DKMS configuration in /etc/dkms/framework.conf to prevent it from running some long operations:

    # mkdir -p /etc/dkms
    # echo 'no_initrd="y"' >> /etc/dkms/framework.conf
    # echo 'no_depmod="y"' >> /etc/dkms/framework.conf
    
  • Disable weak-modules:

    # chmod a-x /sbin/weak-modules
    

Open vSwitch synchronization problems

Symptoms
  • My ports are detected by ovs-vsctl but not in the fast path:

    # ovs-vsctl show
    51d477cd-0592-4180-91b3-c5704869ae25
        Bridge "br0"
            Port "ntfp1"
                Interface "ntfp1"
            Port "ntfp2"
                Interface "ntfp2"
            Port "br0"
                Interface "br0"
                    type: internal
        ovs_version: "2.4.0-4312f7"
    
    # fpcmd fp-vswitch-ports
    <Empty output>
    
  • My packets are consequently bridged through Linux (slow path), and appear as exceptions in the stats:

    # fpcmd stats non-zero
    [...]
    ==== exception stats:
      LocalBasicExceptions:78
      LocalExceptionClass:
      FPTUN_EXC_SP_FUNC:78
      LocalExceptionType:
      FPTUN_BASIC_EXCEPT:78
    
Hints
  • Each time the fast path is restarted, you must also restart Open vSwitch.

    When in doubt, clean your bridge and re-configure it. Refer to the Fast Path OVS Acceleration documentation for further details on Open vSwitch related configuration.

Open vSwitch and GRE / VXLAN issues

Symptoms
  • Packets are not bridged through my OVS GRE port.

  • Packets are not bridged through my OVS VXLAN port.

  • Statistics show an increasing number of output_failed_unknown_type:

    # fpcmd fp-vswitch-stats non-zero
      flow_not_found:5
      output_failed_unknown_type:145
      set_tunnel_id:145
    
Hints
  • Check whether your fast path is compiled with support for GRE or VXLAN, using:

    # fp-cli conf compiled | grep GRE=y
    CONFIG_MCORE_GRE=y
    ...
    # fp-cli conf compiled | grep VXLAN=y
    CONFIG_MCORE_VXLAN=y
    ...
    

    If the output is empty, your fast path does not support that tunnel type.
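The two checks can be wrapped in a tiny helper; a sketch, assuming the CONFIG_MCORE_* lines appear in the conf dump exactly as shown above:

```shell
#!/bin/sh
# Sketch: report whether the fast path build enables a given tunnel type.
# Sample data stands in for: conf=$(fp-cli conf compiled)
conf="CONFIG_MCORE_GRE=y
CONFIG_MCORE_VXLAN=y"

supports() {  # usage: supports GRE|VXLAN -> prints yes/no
    if echo "$conf" | grep -q "^CONFIG_MCORE_$1=y$"; then
        echo yes
    else
        echo no
    fi
}
supports GRE     # -> yes
supports VXLAN   # -> yes
```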

VRRP is unable to work on VMware virtual machines

Symptoms
  • VRRP reports the master state on all members, but no member receives packets intended for the VRRP virtual IP.

Cause
The VMware VSwitch drops frames sent to MAC addresses that are not listed in the network card properties of the VM.
  • VRRP gives the ability to define a virtual IP that can move between machines. By design, the virtual IP is associated with a virtual MAC address, different from the real network card’s MAC address. Using a virtual MAC address instead of a real one makes the switchover quicker, as no ARP table update is needed. However, virtual MAC addresses are not supported by the VMware VSwitch. Unlike physical switches, the VSwitch has no MAC learning capability. The network section of the VM properties defines virtual network cards and one MAC address per card. The VSwitch only knows those addresses when deciding on which port to send a frame; frames to any unknown address, including virtual VRRP MAC addresses, are dropped.

  • VRRP uses multicast packets to send VRRP protocol messages between its members. Multicast packets use well-known multicast MAC addresses that are likewise unknown to the VSwitch.

Hints
One of the following solutions should be applied:
  • Warning: this solution may impact performance on the VMware hypervisor. Enable promiscuous mode on all VSwitches associated with the VLAN you want VRRP to run on. The VM will then receive all traffic within the VSwitch VLAN. Refer to https://kb.vmware.com/s/article/1002934 for more information.

  • Set up the VRRP instance to disable the use of a virtual MAC address (vmac) and to exchange VRRP protocol data units over manually configured unicast peers instead of multicast. This solution is only applicable to VMware and should not be applied in any other context without an explicit request from support. In this mode, the virtual IP address is associated with the real NIC MAC address of the active member. Upon member switchover, a gratuitous ARP is sent to prompt other machines to update their ARP tables with the new MAC. You must ensure and test that gratuitous ARP packets are handled correctly by all machines; otherwise, some machines will lose connectivity until the ARP cache timeout expires.
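Assuming keepalived as the VRRP implementation (an assumption; your VRRP daemon and its configuration syntax may differ), the second solution could be sketched as follows, with placeholder addresses. Omitting use_vmac keeps the virtual IP on the real NIC MAC address, and unicast_peer replaces multicast:

```
vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 100
    # no "use_vmac" line: the virtual IP stays on the real NIC MAC address
    unicast_src_ip 10.0.0.2      # this member's address (placeholder)
    unicast_peer {
        10.0.0.3                 # other member's address (placeholder)
    }
    virtual_ipaddress {
        10.0.0.10/24
    }
}
```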

Conflict between i40e FW-LLDP and software LLDP agents

Symptoms
  • LLDPDUs are not received even though the source sends them correctly and the link between both machines works.

Cause
LLDPDUs may be consumed by the LLDP engine integrated in the network card firmware:
  • Some Intel network adapters (such as the Ethernet 700 series) have a built-in firmware LLDP engine, which is enabled by default. The LLDP engine receives and consumes LLDP frames and replies to the LLDP frames it receives; it does not forward LLDP frames to the network stack of the operating system. The i40e driver enables this feature by default.

Hints
The firmware LLDP engine must be disabled on i40e ports:
  • For fast path ports:

    # fp-cli dpdk-i40e-debug-lldp-cmd on|off
    
  • For Linux ports, disable the FW-LLDP:

    # ethtool --set-priv-flags <ifname> disable-fw-lldp on
    

    To check the FW-LLDP setting:

    # ethtool --show-priv-flags <ifname>
    

    See also

    The FW-LLDP (Firmware Link Layer Discovery Protocol) chapter of the i40e Linux Base Driver documentation for Intel Ethernet controllers.