VXLAN

Virtual eXtensible Local Area Network (VXLAN) is an overlay technology that encapsulates Ethernet frames in UDP over IP. It makes it possible to build Layer 2 VPNs by interconnecting multiple Ethernet segments across a Layer 3 IP network. While various encapsulation methods have been used historically for this purpose, VXLAN has become the de facto industry standard (see RFC 7348). VXLAN is specifically designed for Ethernet frame transport and avoids the complexity of MPLS-based solutions. Because it is carried over UDP, VXLAN traffic is forwarded by existing IP infrastructure like any other IP traffic.

Ethernet frames are encapsulated within a VXLAN-over-UDP header before being forwarded across the IP network. The VXLAN header carries a VXLAN Network Identifier (VNI) that uniquely identifies the Layer 2 segment, in the same way a VLAN tag distinguishes segments in an IEEE 802.1Q frame header. With 24 bits, the VNI field supports over 16 million distinct segments, far more than the 12-bit VLAN identifier, which offers approximately 4,000 unique values.
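
To make the header layout concrete, here is a minimal Python sketch of the 8-byte VXLAN header described in RFC 7348. The function names pack_vxlan_header and unpack_vni are illustrative and are not part of any product API; the sketch covers only the VXLAN header itself, not the outer Ethernet, IP, and UDP headers.

```python
import struct

VXLAN_UDP_PORT = 4789         # IANA-assigned UDP destination port (RFC 7348)
VXLAN_FLAG_VNI_VALID = 0x08   # "I" flag: the VNI field is valid
MAX_VNI = (1 << 24) - 1       # largest value that fits in the 24-bit VNI field


def pack_vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header for the given VNI (illustrative helper)."""
    if not 0 <= vni <= MAX_VNI:
        raise ValueError(f"VNI must fit in 24 bits, got {vni}")
    # First 32-bit word: flags (8 bits) followed by 24 reserved bits.
    # Second 32-bit word: VNI (24 bits) followed by 8 reserved bits.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def unpack_vni(header: bytes) -> int:
    """Return the VNI carried in an 8-byte VXLAN header."""
    flags_word, vni_word = struct.unpack("!II", header[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI flag is not set")
    return vni_word >> 8


header = pack_vxlan_header(100)
print(header.hex())         # 0800000000006400
print(unpack_vni(header))   # 100
print(1 << 24, 1 << 12)     # 16777216 4096 (VNI space vs. VLAN ID space)
```

The last line shows where the "over 16 million" and "approximately 4,000" figures come from: 2^24 versus 2^12 possible identifier values.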

VXLAN Tunnel Endpoints (VTEPs) are the entities responsible for originating and terminating VXLAN tunnels. They perform the encapsulation of Ethernet frames into VXLAN packets and the decapsulation of incoming VXLAN packets back into Ethernet frames. VTEPs support both point-to-point and point-to-multipoint tunneling modes. In point-to-point mode, VXLAN traffic is sent via unicast to a specific VTEP using its IP address. In point-to-multipoint mode, traffic is transmitted to multiple VTEPs using a multicast group address.
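
The distinction between the two modes comes down to the kind of destination address used for the encapsulated traffic. The following sketch, which uses only the Python standard library, classifies an address accordingly. The function name tunnel_mode is hypothetical, and the two addresses are the ones used in the configuration examples on this page.

```python
import ipaddress


def tunnel_mode(destination: str) -> str:
    """Return the tunneling mode implied by a VXLAN destination address."""
    addr = ipaddress.ip_address(destination)
    # A multicast group reaches every VTEP that joined it;
    # a unicast address reaches exactly one remote VTEP.
    return "point-to-multipoint" if addr.is_multicast else "point-to-point"


print(tunnel_mode("10.125.0.2"))  # point-to-point
print(tunnel_mode("239.0.0.8"))   # point-to-multipoint
```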

To optimize forwarding and avoid flooding all traffic, VTEPs maintain a MAC forwarding table, similar to traditional Ethernet switches. When a VXLAN packet is received, the source MAC address of the inner Ethernet frame and the originating VTEP’s unicast IP address are recorded, creating a mapping entry. This allows future frames destined to that MAC address to be sent directly to the corresponding VTEP.

A major drawback common to both tunneling modes is that broadcast, unknown-unicast, and multicast frames (BUM traffic) are encapsulated and flooded as-is. As with traditional Layer 2 switches, this behavior can lead to Layer 2 loops, where flooded frames are duplicated endlessly. Such loops can cause broadcast storms that consume all available bandwidth and can lead to complete network failure.
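
As an illustration of what counts as BUM traffic, the sketch below classifies a destination MAC address. It is a simplified model, not the product's forwarding logic: the function name is_bum and the known_macs set (standing in for the MAC forwarding table) are assumptions made for this example.

```python
def is_bum(dst_mac: str, known_macs: set[str]) -> bool:
    """Return True if a frame to dst_mac would be flooded as BUM traffic."""
    first_octet = int(dst_mac.split(":")[0], 16)
    # The least significant bit of the first octet is the group bit:
    # it is set for both broadcast and multicast destinations.
    if first_octet & 0x01:
        return True
    # Unicast destination: flooded only if the MAC address is unknown.
    return dst_mac.lower() not in known_macs


known = {"de:ed:01:dc:0e:62"}
print(is_bum("ff:ff:ff:ff:ff:ff", known))  # True  (broadcast)
print(is_bum("01:00:5e:00:00:08", known))  # True  (multicast)
print(is_bum("3a:ef:f5:a2:3b:fa", known))  # True  (unknown unicast)
print(is_bum("de:ed:01:dc:0e:62", known))  # False (known unicast)
```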

See also

Using VXLAN with the BGP EVPN control plane greatly reduces the need for BUM flooding, which mitigates the risk of Layer 2 loops.

Configuration of a basic VXLAN tunnel

Hosts h1 and h2 belong to the same IP subnet and would typically be reachable through a shared Layer 2 broadcast domain. However, in this scenario, they are not physically connected to the same Layer 2 segment. To bridge this gap, vtep1 and vtep2 Virtual Service Routers will be configured to interconnect the hosts using VXLAN tunneling.

[Figure: network diagram of hosts h1 and h2 interconnected through vtep1 and vtep2]

vtep1 and vtep2 VTEPs are interconnected over the 10.125.0.0/24 subnet, which serves as the Layer 3 underlay network used for VXLAN encapsulated traffic between them.

vtep1 running config# / vrf main interface physical eth-vtep2 ipv4 address 10.125.0.1/24
vtep1 running config#! / vrf main interface physical eth-vtep2 port pci-b0s5
vtep2 running config# / vrf main interface physical eth-vtep1 ipv4 address 10.125.0.2/24
vtep2 running config#! / vrf main interface physical eth-vtep1 port pci-b0s5

The vxlan100 VXLAN interface is configured on both VTEPs, using a common VNI value of 100 to identify the same virtual Layer 2 segment. Encapsulated traffic is transmitted to the remote VTEP’s IP address over the Layer 3 underlay links eth-vtep1 and eth-vtep2, which provide connectivity between the VTEPs.

vtep1 running config# / vrf main interface vxlan vxlan100
vtep1 running vxlan vxlan100#! vni 100
vtep1 running vxlan vxlan100# link-interface eth-vtep2
vtep1 running vxlan vxlan100# remote 10.125.0.2
vtep2 running config# / vrf main interface vxlan vxlan100
vtep2 running vxlan vxlan100#! vni 100
vtep2 running vxlan vxlan100# link-interface eth-vtep1
vtep2 running vxlan vxlan100# remote 10.125.0.1

The interfaces facing the hosts are connected to the VXLAN interfaces through a bridge interface named br100. This bridge allows local Ethernet interfaces and the VXLAN tunnel to operate within the same Layer 2 broadcast domain.

vtep1 running config# / vrf main interface physical eth-h1 port pci-b0s4
vtep1 running config# / vrf main interface bridge br100
vtep1 running bridge br100# link-interface eth-h1
vtep1 running bridge br100# link-interface vxlan100
vtep2 running config# / vrf main interface physical eth-h2 port pci-b0s4
vtep2 running config# / vrf main interface bridge br100
vtep2 running bridge br100# link-interface eth-h2
vtep2 running bridge br100# link-interface vxlan100

h1 is now able to successfully ping h2.

root@h1:~# ping -c1 192.168.0.2
PING 192.168.0.2 (192.168.0.2) 56(84) bytes of data.
64 bytes from 192.168.0.2: icmp_seq=1 ttl=64 time=3.83 ms

--- 192.168.0.2 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 3.829/3.829/3.829/0.000 ms

The VXLAN tunnel operates in point-to-point mode because the remote IP address is unicast. To enable point-to-multipoint mode, a multicast group address can be used instead. This allows additional VTEPs to participate in the same Layer 2 segment. In this mode, all participating VTEPs must be configured with the same multicast group to ensure proper frame distribution across the overlay. The unicast address is removed because a VXLAN interface cannot operate in both point-to-point and point-to-multipoint modes simultaneously.

vtep1 running config# / vrf main interface vxlan vxlan100
vtep1 running vxlan vxlan100# del remote
vtep1 running vxlan vxlan100# group 239.0.0.8
vtep2 running config# / vrf main interface vxlan vxlan100
vtep2 running vxlan vxlan100# del remote
vtep2 running vxlan vxlan100# group 239.0.0.8

h1 is still able to successfully ping h2.

root@h1:~# ping -c5 192.168.0.2
PING 192.168.0.2 (192.168.0.2) 56(84) bytes of data.
64 bytes from 192.168.0.2: icmp_seq=1 ttl=64 time=1.96 ms
64 bytes from 192.168.0.2: icmp_seq=2 ttl=64 time=1.86 ms
64 bytes from 192.168.0.2: icmp_seq=3 ttl=64 time=1.86 ms
64 bytes from 192.168.0.2: icmp_seq=4 ttl=64 time=0.981 ms
64 bytes from 192.168.0.2: icmp_seq=5 ttl=64 time=1.26 ms

--- 192.168.0.2 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4007ms
rtt min/avg/max/mdev = 0.981/1.586/1.962/0.391 ms

Warning

All VXLAN interfaces must be configured in the same L3VRF. Spreading VXLAN interfaces across different L3VRF contexts is not supported by Virtual Service Router and will lead to unpredictable behavior.

See also

The command reference for details.

Forwarding Database

By default, a VTEP maintains a VXLAN forwarding database (FDB), which serves a similar purpose to the FDB in a traditional Ethernet switch. When a frame is destined for a MAC address not present in the FDB, it is encapsulated and flooded either to all VTEPs via the configured group multicast address or to a specific VTEP defined by the remote unicast address. When a VTEP receives an encapsulated VXLAN packet and the VNI matches a local VXLAN interface, an entry is added to the FDB. This entry maps the source MAC address of the inner Ethernet frame to the source IP address of the VXLAN packet (that is, the IP address of the remote VTEP). Subsequent frames destined for this MAC address are then forwarded directly to the corresponding VTEP, eliminating the need for flooding.
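
The learning and lookup behavior described above can be modeled in a few lines of Python. This is a simplified sketch, not the Virtual Service Router implementation: the class name VxlanFdb, its methods, and the 300-second ageing value are all assumptions chosen for illustration (this page only states that learned entries time out after several minutes).

```python
import time


class VxlanFdb:
    """Toy model of a VXLAN forwarding database for one VXLAN interface."""

    def __init__(self, flood_destination, ageing_seconds=300.0, clock=time.monotonic):
        # Address used when the destination MAC is unknown: either the
        # multicast group or the single remote unicast address.
        self.flood_destination = flood_destination
        self.ageing_seconds = ageing_seconds  # hypothetical value
        self._clock = clock
        self._entries = {}  # MAC address -> (remote VTEP IP, last seen time)

    def learn(self, inner_src_mac, outer_src_ip):
        """Record which VTEP a MAC address was seen behind."""
        self._entries[inner_src_mac.lower()] = (outer_src_ip, self._clock())

    def lookup(self, dst_mac):
        """Return the IP address to which a frame for dst_mac is sent."""
        key = dst_mac.lower()
        entry = self._entries.get(key)
        if entry is not None:
            vtep_ip, last_seen = entry
            if self._clock() - last_seen < self.ageing_seconds:
                return vtep_ip
            del self._entries[key]  # entry has aged out
        return self.flood_destination


fdb = VxlanFdb(flood_destination="239.0.0.8")
print(fdb.lookup("de:ed:01:dc:0e:62"))   # 239.0.0.8 (unknown, so flooded)
fdb.learn("de:ed:01:dc:0e:62", "10.125.0.2")
print(fdb.lookup("de:ed:01:dc:0e:62"))   # 10.125.0.2 (learned, sent directly)
```

The clock parameter exists only so that ageing can be exercised without waiting; a real data plane would refresh and expire entries on its own timers.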

Traffic from host h2 (with MAC address de:ed:01:dc:0e:62) has resulted in the creation of an FDB entry associated with the vxlan100 interface. This entry specifies that subsequent frames destined for this MAC address should be forwarded exclusively to vtep2 using the IP address 10.125.0.2.

vtep1> show vxlan fdb name vxlan100
neighbor   interface link-layer-address link-interface port vni state
========   ========= ================== ============== ==== === =====
239.0.0.8  vxlan100  00:00:00:00:00:00  eth-vtep2               permanent
10.125.0.2 vxlan100  ba:8f:73:5d:88:13                          reachable
10.125.0.2 vxlan100  3a:ef:f5:a2:3b:fa                          reachable
10.125.0.2 vxlan100  de:ed:01:dc:0e:62                          reachable

See also

The command reference for details.

A learned entry has a timeout of several minutes. If no traffic is received from the associated MAC address during that period, the entry is automatically removed. You can also manually flush entries using the flush vxlan fdb command, which supports flushing multiple entries at once:

vtep1> flush vxlan fdb name vxlan100 link-layer-address de:ed:01:dc:0e:62 neighbor 10.125.0.2
OK.
vtep1> show vxlan fdb name vxlan100
neighbor   interface link-layer-address link-interface port vni state
========   ========= ================== ============== ==== === =====
239.0.0.8  vxlan100  00:00:00:00:00:00  eth-vtep2               permanent
10.125.0.2 vxlan100  ba:8f:73:5d:88:13                          reachable
10.125.0.2 vxlan100  3a:ef:f5:a2:3b:fa                          reachable

The FDB learning capability is now disabled on the vxlan100 interface, and a static permanent entry is manually configured to associate the MAC address of host h2 with the IPv4 address of vtep2.

vtep1 running config# / vrf main interface vxlan vxlan100
vtep1 running vxlan vxlan100# learning false
vtep1 running vxlan vxlan100# ipv4 fdb link-layer-address de:ed:01:dc:0e:62 ip 10.125.0.2
vtep1> show vxlan fdb name vxlan100
neighbor   interface link-layer-address link-interface port vni state
========   ========= ================== ============== ==== === =====
239.0.0.8  vxlan100  00:00:00:00:00:00  eth-vtep2               permanent
10.125.0.2 vxlan100  de:ed:01:dc:0e:62                          permanent

A permanent FDB entry cannot be removed using the flush vxlan fdb command. To delete such an entry, you must remove it explicitly from the configuration.

vtep1 running config# / vrf main interface vxlan vxlan100
vtep1 running vxlan vxlan100# learning true
vtep1 running vxlan vxlan100# del ipv4 fdb link-layer-address de:ed:01:dc:0e:62 ip 10.125.0.2

See also

The command reference for details.

Connecting a switch to a VTEP

Hosts h1 and h2 are now connected through switches sw1 and sw2, with their ports assigned to VLAN 100. This VLAN is propagated to the VTEPs over 802.1Q tagged links, and traffic tagged with VLAN 100 is mapped to VXLAN VNI 100.

[Figure: network diagram of hosts h1 and h2 connected through switches sw1 and sw2 to vtep1 and vtep2]

vtep1 running config# del / vrf main interface bridge br100 link-interface eth-h1
vtep1 running config# del / vrf main interface physical eth-h1
vtep1 running config# / vrf main interface physical eth-sw1 port pci-b0s4
vtep1 running config# / vrf main interface vlan eth-sw1.100 vlan-id 100
vtep1 running config#! / vrf main interface vlan eth-sw1.100 link-interface eth-sw1
vtep1 running config# / vrf main interface bridge br100 link-interface eth-sw1.100
vtep2 running config# del / vrf main interface bridge br100 link-interface eth-h2
vtep2 running config# del / vrf main interface physical eth-h2
vtep2 running config# / vrf main interface physical eth-sw2 port pci-b0s4
vtep2 running config# / vrf main interface vlan eth-sw2.100 vlan-id 100
vtep2 running config#! / vrf main interface vlan eth-sw2.100 link-interface eth-sw2
vtep2 running config# / vrf main interface bridge br100 link-interface eth-sw2.100
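
In the configuration above, the VLAN-to-VNI association is obtained by bridging the eth-sw1.100 (or eth-sw2.100) VLAN interface with vxlan100. The Python sketch below only illustrates the underlying idea: reading the 12-bit VLAN ID from an 802.1Q tag and looking up the matching VNI. The VLAN_TO_VNI table and the function names are hypothetical and do not correspond to any product API.

```python
import struct

TPID_8021Q = 0x8100        # EtherType value that announces an 802.1Q tag
VLAN_TO_VNI = {100: 100}   # mirrors this page: VLAN 100 is bridged with VNI 100


def vlan_id(frame: bytes):
    """Return the 802.1Q VLAN ID of an Ethernet frame, or None if untagged."""
    if len(frame) < 18:
        return None
    # Bytes 12-13 hold the TPID, bytes 14-15 the tag control information.
    tpid, tci = struct.unpack("!HH", frame[12:16])
    if tpid != TPID_8021Q:
        return None
    return tci & 0x0FFF    # the VLAN ID is the low 12 bits of the TCI


def vni_for_frame(frame: bytes):
    """Return the VNI associated with the frame's VLAN, or None if unmapped."""
    return VLAN_TO_VNI.get(vlan_id(frame))


# Broadcast frame from de:ed:01:dc:0e:62, tagged with VLAN 100, EtherType IPv4.
tagged = bytes.fromhex("ffffffffffff" "deed01dc0e62" "8100" "0064" "0800")
print(vlan_id(tagged))         # 100
print(vni_for_frame(tagged))   # 100
```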

h1 is still able to successfully ping h2.

root@h1:~# ping -c1 192.168.0.2
PING 192.168.0.2 (192.168.0.2) 56(84) bytes of data.
64 bytes from 192.168.0.2: icmp_seq=1 ttl=64 time=3.62 ms

--- 192.168.0.2 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 3.618/3.618/3.618/0.000 ms

Full configuration

The following configuration examples show the complete configuration used throughout this page:

vtep1> show config fullpath nodefault / vrf main
/ vrf main interface physical eth-sw1 port pci-b0s4
/ vrf main interface physical eth-vtep2 ipv4 address 10.125.0.1/24
/ vrf main interface physical eth-vtep2 port pci-b0s5
/ vrf main interface bridge br100 link-interface eth-sw1.100
/ vrf main interface bridge br100 link-interface vxlan100
/ vrf main interface vlan eth-sw1.100 vlan-id 100
/ vrf main interface vlan eth-sw1.100 link-interface eth-sw1
/ vrf main interface vxlan vxlan100 vni 100
/ vrf main interface vxlan vxlan100 group 239.0.0.8
/ vrf main interface vxlan vxlan100 link-interface eth-vtep2
vtep2> show config fullpath nodefault / vrf main
/ vrf main interface physical eth-sw2 port pci-b0s4
/ vrf main interface physical eth-vtep1 ipv4 address 10.125.0.2/24
/ vrf main interface physical eth-vtep1 port pci-b0s5
/ vrf main interface bridge br100 link-interface eth-sw2.100
/ vrf main interface bridge br100 link-interface vxlan100
/ vrf main interface vlan eth-sw2.100 vlan-id 100
/ vrf main interface vlan eth-sw2.100 link-interface eth-sw2
/ vrf main interface vxlan vxlan100 vni 100
/ vrf main interface vxlan vxlan100 group 239.0.0.8
/ vrf main interface vxlan vxlan100 link-interface eth-vtep1