Published by Jay Gill

Since VXLAN was introduced in 2014 it has become an important component of modern data center network fabrics. This blog reviews what VXLAN is, why it was developed, how it is being used in data centers, and its advantages over other virtualization technologies. In an upcoming blog, we will look at some innovative VXLAN applications outside the data center.

What is VXLAN?

Virtual eXtensible Local Area Network (VXLAN) is an Internet standard protocol that provides a means of encapsulating Ethernet (Layer 2) frames over an IP (Layer 3) network, a concept often referred to as “tunneling.” This allows devices and applications to communicate across a large physical network as if they were located on the same Ethernet Layer 2 network.

Tunneling approaches such as VXLAN provide an important tool to virtualize the physical network, often called the “underlay,” and allow for connectivity to be defined and managed as a set of virtual connections, called the “overlay.” These virtual connections can be created, modified and removed as needed without any change to the physical underlay network. (Mike Capuano’s blog, What to Know About Data Center Overlay Networks, provides a deeper dive on overlays.)

While VXLAN is only one of many virtual networking or tunneling technologies, it addresses several scaling challenges in data center networks better than alternative technologies, as we will discuss below. Because of these advantages, modern data center architectures for cloud computing now generally combine a “scale-out” IP (L3) leaf-spine underlay based on a robust routing protocol (such as BGP) with a VXLAN-based overlay, as shown in Figure 1.

diagram: Scalable Data Center Fabric Architecture with L3 Underlay and VXLAN Overlay

Figure 1: Scalable Data Center Fabric Architecture with L3 Underlay and VXLAN Overlay

VXLAN Frame Format

Below is a simplified view of the VXLAN frame format.

IP | UDP | VXLAN Header | Encapsulated Ethernet Frame

Figure 2: Simplified VXLAN Frame Format

The VXLAN protocol encapsulates Ethernet frames in a VXLAN header that includes a VXLAN Network Identifier (VNI), a value that distinguishes each VXLAN tunnel (aka “network”). Because the VNI consists of 24 bits, the number of possible VNIs in a network is over 16 million. As we will discuss below, this offers an important scalability advantage over older Virtual Local Area Network (VLAN) technology. VXLAN frames are then encapsulated in UDP (User Datagram Protocol) over IP so they can be routed across a layer 3 network.
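To make the header layout concrete, here is a minimal Python sketch of building and parsing the 8-byte VXLAN header. The field layout (an 8-bit flags field with the "I" bit set, 24 reserved bits, the 24-bit VNI, and 8 more reserved bits) and UDP destination port 4789 follow RFC 7348. The function names are hypothetical, and the outer UDP, IP and Ethernet headers are left out for brevity.

```python
import struct

VXLAN_UDP_PORT = 4789          # IANA-assigned UDP destination port for VXLAN
VXLAN_FLAG_VNI_VALID = 0x08    # "I" flag: the VNI field carries a valid value
MAX_VNI = 2**24 - 1            # the VNI is a 24-bit field


def build_vxlan_header(vni: int) -> bytes:
    """Return the 8-byte VXLAN header for the given VNI."""
    if not 0 <= vni <= MAX_VNI:
        raise ValueError(f"VNI must fit in 24 bits, got {vni}")
    # Word 1: flags in the top byte, followed by 24 reserved (zero) bits.
    # Word 2: VNI in the top 24 bits, followed by 8 reserved (zero) bits.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def parse_vni(header: bytes) -> int:
    """Extract the VNI from the first 8 bytes of a VXLAN payload."""
    flags_word, vni_word = struct.unpack("!II", header[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI-valid flag is not set")
    return vni_word >> 8


def encapsulate(vni: int, inner_frame: bytes) -> bytes:
    """Prepend the VXLAN header to an Ethernet frame (UDP/IP not shown)."""
    return build_vxlan_header(vni) + inner_frame


print(2**24)  # number of possible VNI values: 16777216
```

A real VTEP would then place this payload in a UDP datagram addressed to the remote VTEP's IP address.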

VXLAN Tunnel End Points (VTEPs)

The endpoints of the tunnel, where frames are encapsulated and decapsulated, are known as VXLAN Tunnel End Points (VTEPs). This encapsulation may be done on a server that hosts virtual machines (VMs) or containerized applications, or it may be implemented in a network processor in an Ethernet switch. Both server-based and switch-based VTEPs are represented in Figure 3. In this figure, the VXLAN overlay fabric connects server- and switch-based VTEPs and stretches across two geographically separated data centers connected by a layer 3 routed wide area network (WAN) or other data center interconnect (DCI) transport.

diagram: VXLAN Tunnels and VTEPs in a Multi-site Data Center Fabric

Figure 3: VXLAN Tunnels and VTEPs in a Multi-site Data Center Fabric

In his overlay blog, Mike Capuano discussed the pros and cons of server-based vs. switch-based VTEPs. Server-based VTEPs can support more distributed overlay network services, such as fine-grained microsegmentation for security. However, server-based VTEPs that run in software use server CPU cycles that could otherwise be used by applications. VTEPs implemented in switches are typically hardware accelerated, so they eliminate that performance bottleneck.

Now we are seeing an emerging option that combines the benefits of hardware acceleration with highly distributed services: the data processing unit, or DPU, which is a network interface card (NIC) installed in the server that incorporates powerful data processing silicon to accelerate networking functions including the VXLAN overlay. (DPUs are also called “SmartNICs” but they generally have far greater processing power and functionality than earlier generations of SmartNICs.) We anticipate that many high-performance data center fabrics will incorporate a mix of hardware-accelerated VTEPs in both switches and DPUs, as shown in Figure 3.

VXLAN Goals and Advantages versus Alternatives

VXLAN is not the first or only network virtualization technique, but it offers substantial advantages over many alternatives for data center network virtualization. At a high level, the most important goals for VXLAN are:

When combined with an appropriate control plane technology and network automation framework, VXLAN meets those goals more effectively than many alternative approaches. Let’s review a few of the alternatives.

VXLAN vs. VLANs in Data Center Networks

Data center networks were traditionally built using Ethernet switches without any overlay protocol for virtualization. In this architecture, each switch acts as an Ethernet MAC bridge and implements the spanning tree protocol to avoid loops in the network. In the simplest implementation, all devices and VMs are connected to the same layer 2 broadcast domain. If segmentation or isolation of applications or tenants in these networks is needed, it is provided by Virtual LANs (VLANs), denoted by a 12-bit VLAN ID added to the Ethernet frame header (analogous to the VXLAN virtual network identifier). This extra header is sometimes called a “.1Q tag” alluding to the IEEE 802.1Q standard.
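For readers who want to see what the ".1Q tag" looks like on the wire, the following Python sketch builds the 4-byte tag. The layout (a 16-bit tag protocol identifier of 0x8100, then 3 priority bits, 1 drop-eligible bit and the 12-bit VLAN ID) comes from IEEE 802.1Q. The function name is hypothetical.

```python
import struct

TPID_8021Q = 0x8100  # tag protocol identifier for an IEEE 802.1Q tag


def build_dot1q_tag(vlan_id: int, priority: int = 0, dei: int = 0) -> bytes:
    """Return the 4-byte 802.1Q tag for a VLAN ID."""
    # VLAN IDs 0 and 4095 are reserved, leaving 4094 usable values.
    if not 1 <= vlan_id <= 4094:
        raise ValueError(f"VLAN ID must be 1..4094, got {vlan_id}")
    if not 0 <= priority <= 7:
        raise ValueError("priority is a 3-bit field")
    if dei not in (0, 1):
        raise ValueError("DEI is a single bit")
    # Tag control information: priority (3 bits), DEI (1 bit), VLAN ID (12 bits).
    tci = (priority << 13) | (dei << 12) | vlan_id
    return struct.pack("!HH", TPID_8021Q, tci)


print(build_dot1q_tag(100).hex())  # 81000064
```

Because only 12 bits are available for the VLAN ID, the identifier space is fixed at 2**12 = 4096 values no matter how large the network grows.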

While this type of network works well enough for a single-tenant data center at a small scale, it has many drawbacks for larger scale data centers, especially multi-tenant data centers. With only about 4,000 unique VLAN IDs available (a 12-bit field allows 4,096 values, two of which are reserved), segmentation options are limited. Perhaps more importantly, the spanning tree protocol is poorly suited to scale-out data center fabrics, both because it makes inefficient use of redundant links and because it is far less resilient than layer 3 routing techniques. Large layer 2 networks are also vulnerable to broadcast storms that can take down the entire network.

The VXLAN standard discusses these and other limitations in more detail, and details how using a VXLAN overlay with a layer 3 underlay addresses them.

VXLAN vs. Provider Bridges, VLAN Tag Stacking, Q-in-Q

One approach to avoiding the limit of ~4000 VLAN identifiers is to add a second VLAN tag, an approach referred to as tag stacking or “Q-in-Q” (due to the use of two “.1Q tags”) and covered in the IEEE Provider Bridges standard. The typical service provider use case envisioned for this approach allows the service provider to use the outer tag (S-tag) to provide segmentation or isolation between its customers or tenants while the inner tag (C-tag) is used by the customer, so each customer can use the full range of ~4000 VLANs without concern for what other customers are using.

Destination MAC | Source MAC | Outer 802.1Q VLAN Tag | Inner 802.1Q VLAN Tag | Ethertype + Payload + CRC

Figure 4: Simplified QinQ Frame Format

Using two tags allows for up to 16 million unique combinations (equivalent to VXLAN though somewhat less flexible than using a single 24-bit VNI), so that addresses VLAN scalability issues. It does not, however, address the inefficiency and poor resilience inherent in layer 2 networks.
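The tag stacking described above can be sketched in a few lines of Python. This is an illustration rather than a complete frame builder: the helper names are hypothetical, the priority bits are left at zero, and the outer tag uses the 0x88A8 identifier from the Provider Bridges standard (some deployments use 0x8100 for both tags).

```python
import struct

TPID_S_TAG = 0x88A8  # outer (service) tag identifier, IEEE 802.1ad
TPID_C_TAG = 0x8100  # inner (customer) tag identifier, IEEE 802.1Q


def vlan_tag(tpid: int, vlan_id: int) -> bytes:
    """Return a 4-byte VLAN tag with priority and DEI bits set to zero."""
    if not 1 <= vlan_id <= 4094:
        raise ValueError(f"VLAN ID must be 1..4094, got {vlan_id}")
    return struct.pack("!HH", tpid, vlan_id)


def qinq_header(dst_mac: bytes, src_mac: bytes,
                s_vid: int, c_vid: int, ethertype: int) -> bytes:
    """Return the header of a double-tagged frame, up to the Ethertype."""
    if len(dst_mac) != 6 or len(src_mac) != 6:
        raise ValueError("MAC addresses must be 6 bytes")
    return (dst_mac + src_mac
            + vlan_tag(TPID_S_TAG, s_vid)
            + vlan_tag(TPID_C_TAG, c_vid)
            + struct.pack("!H", ethertype))


# Two 12-bit fields give 2**12 * 2**12 = 2**24 bit patterns, the same
# count as a 24-bit VNI; excluding the reserved IDs leaves 4094 * 4094.
print(4096 * 4096)  # 16777216
print(4094 * 4094)  # 16760836
```

The arithmetic shows why the two approaches are equivalent in raw scale, while the split into provider and customer halves is what makes the stacked form less flexible than one flat 24-bit identifier.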

VXLAN vs. TRILL and Shortest Path Bridging

Subsequent standards known as “Transparent Interconnection of Lots of Links (TRILL)” and “Shortest Path Bridging (SPB)” attempted to address the efficiency and resilience problems of spanning tree by borrowing from layer 3 link-state routing, specifically the widely used IS-IS routing protocol, which does not require an IP network. These are sometimes referred to as “MAC-in-MAC” approaches because a second Ethernet MAC address is added to the frame for forwarding between the TRILL-enabled or SPB-enabled bridges.

Both standards gained attention and they were often compared and contrasted, but neither achieved consensus. They also shared a significant drawback, which was the need for specialized hardware. Some implementations, such as Cisco’s FabricPath, also diverged from the standards, raising concerns about interoperability and vendor “lock in.”

VXLAN vs. MPLS for Data Center Fabrics

MPLS Layer 2 VPNs (L2VPNs) provide layer 2 connections across a layer 3 network, but not just any layer 3 network. The routers in the network must all be IP/MPLS routers. Virtual networks are isolated using MPLS pseudowire encapsulation, and MPLS labels can be stacked, analogous to VLAN tag stacking, to enable a large number of virtual networks.
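As a rough illustration of label stacking, the Python sketch below encodes a stack of MPLS label entries. Each 32-bit entry holds a 20-bit label, a 3-bit traffic class, a bottom-of-stack bit and an 8-bit TTL, as defined in RFC 3032. The function names and the example label values are assumptions for illustration only.

```python
import struct
from typing import Sequence


def label_entry(label: int, bottom: bool, tc: int = 0, ttl: int = 64) -> bytes:
    """Encode one 32-bit MPLS label stack entry."""
    if not 0 <= label < 2**20:
        raise ValueError("label is a 20-bit field")
    if not 0 <= tc <= 7 or not 0 <= ttl <= 255:
        raise ValueError("tc is 3 bits and ttl is 8 bits")
    word = (label << 12) | (tc << 9) | (int(bottom) << 8) | ttl
    return struct.pack("!I", word)


def label_stack(labels: Sequence[int], ttl: int = 64) -> bytes:
    """Encode labels outermost first; only the last entry is marked bottom."""
    if not labels:
        raise ValueError("at least one label is required")
    last = len(labels) - 1
    return b"".join(
        label_entry(label, bottom=(i == last), ttl=ttl)
        for i, label in enumerate(labels)
    )


# Hypothetical example: an outer transport label followed by an inner
# label that identifies the pseudowire (the virtual network).
print(label_stack([16, 1000]).hex())  # 00010040003e8140
```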

IP/MPLS is commonly used in telecom service provider networks, and as a result many service provider L2VPN services are implemented with MPLS. These include point to point L2VPNs, sometimes called pseudowires, and multipoint L2VPNs implemented according to the Virtual Private LAN Service (VPLS) standard. These services often conform to Metro Ethernet Forum (MEF) Carrier Ethernet service definitions for E-Line (point to point) and E-LAN (multipoint), respectively.

Because MPLS and its associated control plane protocols are designed for highly scalable layer 3 service provider networks, some data center operators have used MPLS L2VPNs in their data center networks to overcome the scaling and resilience limitations of layer 2 switched networks, as shown in Figure 5.

diagram: MPLS-based Data Center Fabric

Figure 5: MPLS-based Data Center Fabric

This approach did not become widespread for several reasons.

Given its advantages, VXLAN is overwhelmingly preferred to MPLS in data center networks. (In fact, VXLAN is even proving to be a viable alternative to MPLS to provide Carrier Ethernet services in some service provider networks, a topic that we will explore more in an upcoming blog.)

VXLAN vs. Other Overlay Protocols

VXLAN was not the first attempt to define an overlay protocol capable of extending layer 2 services across pure layer 3 underlays.

Technically, VXLAN, NVGRE and GENEVE all provide very similar capabilities and they can all work with the same control planes, such as SDN or BGP EVPN, but so far VXLAN is far more widely implemented.

Summary of Virtualization Technologies

The table below summarizes many of the key points made above comparing VXLAN to alternative data center network virtualization technologies.

table: Summary of Virtualization Technologies


VXLAN Control Planes and Automation

In principle, VXLAN overlays can be manually configured with static MAC to VTEP IP address mapping, but in practice some type of control plane or automation framework is needed to achieve meaningful network scalability and agility. The VXLAN standard describes a data plane learning approach and also emphasizes that other control plane options are possible.

VXLAN Data Plane Learning with IP Multicast

The approach described in the VXLAN standard extends standard MAC address learning to create MAC to VTEP IP address mapping without fundamentally altering the way learning works. IP Multicast in the underlay is used to transmit layer 2 broadcast, unknown unicast and multicast (BUM) traffic. This approach has some drawbacks relative to other approaches described below. First, it expands layer 2 broadcast and failure domains, rather than isolating them. Second, the use of IP Multicast tightly couples the underlay network to the overlay and increases management complexity, compared to a typical IP (unicast) network.
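The flood-and-learn behavior can be sketched as a small Python model of one VTEP's table for a single VNI. This is a simplification under stated assumptions: the class and method names are hypothetical, addresses are plain strings, and entry aging is ignored.

```python
from dataclasses import dataclass, field
from typing import Dict


def is_group_mac(mac: str) -> bool:
    """True for broadcast and multicast MACs (lowest bit of first octet set)."""
    return bool(int(mac.split(":")[0], 16) & 1)


@dataclass
class VniTable:
    """Learned MAC-to-VTEP mappings for one VNI (hypothetical model)."""
    multicast_group: str                      # underlay group used for BUM traffic
    mac_to_vtep: Dict[str, str] = field(default_factory=dict)

    def learn(self, inner_src_mac: str, outer_src_ip: str) -> None:
        # On decapsulation, remember which VTEP the inner source MAC sits behind.
        self.mac_to_vtep[inner_src_mac.lower()] = outer_src_ip

    def outer_destination(self, inner_dst_mac: str) -> str:
        mac = inner_dst_mac.lower()
        if is_group_mac(mac):
            return self.multicast_group       # broadcast or multicast: flood
        # Known unicast goes to one VTEP; unknown unicast is flooded.
        return self.mac_to_vtep.get(mac, self.multicast_group)


table = VniTable(multicast_group="239.1.1.1")
print(table.outer_destination("00:11:22:33:44:55"))  # 239.1.1.1 (unknown, flooded)
table.learn("00:11:22:33:44:55", "10.0.0.2")
print(table.outer_destination("00:11:22:33:44:55"))  # 10.0.0.2 (learned)
```

The model makes the first drawback visible: every unknown destination is sent to the multicast group, so every VTEP in that VNI receives it.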

BGP EVPN Control Plane for VXLAN

BGP EVPN provides an increasingly popular, standards-based approach to create VXLAN overlay networks meeting several objectives:

BGP EVPN uses the BGP protocol running in each switch to communicate MAC address and other information among network nodes, so we refer to it as a “protocol-based” control plane.
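The idea of a protocol-based control plane can be illustrated with a toy Python model. It is not a BGP implementation: the classes are hypothetical, the advertisement carries only a VNI, a MAC address and the advertising VTEP's IP (loosely modeled on the EVPN MAC/IP advertisement route), and the full-mesh peering is an assumption made to keep the sketch short.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class MacAdvertisement:
    """Simplified stand-in for an EVPN MAC/IP advertisement route."""
    vni: int
    mac: str
    vtep_ip: str


class Vtep:
    """Toy switch that shares locally learned MACs with its peers."""

    def __init__(self, ip: str) -> None:
        self.ip = ip
        self.peers: List["Vtep"] = []
        self.remote_macs: Dict[Tuple[int, str], str] = {}

    def local_mac_learned(self, vni: int, mac: str) -> None:
        # Advertise the MAC to every peer instead of waiting for floods.
        route = MacAdvertisement(vni, mac.lower(), self.ip)
        for peer in self.peers:
            peer.receive(route)

    def receive(self, route: MacAdvertisement) -> None:
        self.remote_macs[(route.vni, route.mac)] = route.vtep_ip

    def lookup(self, vni: int, mac: str) -> Optional[str]:
        return self.remote_macs.get((vni, mac.lower()))


def full_mesh(vteps: List[Vtep]) -> None:
    """Peer every VTEP with every other one (an assumption of this sketch)."""
    for vtep in vteps:
        vtep.peers = [p for p in vteps if p is not vtep]


leaf1, leaf2, leaf3 = Vtep("10.0.0.1"), Vtep("10.0.0.2"), Vtep("10.0.0.3")
full_mesh([leaf1, leaf2, leaf3])
leaf1.local_mac_learned(5000, "AA:BB:CC:00:00:01")
print(leaf2.lookup(5000, "aa:bb:cc:00:00:01"))  # 10.0.0.1
```

In this model the remote switches know where the MAC address lives before any traffic is sent to it, which is the key difference from data plane learning.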

Configuring BGP EVPN services on every switch in the network can be complex, time-consuming and error-prone, so some network operators look to network automation tools, which may include SDN automation, to reduce complexity and improve provisioning speed.

SDN Control Plane and Automation for VXLAN

SDN can be used not just to automate BGP EVPN, but also to provide a protocol-free alternative to it. An SDN control plane can achieve all the same objectives described above for BGP EVPN without requiring BGP EVPN to be configured on every switch. In an SDN-enabled VXLAN overlay, the SDN control plane takes care of MAC address learning and efficient forwarding, while also providing comprehensive end-to-end network automation, which can result in a network that is orders of magnitude simpler to configure and operate.

Figure 6 compares the operational complexity to configure a single service (in this case a Layer 3 virtual routing and forwarding, or VRF instance) across a 128-node VXLAN overlay fabric. Applying SDN automation to BGP EVPN configuration can result in an order of magnitude simplification versus manual configuration, while adopting a full SDN automation approach can simplify the task by roughly three orders of magnitude.

diagram: SDN Automation Benefits for VXLAN Overlay Provisioning

Figure 6: SDN Automation Benefits for VXLAN Overlay Provisioning

To learn more about VXLAN control plane options, see my blog, BGP EVPN for Scaling Data Center Fabrics, which describes how BGP EVPN works, compares the pros and cons of BGP EVPN and SDN control planes, and describes how they can be used together to meet a wide range of fabric scaling and interoperability scenarios.


VXLAN has become the most popular protocol for overlay network virtualization in data center fabrics due to its advantages over a long list of alternatives. When implemented in hardware-based VTEPs in switches and DPUs and combined with a BGP EVPN or SDN control plane and network automation, VXLAN-based overlay networks can provide the scalability, agility, high performance and resilience needed for distributed cloud networking into the foreseeable future.