My recent blog described how VXLAN was developed and how it is being widely used in data centers due to its advantages over other virtualization technologies, including MPLS. This blog extends that MPLS comparison and describes why VXLAN is starting to replace MPLS in metro and wide-area networks as a better approach to deliver Ethernet services.
VXLAN or MPLS?
First, a quick recap of what VXLAN is: Virtual eXtensible Local Area Network (VXLAN) provides a means of encapsulating Ethernet (Layer 2) frames over an IP (Layer 3) network, so devices and applications can communicate across a large physical network as if they were located on the same Ethernet Layer 2 network. In the data center, we call the physical network the underlay and the collection of VXLAN tunnels the overlay. A crucial point about how VXLAN tunnels work is that they only need to be configured at two locations, the VXLAN tunnel endpoints (VTEPs). The underlay between those endpoint devices simply provides standard Layer 3 IP connectivity.
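To make the encapsulation step concrete, here is a minimal Python sketch of what a VTEP adds to and strips from an Ethernet frame. The header layout (an 8-bit flags field with the VNI-valid bit, a 24-bit VXLAN Network Identifier and reserved bits) and the UDP destination port 4789 come from RFC 7348. The function names are illustrative only, and the outer Ethernet, IP and UDP headers that a real VTEP also adds are left out.

```python
import struct

VXLAN_UDP_PORT = 4789         # IANA-assigned UDP destination port (RFC 7348)
VXLAN_FLAG_VNI_VALID = 0x08   # "I" flag: the VNI field carries a valid value
VXLAN_HEADER_LEN = 8


def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header for a 24-bit VNI."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags (8 bits) followed by 24 reserved bits.
    # Word 2: VNI (24 bits) followed by 8 reserved bits.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def encapsulate(inner_frame: bytes, vni: int) -> bytes:
    """Return the UDP payload an ingress VTEP sends: VXLAN header + original frame."""
    return vxlan_header(vni) + inner_frame


def decapsulate(udp_payload: bytes) -> tuple[int, bytes]:
    """Return (vni, inner_frame) as recovered by the egress VTEP."""
    if len(udp_payload) < VXLAN_HEADER_LEN:
        raise ValueError("payload shorter than a VXLAN header")
    word1, word2 = struct.unpack("!II", udp_payload[:VXLAN_HEADER_LEN])
    if not (word1 >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI-valid flag not set")
    return word2 >> 8, udp_payload[VXLAN_HEADER_LEN:]
```

Only the two VTEPs ever run this logic. Every node in between forwards the result as an ordinary IP/UDP packet and never inspects the VNI or the inner frame.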
The previous blog outlined why VXLAN is overwhelmingly preferred to MPLS in data center networks, summarized in three points:
- MPLS-capable routers tend to be more costly than non-MPLS routers, and much more costly than data center-class layer 3 switches with VXLAN support.
- MPLS-based VPN solutions require tight coupling between edge and core devices, so every node in the data center network must be MPLS-capable, unlike VXLAN overlays.
- MPLS expertise is not widespread among data center network engineers.
How do these points translate to the very different context of service provider metro networks? Let’s take them one by one.
MPLS router cost
Some service providers have long been attracted to the idea of building lower-cost metro networks with data center-class switches. Over 20 years ago, the first generation of competitive metro Ethernet service providers, such as Yipes and Telseon, built their networks with gigabit Ethernet switches that were then state-of-the-art in enterprise networks. Historically, such networks struggled to provide the scalability and resilience demanded by larger SPs, as shown in Figure 1.
Figure 1: Traditional Layer 2 Network
As a result, most larger SPs gravitated to MPLS (Figure 2). However, MPLS routers were clearly more costly than commodity Ethernet switches and that cost difference has not gone away in the intervening decades.
Figure 2: IP/MPLS Network
Today’s data center-class switches, coupled with VXLAN overlay architectures, can largely eliminate the drawbacks of pure L2 networks without the cost of MPLS routing, leading to a fresh surge of SP interest. More on this topic below.
Tight coupling between core and edge
Service provider MPLS networks are also characterized by tight coupling and the associated operations complexity, which was problematic in the data center. Many service providers view that complexity as a necessary price to pay in order to have tools such as quality of service (QoS) and traffic engineering (TE) to support their service level agreements (SLAs). But the rise of SD-WAN and other overlay services is making SPs re-examine that point of view. We’ll dive into that trend further as well.
In the data center, MPLS expertise is rare and costly, presenting a high barrier to adoption. But at service providers that have been running MPLS networks for decades, the tables are turned. Familiarity with MPLS, plus a large installed base of MPLS equipment and mature operations systems and processes, is often the biggest obstacle to adoption of any newer approach, such as a VXLAN-based overlay.
Data Center Switching Advances Enable VXLAN-based Metro and Wide-area Networks
Today’s data center switching chips, such as Broadcom’s Trident 3 and Trident 4, incorporate numerous capabilities that make VXLAN-based metro networks feasible. Two key examples:
- Hardware-based VTEPs enable VXLAN encapsulation at line rate
- Expanded table sizes provide the routing and forwarding scale required to create resilient, scalable layer 3 underlay networks and multi-tenant overlay services.
Equally important, newer data center-class switches all have powerful CPUs that can support the advanced control planes critical to efficient scaling of Ethernet services, whether BGP EVPN (a protocol-based approach) or an SDN-based, protocol-free control plane.
As a result, in many metro networking applications there is no longer a need for specialized (i.e., high-cost) routing hardware.
VXLAN Overlay Architecture for Metro and Wide Area Networks
Overlay networking approaches are already widely accepted in applications as diverse as data center fabrics (where VXLAN arose) and enterprise software-defined wide area networks (SD-WANs). One key thing these modern overlays have in common is they are loosely coupled with the underlying network(s). In principle, the underlay may be built from any network technology and use any control plane, as long as the network provides sufficient capacity and resilience. The overlay is only defined at the service endpoints and there is no service provisioning in the underlay network nodes. This approach echoes the end-to-end principle, one of the core architectural concepts of TCP/IP and the Internet. One of SD-WAN’s key benefits is that it can use a variety of networks, including broadband or wireless Internet services that are widely available and cost-effective, with sufficient performance for many users and applications.
When VXLAN overlays are applied to metro and wide area networks, similar benefits apply. Any capable IP transport network can be used and services are provisioned only in the overlay, as shown in Figure 3. As one respected network observer put it, “All we need in a VXLAN world is IP connectivity… and some hope that the core IP network plans to deliver the traffic.”
Figure 3: VXLAN Overlay Architecture
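The loose coupling described above can be sketched in a few lines of Python. This is an illustrative model only: the `ELineService` class, the device names and the `devices_to_configure` helper are hypothetical and do not represent any vendor's provisioning API. It simply shows that, for a point-to-point service, the set of devices needing service-specific configuration is limited to the two VTEPs, however many underlay hops lie between them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ELineService:
    """A point-to-point Ethernet service carried in one VXLAN segment (hypothetical model)."""
    name: str
    vni: int
    vtep_a: str
    vtep_z: str


def devices_to_configure(service: ELineService, path: list[str]) -> list[str]:
    """Return the devices on the forwarding path that need per-service configuration.

    In an overlay model only the tunnel endpoints hold service state;
    transit nodes just route IP packets between the VTEP addresses.
    """
    endpoints = {service.vtep_a, service.vtep_z}
    return [node for node in path if node in endpoints]


# Hypothetical four-hop metro path between two edge switches.
service = ELineService(name="cust-42-eline", vni=10042, vtep_a="edge-1", vtep_z="edge-2")
path = ["edge-1", "core-1", "core-2", "edge-2"]
touched = devices_to_configure(service, path)
```

Here `touched` contains only `edge-1` and `edge-2`. Adding more core hops to `path` does not change the result, which is the property that lets the overlay grow independently of the underlay.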
Of course, the underlay IP network must meet the resilience and performance criteria of the applications that will use the overlay services, so while the Internet may be an adequate underlay for some SD-WAN deployments, for many other services it introduces an unacceptable “achilles heel,” as a recent blog put it.
When building a metro network to deliver services such as an Ethernet Private Line (E-Line), a multipoint Ethernet Local Area Network (E-LAN) or a layer 3 VPN (L3VPN), care must be taken to ensure that the underlay can meet the stringent SLAs that usually apply to such services. Fortunately, building a robust, high performance IP network is possible using cost-effective open networking platforms based on commodity silicon and standard routing protocols available in open-source software such as FRRouting. This is how Pluribus has built underlay networks for data center fabrics for years, and the same approach is working well today in several deployed metro networks.
Overlay Control Plane Options for VXLAN-based Metro Networks
So far we have focused mostly on benefits of VXLAN over MPLS in terms of network architecture and capital cost, i.e. mostly data plane benefits. But VXLAN does not have a specified control plane, so if we want to complete the comparison of VXLAN-based metro networks with MPLS-based networks, we need to look at overlay control plane options.
Probably the best-known control plane option for creating VXLAN overlays and provisioning overlay services is BGP EVPN, a protocol-based approach in which services must be configured in every edge node. The biggest drawback of BGP EVPN is operational complexity, which we have discussed in other blogs. If we compare that complexity to MPLS provisioning complexity, it’s not clear we have gained any operational agility benefits by moving to a VXLAN-based architecture.
An alternative, protocol-free approach is to use software-defined networking (SDN) with services defined in the SDN controller, which in turn programs the data plane of each edge node. This eliminates most of the operational complexity of protocol-based BGP EVPN. However, as some observers have noted, a centralized SDN controller architecture (which can be acceptable for a data center fabric within a single site) creates severe scalability and resilience challenges when applied to metro and wide area networks. So once again, it’s not clear whether this is a better choice than MPLS for the metro.
Fortunately, a third alternative is available: decentralized or distributed SDN, in which the SDN controller function is fully replicated and distributed throughout the network. (This can also be referred to as “controllerless” SDN because it eliminates the need for separate controller servers/devices.) This is the approach Pluribus takes in our Unified Cloud Fabric and it completely eliminates the scalability and resilience problems of centralized SDN control while preserving the benefits of simplified and accelerated service provisioning.
Table 1 below summarizes these options:
| Architecture | Network Scale | Operational Agility |
|---|---|---|
| VXLAN + BGP EVPN | High | Low |
| VXLAN + Centralized SDN | Low | High |
| VXLAN + Distributed SDN (potentially with BGP EVPN*) | High | High |
* At Pluribus we combine the distributed SDN approach with the use of standards-based BGP EVPN in order to achieve the “best of both worlds,” i.e. SDN automation and operational simplicity along with interoperability and service extension to other fabrics.
Table 1: Comparing MPLS and VXLAN Options for Metro Networking
Because VXLAN allows decoupling of overlay service delivery from the underlay network, it creates deployment options that MPLS cannot match, such as a virtual services overlay on an existing IP underlay, as shown in Figure 4. VXLAN-based switches are deployed at the edge of the existing network where needed, and expanded as driven by service demand, so new Ethernet and VPN services and new revenues can be added without any change in the existing network.
Figure 4: VXLAN Overlay Deployment on Existing Metro Network
For greenfield or overbuild situations, the entire metro network infrastructure can be built on open, disaggregated switching as shown in Figure 5. Such a metro network infrastructure can support all the services that an MPLS-based network could provide, including business Internet, Ethernet and VPN services as well as consumer triple play services, while completely eliminating the cost and complexity of MPLS.
Figure 5: Converged Metro Core with VXLAN Services Overlay
Additional Considerations for VXLAN in Metro Networks
While there are compelling benefits to be gained from using VXLAN instead of MPLS in metro networks, some SPs may be reluctant to make the move for a variety of reasons.
Of course, the biggest obstacle is transition costs. Building new networks, integrating them into existing systems and creating new operating processes are all challenging. The overlay deployment model described above can make it easier to continue running the existing network while incrementally growing services on the new overlay.
Another objection that SPs may raise is the lack of equivalent tools to those available in MPLS networks for traffic engineering and service assurance. Pluribus customers who have already made the transition have addressed this problem through conservative engineering of capacity in the underlay network, such that capacity-related performance issues are essentially non-existent. Over time, we expect that new tools such as segment routing (e.g. SRv6) will satisfy those SPs who feel they need more deterministic traffic engineering capabilities. As one recent blog put it, “a network can eliminate MPLS (and MPLS-TE) and still have a fancy traffic engineering option available to them by using SRv6.”
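As a rough illustration of what conservative capacity engineering can mean in practice, the sketch below checks whether a set of parallel underlay links stays under a target utilization after a link failure. It is a simplified assumption-laden model, not a description of how any particular operator sizes its network: it assumes equal-capacity links, traffic spread evenly across the survivors (as with ideal ECMP), and a hypothetical 50% utilization target.

```python
def worst_case_utilization(offered_gbps: float, link_capacity_gbps: float,
                           num_links: int, failed_links: int = 1) -> float:
    """Utilization of the surviving links, assuming an even traffic split."""
    surviving = num_links - failed_links
    if surviving <= 0:
        raise ValueError("no links survive the assumed failure")
    return offered_gbps / (surviving * link_capacity_gbps)


def has_headroom(offered_gbps: float, link_capacity_gbps: float,
                 num_links: int, target: float = 0.5) -> bool:
    """True if utilization stays at or below the target after a single link failure."""
    return worst_case_utilization(offered_gbps, link_capacity_gbps, num_links) <= target


# Hypothetical example: 150 Gbps of peak traffic over four 100G uplinks.
utilization = worst_case_utilization(150, 100, 4)
ok = has_headroom(150, 100, 4)
```

With these numbers, one failed uplink leaves 300 Gbps of capacity for 150 Gbps of traffic, so the surviving links run at 50% and the check passes. Real designs would also account for uneven hashing, traffic growth and multi-failure scenarios.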
VXLAN is building on its popularity for data center overlay network virtualization and displacing MPLS in some service provider metro networks. When implemented in hardware-based VTEPs in switches — and even in server-based SmartNICs or data processing units (DPUs) — and combined with a BGP EVPN or SDN control plane and network automation, VXLAN-based overlay networks can provide the scalability, agility, high performance and resilience needed for metro networking into the foreseeable future.