Of Controllers and Why Nicira had to do a deal (Part III: SDN And Openflow – Enabling Network Virtualization in the Cloud)

The challenges faced by Openflow and SDN

This is the 3rd and final article in this series (Click for Part I and Part II). As promised, let’s look at some of the challenges facing this space and how we are addressing those challenges. In the process, we also look at the role of controllers in a network fabric in more detail in this post.

Challenge 1 – Which is why Nicira had to get a big partner

I see plenty of articles talking about Nicira being acquired. But there’s one question that no one seems to be asking: if the space is so hot, why did Nicira do the deal so early? The deal size ($1.26B) is not chump change, but if I were them and my stock was rising, I would have held off and kept pushing to change the world. So what was the rush? I believe the answer lies in some of the issues I discussed in Part II of this series, in the difference between a server-based and a switch-based approach! The Nicira solution was very dependent on the server and the server hypervisor. The world of server operating systems and hypervisors is so fragmented today that staying independent would have been a very uphill battle for them. So tying up with one of the biggest hypervisor vendors made sense for them. I believe they did the smart thing to ensure that the technology can keep moving forward. Undoubtedly, VMware is a good, technology-driven company. The open question now is how long before the VMware/EMC and Cisco relationship comes unhinged?

Challenge 2 – Why Controllers? Or the divide between Control Plane and Data Plane

The current promise of having a standard way of controlling networking and a controller that is platform independent is huge. It provides simplified network management and rapid scale for virtual networks. Yet, the current implementations have become problematic.

Since the switches are dumb and do not have a global view, the current controllers have turned into policy enforcement engines as well. Setting up a new flow requires the controller to agree, which means that every flow now has to go through the controller, which in turn instantiates it on the switch. This, however, raises several issues:

  • A controller, which is essentially an application running on a server OS over a 10 Gbps link (with a latency of tens of milliseconds), is in charge of controlling a switch which, in turn, could be switching 1.2 Tbps of traffic at an average latency of under a μs and dealing with 100k flows, with ~30% of them being set up or torn down every second. To put things in perspective, a controller takes tens of milliseconds to set up a flow, while the life of a flow transferring 10Mb of data (a typical web page) is 10 msec!!
  • To deal with 100k flows, the switch ASICs need to have that kind of flow capacity. The current (and coming) generation of ASICs has nowhere near such capacity, so one can only use the flow table as a cache. This brings us to the third issue.
  • Flow setup rate is anemic at best on the existing hardware. You are lucky if you can get 1000 flows per second.
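
The numbers in the list above can be put side by side with a little arithmetic. The sketch below uses only the figures quoted in this post (100k flows, roughly 30% of them changing every second, and about 1000 flow setups per second on existing hardware). The constant and function names are mine, and the result is a back-of-the-envelope estimate rather than a measurement.

```python
# Back-of-the-envelope check using only the figures quoted above.
ACTIVE_FLOWS = 100_000         # flows a busy switch may be dealing with
CHURN_PER_SECOND = 0.30        # ~30% set up or torn down every second
HW_SETUPS_PER_SECOND = 1_000   # optimistic setup rate on existing hardware


def required_setup_rate(active_flows: int, churn: float) -> float:
    """Flow changes per second that the control path would have to absorb."""
    return active_flows * churn


def shortfall_factor(required: float, available: float) -> float:
    """How many times the demand exceeds what the hardware can install."""
    return required / available


required = required_setup_rate(ACTIVE_FLOWS, CHURN_PER_SECOND)
factor = shortfall_factor(required, HW_SETUPS_PER_SECOND)
print(f"required: {required:.0f} flow changes/s, shortfall: {factor:.0f}x")
```

Under these assumptions the demand is about 30,000 flow changes per second, roughly thirty times what the hardware can install, which is why the flow table ends up being used as a cache.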

So what is lacking is a Network Operating system on the switch to support the Controller App. If you look at the server world, the administrator specifies the policies and it’s the job of the OS (that works very closely with the Hardware) to enforce these policies. In the current scenario, it seems like an application running on bare metal with no Operating system support. Since this is a highly specialized application, it needs a specialized Operating system – or, a Network Operating system which can also be virtualized.

Challenge 3 – The Controller based Network

For a while, people were just tied to their inflexible networks, which didn’t see any innovation in the last two decades while servers and storage went through a major metamorphosis. That frustration gave birth to Openflow/SDN, which has currently morphed into a controller mania. Separating the brain from the body creates something of a split-brain problem, since the body (or the switch in this case) still needs something of a brain. What we need is a solution that encompasses the entire L2 fabric, where the controller and the fabric work as one while providing easy abstractions for users to achieve their virtualization, SLA and monitoring needs.

A Distributed Network Hypervisor or Netvisor to the rescue

So what we (at Pluribus) saw early on is that the world of servers is a very good example. The commoditization of ASICs and value moving to software is pretty much what’s happening in the world of storage and is bound to happen in the world of networking. So we decided to do things in the right order i.e. get the bleeding edge commodity ASICs and create a Network Operating System with the following properties:

  • Network OS – Designed to manage these ASICs, which are very specialized and powerful.
  • Distributed – Networks have more than one switch, working in tandem to support end-to-end flow.
  • Virtualized – Ability to run both Physical and Virtual Networking applications. As I mentioned before, the switch is not the network. We need to deal with all network services in physical and virtual form and the network OS needs to support that.

Hence we created a Distributed Network Hypervisor called Netvisor™ (a key component of the nvOS™ operating system). It is designed to run on the network switches and supports both physical and virtual network services. It also runs a controller, where the controller is a policy distribution engine and no longer a policy enforcement engine.

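[Figure: left half, the current dividing line between control plane and data plane; right half, the dividing line originally drawn by Pluribus Networks]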

As shown in the left half of the above figure, the current dividing line between the control plane and data plane is not going to scale and perform. The line we originally drew (and the founding principle of Pluribus Networks), as shown in the right half, needs to be delivered for SDN and Openflow to deliver on their true promise.
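
To make the difference between a policy enforcement engine and a policy distribution engine concrete, here is a toy comparison. It is only a sketch under my own assumptions: the class names are hypothetical, the "policy" is reduced to a set of allowed TCP ports, and nothing here reflects how Netvisor or any real controller is implemented. It simply counts how often the controller has to be involved in each model.

```python
from typing import Dict, FrozenSet, Iterable, List


class EnforcingController:
    """Enforcement model: the controller approves every new flow."""

    def __init__(self, allowed_ports: Iterable[int]) -> None:
        self.allowed_ports = frozenset(allowed_ports)
        self.round_trips = 0

    def approve(self, tcp_port: int) -> bool:
        self.round_trips += 1  # one controller round trip per new flow
        return tcp_port in self.allowed_ports


class CachingSwitch:
    """Switch with no policy of its own; it caches controller decisions."""

    def __init__(self, controller: EnforcingController) -> None:
        self.controller = controller
        self.flow_cache: Dict[int, bool] = {}

    def new_flow(self, flow_id: int, tcp_port: int) -> bool:
        if flow_id not in self.flow_cache:
            self.flow_cache[flow_id] = self.controller.approve(tcp_port)
        return self.flow_cache[flow_id]


class PolicyAwareSwitch:
    """Switch whose (hypothetical) network OS enforces policy locally."""

    def __init__(self) -> None:
        self.policy: FrozenSet[int] = frozenset()

    def new_flow(self, flow_id: int, tcp_port: int) -> bool:
        return tcp_port in self.policy  # no controller involvement


class DistributingController:
    """Distribution model: push the policy once, let switches enforce it."""

    def __init__(self, allowed_ports: Iterable[int]) -> None:
        self.allowed_ports = frozenset(allowed_ports)
        self.pushes = 0

    def distribute(self, switches: List[PolicyAwareSwitch]) -> None:
        for switch in switches:
            switch.policy = self.allowed_ports
            self.pushes += 1


enforcer = EnforcingController([80, 443])
dumb = CachingSwitch(enforcer)
distributor = DistributingController([80, 443])
smart = PolicyAwareSwitch()
distributor.distribute([smart])

for flow_id in range(1000):  # 1000 brand-new flows hit each switch
    dumb.new_flow(flow_id, 443)
    smart.new_flow(flow_id, 443)

print(enforcer.round_trips, distributor.pushes)
```

In this toy setup the enforcing controller is consulted once per new flow, while the distributing controller is involved once per switch regardless of how many flows arrive.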

SDN and Openflow- Enabling Network Virtualization in the Cloud: Part II

Using Openflow – state of the ART

In an earlier article, we discussed the components of Openflow and the building blocks of a Software Defined Network. In this part, we’ll discuss some interesting things being done today to make it all work. However, before we do that, it’d be worthwhile to first discuss the definition of a flow and why it is important.

 

What is a Flow and the split between Hardware and Software

A flow is a simple mechanism to identify a group of packets on the wire. For example, packets coming from a particular machine can be identified by the machine’s MAC or IP address, which appears as the source MAC in the L2 header or the source IP in the L3 header. By putting a flow rule around either of those fields and counting the packets going through the switch that satisfy that rule, we can determine the number of packets being sent by the machine. This is useful information. To make it even more useful, one could add another flow to measure the packets going to our target machine. Adding a destination MAC or destination IP rule based on the machine’s MAC or IP address will accomplish that, and using our two flows, we can now find out how many packets are coming into and going out of the machine. The next question is who implements the rules we just discussed. There are several approaches, each with its advantages and disadvantages, which I discuss below:

  • Server Based Approach - Have S/W in the server itself implement the rules. It’s the easiest way, since the server already has to process the packets and can keep track of them. The issue with this approach arises when it’s not a real server but a virtual machine on the server that we want to track. We can still let the hypervisor track the packets or ask the Virtual Machine to track them. The big disadvantage of this approach is that asking the server to do stuff on your behalf requires a certain trust (security holes and digital certificates come to mind), depends on the server’s capability, and lowers performance. Since the server hypervisor has to measure these things, it needs to see the packets, making hardware based virtualization (SR-IOV) hard to adopt. Most of the data center bridging and IO virtualization standards are today moving towards a hardware based switch in the server, so doing things in the S/W layers of the server is not going to be possible.
  • Server based with H/W offload - There is more talk around this than any real implementation, but it’s worth mentioning that people have discussed putting special capabilities in the NICs on the server to offload some flow processing. The advantage is performance and security (since the Hypervisor controls the NIC, the Virtual Machine can’t circumvent it). The disadvantages are cost and scaling. The chips capable of doing this (TCAMs etc.) are expensive, and trying to orchestrate across large numbers of servers severely limits the scale. We are already seeing the Intel Sandy Bridge architecture come to life, which integrates 10GigE NICs. Adding TCAMs would increase the base cost by $800-900 and also add significant complexity.
  • Probe Based Approach - Have probes in the network to do it. There are companies which specialize in inserting probes in the network and collecting data, and they can do this quite well as long as you only want to observe things. If redirection or traffic shaping is needed, these passive probes will not work, and inserting them requires intrusive work in cabling etc. Not my favorite approach either.
  • Switch based approach - Since all the traffic passes through the switches anyway, having them do it makes a lot more sense. The modern switch chips have H/W based CAMs and TCAMs which can take a rule and apply it without adding latency to, or reducing the throughput of, the packet stream. In my past life, as Architect of Solaris Networking and Network Virtualization, I have done the software based approach, but given the growing Virtual Machine density, SR-IOV type features, and the growing need for analytics and traffic shaping with performance, I think that the switch based approach is far superior. Here, the CAMs and TCAMs that measure flows are the Hardware pieces. The software piece is able to add and delete rules on the fly. And Openflow provides a pseudo standard that allows a programmer to program any switch. But the biggest advantages of this approach are scale, ease of use, and administrative separation. The scale comes from orchestrating your flows and policies across fewer devices (one switch for approx 50 servers). Also, the people in charge of networks and storage networks are at times different, and keeping the administrative separation is useful although not required.
So needless to say, we have currently taken the approach of solving this problem on the switch while coordinating with the host, using standards like EVB/DCB etc., which we will discuss at a later time. Given that new generation switch chips are very similar to server CPUs and have the same complexity, the problem calls for a real Operating System on top, which can then help us write Openflow based applications. This is where we step in. Part of Pluribus Networks’ effort is around implementing a distributed network hypervisor (called Netvisor™, a key component of the nvOS™ operating system) to give Openflow programmers real teeth. We treat any switch chip the same as a server chip, and most of the code is platform independent, with very little that is written to the chip’s instruction set. It is just the same way Linux code (with little platform specific stuff) runs on x86 or Power and OpenSolaris code runs on x86 and Sparc.
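
To make the earlier definition of a flow concrete, here is a minimal sketch of the two counting flows described at the start of this section: one rule on the source MAC and one on the destination MAC of the machine we care about. The class and function names are my own and purely illustrative. A real switch would do the matching in CAM/TCAM hardware rather than in a Python loop.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class Packet:
    src_mac: str
    dst_mac: str


@dataclass
class CountingFlow:
    """A rule on one or both MAC fields, plus a packet counter."""
    name: str
    src_mac: Optional[str] = None  # None means "match any"
    dst_mac: Optional[str] = None
    packets: int = 0

    def matches(self, pkt: Packet) -> bool:
        return (self.src_mac in (None, pkt.src_mac)
                and self.dst_mac in (None, pkt.dst_mac))


def count(flows: List[CountingFlow], packets: Iterable[Packet]) -> Dict[str, int]:
    """Increment every flow whose rule matches each packet."""
    for pkt in packets:
        for flow in flows:
            if flow.matches(pkt):
                flow.packets += 1
    return {flow.name: flow.packets for flow in flows}


MACHINE = "00:11:22:33:44:55"  # made-up address of the machine being tracked
flows = [
    CountingFlow("from_machine", src_mac=MACHINE),
    CountingFlow("to_machine", dst_mac=MACHINE),
]
traffic = [
    Packet(src_mac=MACHINE, dst_mac="66:77:88:99:aa:bb"),
    Packet(src_mac="66:77:88:99:aa:bb", dst_mac=MACHINE),
    Packet(src_mac=MACHINE, dst_mac="ff:ff:ff:ff:ff:ff"),
]
print(count(flows, traffic))  # {'from_machine': 2, 'to_machine': 1}
```

The same structure works with IP addresses in place of MAC addresses. The only thing that changes is which header field the rule looks at.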

 

Current implementations

Now a little overview of the projects and people who are leading the charge in the brave world of flows and Software Defined Networking. Before raking me over the coals about the missing things, let me clarify that the stuff below is what I consider mainstream implementations that apply in the world of data centers today (Disclaimer: I have purposely left out most of the research efforts that didn’t reach a mainstream product since there are too many):

  • The discussion has to start with project Crossbow, which I believe is the first flow implementation with a dedicated H/W resources approach. It was available in OpenSolaris in 2007 and finally shipped in Solaris 11 (delayed courtesy of the Oracle/Sun merger). The virtual switching in Host and H/W based patents (7613132, 7643482, 7613198, 7499463, etc.) were filed by me and fellow conspirators from 2004 onwards and awarded from 2009 onwards. Keep in mind that when Crossbow had virtual switching with a H/W classifier running in OpenSolaris, Xen etc. were just coming out with S/W based bridging. The two commands, flowadm and dladm, allow users to create flows and S/W or H/W based virtual NICs that can be assigned to virtual machines. This is the Server Based Approach that ships in a mainstream OS and is pretty widely deployed.
  • A similar approach has been adopted by our fellow company Nicira in the form of their NVP Architecture. They enhanced the offering by allowing an Openflow based Orchestrator to control the virtual switching in the host although their focus has primarily been on the virtualization side and not so much on application flows side.
  • Another of our sister and partner companies, Big Switch Networks has taken a hybrid approach of orchestrating any Openflow capable device which can be a switch or a virtual switch inside a hypervisor. Since they are still in partial stealth, it would not be my place to talk about more details.
  • Obviously, every existing network vendor claims that they are working on SDN and Openflow. But by definition, SDN requires programmability and Operating Systems to run your programs on. Most of the existing network vendors lack the know-how or the ability to do this. They have rich bank balances, and if they can acquire the right companies and leave them alone, then they can potentially bridge the chasm (although it is going to be painful).
And then there is the effort of yours truly at Pluribus Networks. It’s a well-kept secret that we are building Server-Switches that run our distributed Netvisor™, which has massive flow capabilities and would be ideal for all the people developing stuff in the SDN space. But then we are in stealth mode and there is much more to us, which we will get around to discussing in the coming days :-) .

SDN and Openflow- Enabling Network Virtualization in the Cloud: Part I

[Admin note: Sunay will be posting a three-part series on SDN and Openflow in which he explains why the networking community has been gripped with excitement about these developments and what implications they will have on network virtualization for enterprises, cloud providers and, in general, anybody who needs policy or rule based networks or needs to upgrade their networking equipment every couple of years or so. Before we dig into the first part of Sunay's series, it might be useful to share links to two useful documents about Openflow from www.openflow.org. The first is a whitepaper providing an introduction to the protocol, system architecture, and use cases, and the second is the Openflow Specification v1.1.0 implementation document.]

The first article of the series focuses on the protocol itself. The 2nd article will focus on how people are trying to develop it and some end user perspective that I have accumulated in the last year or so. The last article in the series will discuss the challenges and what we are doing to help.

Value Proposition

The basic piece of Openflow is nothing more than a wire protocol that allows a piece of code to talk to another piece of code. The idea is that for a typical piece of network equipment, instead of logging in and configuring it via its embedded web or command line interface (the way you configure your home wifi router), you can get the Controller from someone other than the equipment vendor. Now technically, and in the short term, you are probably worse off because you are getting the equipment from one guy and the management interface from another guy, and there are bound to be rough edges. [Note: We assume that our goal is getting a better mid-term and long-term ROI and manifold ease of management]

In other words, Openflow creates a standard around how the management interface or Controller talks to the equipment, so the equipment vendors can design their equipment without worrying about the management piece, and someone else can create a management piece knowing well that it will manage any equipment that supports Openflow. So people who understand standards ask, what’s the big deal? One still can’t do more than what the equipment is designed to do!! And bingo! That is the holy grail of any standard. By creating the standard, you free the guys who make equipment to focus on their expertise and the guys doing management to make the controllers better. This is in no way different from how computers work today. Intel/AMD create the key chips, vendors like Dell, HP etc. create the servers, and the Linux community (or BSD, OpenSolaris, etc.) creates the OS, and it all works together, offering a better solution. However, and more importantly for any business, it achieves one more thing: it drives hardware costs lower and creates more competition while allowing the end user to pick the best hardware (from their point of view) and the best controller based on features, reliability, etc. There is no monopoly, there are plenty of choices, and it’s all great for the end user. It especially makes sense in the networking space, where innovation has been lacking for a while and a few companies have been used to huge margins because users had no other choice. [For a more detailed explanation of how economic theory supports this claim of commoditizing hardware and reducing network switching costs, read Rolf's post on the Economics of Open Networking on the Pluribus Networks blog]

One trend that is fueling the fire behind SDN is network virtualization. Both the server and storage sides (H/W and OS) have made good progress on this front, but the network is still far behind. By opening up this space, SDN has allowed people like me (the OS and Distributed Systems guys) to step into this world and drive the same innovation on the network side. Thus, it’s not an overstatement to say that Openflow/SDN are great standards for the end user, and people who understand them see the power behind them.

Key Features

Openflow Spec 1.1.2 is just out with minor improvements, while 1.1.1 has been out for a few months. Most of the vendors only have 1.0.0 implemented. If you look at the specs [Ref: the links above], you will see the data structures and message syntax needed for a controller to talk to a device it wants to control. Functionality-wise, it can be grouped under the following categories (understand that I am trying to help people who don’t want to read hundreds of pages of specs):

  • Device discovery and connection establishment: where you tie in a controller to a device that it wants to control.
  • (Creating the) Flows: In a typical network, different types of traffic are mixed together, and the packets can be grouped in the form of flows. If you look at the layer 2 header, packets for the same VLAN can be a flow, packets belonging to a pair of MAC addresses can be a flow, and so on. Similarly, packets belonging to an IP subnet or an IP address plus TCP/UDP port (service) can be termed a flow. Any combination of Layer 2, 3 and 4 headers that allows us to uniquely identify a packet stream on the wire is termed a flow, and the Openflow protocol makes special efforts to specify these flows. An Openflow controller can specify a flow to a switch, which can apply it to specific ports or to all ports, and ask the switch to take special actions when it matches a packet to a flow.
  • Action on matching a flow: As part of specifying the flow, the protocol allows the controller to specify what action to take when a packet matches the flow. The action can range from copying the packet to decrementing the Time to Live, changing/adding a QoS label, etc. But the most important action (in my view) is the ability to direct the original packet (or a copy) to a specific port or to the controller itself.
  • Flow Table: Where the flows are created. For an actual device, this is typically the TCAM where the flow is instantiated and applied to incoming packets. Most devices are pretty limited in this and can typically support a very small set of flows today. The protocol allows for specifying multiple tables and the ability to pipeline across those tables but given the state of today’s and mid-term hardware, single table is all we can work with.
  • The last piece is the Counters. Most of the devices support port level counters which the openflow controllers can read. In addition, the protocol supports flow level counters but the current set of devices are very limited in that as well.
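
The pieces in the list above (flows, actions, a single flow table of limited size, and counters) fit together roughly as in the sketch below. This is a toy model in Python and not the Openflow wire protocol. The field names, the action strings, the first-match lookup, and the choice to send a table miss to the controller are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List

Packet = Dict[str, object]  # e.g. {"vlan": 10, "tcp_dst": 80, "ttl": 64}


@dataclass
class FlowEntry:
    match: Dict[str, object]  # header field -> required value (L2/L3/L4)
    actions: List[str]        # e.g. ["dec_ttl", "output:3"]
    packets: int = 0          # per-flow counter

    def matches(self, pkt: Packet) -> bool:
        return all(pkt.get(key) == value for key, value in self.match.items())


def apply_actions(pkt: Packet, actions: List[str]) -> List[str]:
    """Apply actions in order and return where the packet is sent."""
    destinations = []
    for action in actions:
        if action == "dec_ttl":
            pkt["ttl"] = int(pkt["ttl"]) - 1
        elif action.startswith("output:"):
            # "output:3" sends to port 3, "output:controller" to the controller
            destinations.append(action.split(":", 1)[1])
    return destinations


class FlowTable:
    """A single table with a fixed capacity, like a small TCAM."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.entries: List[FlowEntry] = []

    def add(self, entry: FlowEntry) -> bool:
        if len(self.entries) >= self.capacity:
            return False  # table full, the flow cannot be installed
        self.entries.append(entry)
        return True

    def process(self, pkt: Packet) -> List[str]:
        for entry in self.entries:  # first matching entry wins
            if entry.matches(pkt):
                entry.packets += 1
                return apply_actions(pkt, entry.actions)
        return ["controller"]  # assumption: a table miss goes to the controller


table = FlowTable(capacity=2)
table.add(FlowEntry({"vlan": 10, "tcp_dst": 80}, ["dec_ttl", "output:3"]))
web = {"vlan": 10, "ip_dst": "10.0.0.5", "tcp_dst": 80, "ttl": 64}
print(table.process(web), web["ttl"], table.entries[0].packets)
```

The fixed capacity is the important design point here: once the table is full, additional flows simply cannot be installed, which mirrors the hardware limits mentioned above.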

Putting it all together

Hopefully, now that we understand the components, we can see how it all works together. A controller (which is a piece of code) running on a standard server box starts and discovers a device that it wants to manage. In today’s world, that device typically is an ethernet switch. Once connected, it puts the device under its control, sets flows with actions, and reads status from the device.

As an example, assume that a user is experimenting with a new Layer 3 protocol. S/he can add a flow that makes the switch redirect all matching packets to the controller, where the packets get modified appropriately and redirected through a specific egress port on the device. This is much easier to implement, since the controller itself is a piece of code running on a standard OS, so adding code to it to do something experimental is pretty straightforward. The most powerful thing here is that the user is not impacting the rest of the network and doesn’t need his/her own dedicated network.

My own favorite (that we have experimented with) is a debugging application for a data center or enterprise where the user needs to debug his own client/server application. The user can try and capture the packets on multiple machines running his clients and server(s) but the easier thing would be to set a flow on the switch based on server IP address and TCP port (for the service) and an action that allows a copy of all matching packets to be sent to the controller with a timestamp. This allows the user to debug his application much more easily.
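
A rough sketch of that debugging application, seen from the controller side, might look like the following. It is illustrative only: the class name and packet fields are hypothetical, the clock is injectable so the behavior is easy to check, and I am assuming we want traffic in both directions (to and from the service) copied.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Packet:
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    payload: bytes


class DebugTap:
    """Copies packets for one service to the controller, with a timestamp."""

    def __init__(self, server_ip: str, service_port: int,
                 clock: Callable[[], float] = time.time) -> None:
        self.server_ip = server_ip
        self.service_port = service_port
        self.clock = clock
        self.captured: List[Tuple[float, Packet]] = []

    def matches(self, pkt: Packet) -> bool:
        to_server = (pkt.dst_ip == self.server_ip
                     and pkt.dst_port == self.service_port)
        from_server = (pkt.src_ip == self.server_ip
                       and pkt.src_port == self.service_port)
        return to_server or from_server

    def on_packet(self, pkt: Packet) -> Packet:
        """Record a timestamped copy if the flow matches, then forward as usual."""
        if self.matches(pkt):
            self.captured.append((self.clock(), pkt))
        return pkt  # the original packet continues unchanged


tap = DebugTap(server_ip="10.0.0.5", service_port=8080)
tap.on_packet(Packet("10.0.0.9", "10.0.0.5", 40000, 8080, b"GET /"))
tap.on_packet(Packet("10.0.0.9", "10.0.0.7", 40001, 22, b"unrelated"))
print(len(tap.captured))  # 1
```

Because the copy is taken at the switch, the user gets a single, consistently timestamped view of the conversation instead of stitching together captures from several machines.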

Again, the important thing to remember is that the power of Openflow and Software Defined Networking is in allowing people to innovate and enabling someone to solve their problem by writing simple code (or using code provided by others). It’s important to keep in mind that the switch is a powerful device, since everything goes through it, and allowing it to be controlled by C, Java, or Perl code empowers it even more. Eventually, the control moves from the switch designer to application developers (to the discomfort of the switch vendors :)

So finally, how does it help Network Virtualization and Cloud?

This is the reason why I am so excited and ended up spending time writing the blog. The key premise in the world of virtualization is dynamic control for resource utilization. Again, network utilization and SLAs are important, but the key part we need to solve is the utilization of servers. The holy grail is a large pool of servers, each running 20-50 virtual machines, that are controlled by software which optimizes CPU/memory utilization. The key issue here is that the Virtual Machines are grouped together in terms of the applications they run or the application developer who controls them. To prevent a free-for-all, they are typically tied together with some VLAN and ACLs, and have a network identity in terms of IP/MAC addresses, SLA/QoS, etc. For the controlling Software to migrate the VM freely, it needs to manage the VM network parameters on the target switch port as well. And this is where the current generation of switches fails.

At present, network switches require human intervention to configure the various network parameters on the switch that match the VM. So in order for a VM to migrate freely under software control, it still requires human intervention on the network side. With Openflow, the Software orchestrating the server utilization by scheduling the VMs based on policies/SLAs, can set the matching network policies without human intervention.
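
As a sketch of what "setting the matching network policies without human intervention" could look like, the toy orchestrator below moves a VM's network profile to the target switch port as part of the migration. Everything here is hypothetical: the class names, the profile fields, and the (switch, port) naming are mine, and a real implementation would program the switch through Openflow messages rather than update a dictionary.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

PortId = Tuple[str, int]  # (switch name, port number); illustrative naming


@dataclass(frozen=True)
class NetworkProfile:
    """Network identity and policy that must follow a VM."""
    vlan: int
    mac: str
    ip: str
    qos_class: str


class Orchestrator:
    """Hypothetical scheduler that moves network settings along with the VM."""

    def __init__(self) -> None:
        self.location: Dict[str, PortId] = {}
        # What each switch port is currently configured to carry, per VM.
        self.port_config: Dict[PortId, Dict[str, NetworkProfile]] = {}

    def place(self, vm: str, profile: NetworkProfile, port: PortId) -> None:
        self.port_config.setdefault(port, {})[vm] = profile
        self.location[vm] = port

    def migrate(self, vm: str, target: PortId) -> None:
        source = self.location[vm]
        profile = self.port_config[source].pop(vm)  # clean up the old port
        self.place(vm, profile, target)             # program the new port


orch = Orchestrator()
web = NetworkProfile(vlan=10, mac="02:00:00:00:00:01", ip="10.0.10.5", qos_class="gold")
orch.place("web-1", web, ("switch-a", 7))
orch.migrate("web-1", ("switch-b", 3))
print(orch.location["web-1"], orch.port_config[("switch-b", 3)]["web-1"].vlan)
```

The point of the design is that the migration and the network change are one operation driven by the same software, so nobody has to log in to the target switch.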

Just the way a typical server OS has a policy driven scheduler which controls the various application threads on dozens of CPUs (even a low end dual socket server has 6 cores per socket, each with multiple hardware threads), Openflow allows us to build a combined server/storage/network scheduler that can optimize VM placement based on configured policies.

Again, Openflow is just a wire protocol and a pseudo standard but it allows people like me to add huge value which wasn’t possible before. In the next article, we will go deeper into what people are trying to build and look at some more specific use cases. Stay Tuned and Happy Holidays!!

Originally posted at Sunay’s personal blog.