SDN and OpenFlow - Enabling Network Virtualization in the Cloud: Part II

Using OpenFlow – State of the Art

In an earlier article, we discussed the components of OpenFlow and the building blocks of a Software Defined Network. In this part, we’ll discuss some interesting things being done today to make it all work. Before we do that, however, it is worthwhile to discuss what a flow is and why it is important.

 

What is a Flow and the split between Hardware and Software

A flow is a simple mechanism to identify a group of packets on the wire. For example, a packet coming from a particular machine can be identified by the machine’s MAC or IP address, which appears as the source MAC in the L2 header or the source IP in the L3 header. By putting a flow rule around either of those fields and counting the packets going through the switch that satisfy the rule, we can determine the number of packets being sent by the machine. This is useful information. To make it even more useful, one could add another flow to measure the packets going to the same machine: a rule on destination MAC or destination IP, based on the machine’s address, accomplishes that. Using the two flows, we can now find out how many packets are coming into and going out of the machine. The next question is who implements the rules we just discussed. There are several approaches, each with advantages and disadvantages, which I discuss below:

  • Server Based Approach - Have S/W in the server itself implement the rules. It’s the easiest way, since the server already has to process the packets and can keep track of them. The issue with this approach arises when it’s not a real server but a virtual machine on the server that we want to track. We can still let the hypervisor track the packets, or ask the Virtual Machine to do so. The big disadvantage is that asking the server to do work on your behalf requires a certain level of trust (security holes and digital certificates come to mind), depends on the server’s capability, and lowers performance. Since the server hypervisor has to measure these things, it needs to see the packets, which makes hardware based virtualization (SR-IOV) hard to adopt. Most of the data center bridging and IO virtualization standards are moving towards a hardware based switch in the server, at which point doing things in the S/W layers of the server is not going to be possible.
  • Server based with H/W offload - There is more talk around this than any real implementation, but it’s worth mentioning that people have discussed putting special capabilities in the server’s NICs to offload some flow processing. The advantage is performance and security (since the Hypervisor controls the NIC, the Virtual Machine can’t circumvent it). The disadvantages are cost and scaling. The chips capable of doing this (TCAMs etc) are expensive, and trying to orchestrate across large numbers of servers severely limits the scale. We are already seeing the Intel Sandy Bridge architecture come to life, which integrates 10GigE NICs. Adding TCAMs would only drive the base cost up, by $800-900, and add significant complexity.
  • Probe Based Approach - Have probes in the network do it. There are companies that specialize in inserting probes in the network and collecting data, and this works quite well as long as you only want to observe things. If a redirection or traffic shaping action is needed, these passive probes will not work, and inserting them requires intrusive work on the cabling etc. Not my favorite approach either.
  • Switch based approach - Since all the traffic passes through the switches anyway, having them do it makes a lot more sense. Modern switch chips have H/W based CAMs and TCAMs which can take a rule and apply it without adding latency to, or reducing the throughput of, the packet stream. In my past life, as Architect of Solaris Networking and Network Virtualization, I did the software based approach, but given the growing Virtual Machine density, SR-IOV type features, and the growing need for analytics and traffic shaping with performance, I think the switch based approach is far superior. Here, the CAMs and TCAMs that measure flows are the Hardware piece. The Software piece adds and deletes rules on the fly. And OpenFlow provides a pseudo standard that allows a programmer to program any switch. But the biggest advantages of this approach are scale, ease of use, and administrative separation. The scale comes from orchestrating your flows and policies across fewer devices (one switch for approx 50 servers). Also, the people in charge of networks and storage networks are at times different, and keeping the administrative separation is useful although not required.
So needless to say, we have taken the approach of solving this problem on the switch, in coordination with the host, using standards like EVB/DCB which we will discuss at a later time. Given that new generation switch chips are very similar to server CPUs and have the same complexity, the problem calls for a real Operating System on top, which can then help us write OpenFlow based applications. This is where we step in. Part of Pluribus Networks’ effort is around implementing a distributed network hypervisor (called Netvisor™, a key component of the nvOS™ operating system) to give OpenFlow programmers real teeth. We treat any switch chip the same as a server chip, and most of the code is platform independent, with very little written to the chip’s instruction set. It is the same way that Linux code (with little platform specific stuff) runs on x86 or Power, and OpenSolaris code runs on x86 and SPARC.
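To make the flow idea concrete, here is a minimal Python sketch of the two-rule example from earlier: one rule matching packets sent by a machine and one matching packets sent to it, each with its own counter. Everything in it is an assumption for illustration only: the `FlowRule` and `FlowTable` names, the dictionary packet format, and the 10.0.0.x addresses are hypothetical, and this is not the OpenFlow API or anything from Netvisor. A real switch does the match in CAM/TCAM hardware and would normally resolve a packet to a single highest-priority entry; counting every matching rule simply keeps the sketch short.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class FlowRule:
    """A match on header fields plus a packet counter (None means wildcard)."""
    name: str
    src_mac: Optional[str] = None
    dst_mac: Optional[str] = None
    src_ip: Optional[str] = None
    dst_ip: Optional[str] = None
    packets: int = 0

    def matches(self, pkt: Dict[str, str]) -> bool:
        # Every field the rule specifies must equal the packet's field.
        wanted = {"src_mac": self.src_mac, "dst_mac": self.dst_mac,
                  "src_ip": self.src_ip, "dst_ip": self.dst_ip}
        return all(pkt.get(k) == v for k, v in wanted.items() if v is not None)


class FlowTable:
    """Software stand-in for a switch rule table (hypothetical, not a real API)."""

    def __init__(self) -> None:
        self.rules: Dict[str, FlowRule] = {}

    def add(self, rule: FlowRule) -> None:
        # Rules can be added on the fly.
        self.rules[rule.name] = rule

    def delete(self, name: str) -> None:
        # Rules can be removed on the fly; unknown names are ignored.
        self.rules.pop(name, None)

    def process(self, pkt: Dict[str, str]) -> None:
        # Count the packet against every rule it satisfies.
        for rule in self.rules.values():
            if rule.matches(pkt):
                rule.packets += 1


def demo() -> Dict[str, int]:
    table = FlowTable()
    table.add(FlowRule("from_host", src_ip="10.0.0.5"))  # packets sent by the machine
    table.add(FlowRule("to_host", dst_ip="10.0.0.5"))    # packets sent to the machine
    packets = [
        {"src_ip": "10.0.0.5", "dst_ip": "10.0.0.9"},
        {"src_ip": "10.0.0.5", "dst_ip": "10.0.0.7"},
        {"src_ip": "10.0.0.9", "dst_ip": "10.0.0.5"},
        {"src_ip": "10.0.0.7", "dst_ip": "10.0.0.8"},
    ]
    for pkt in packets:
        table.process(pkt)
    return {name: rule.packets for name, rule in table.rules.items()}
```

Calling `demo()` returns the two counters keyed by rule name. The split between the matching step (`matches`, the part hardware would do) and the rule management step (`add` and `delete`, the part software would do) mirrors the Hardware/Software split described above.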

 

Current implementations

Now a little overview of the projects and people who are leading the charge in the brave world of flows and Software Defined Networking. Before raking me over the coals for the missing things, let me clarify that the stuff below is what I consider mainstream implementations that apply in the world of data centers today (Disclaimer: I have purposely left out most of the research efforts that didn’t reach a mainstream product, since there are too many):

  • The discussion has to start with project Crossbow, which I believe is the first flow implementation with dedicated H/W resources. It was available in OpenSolaris in 2007 and finally shipped in Solaris 11 (delayed courtesy of the Oracle/Sun merger). The patents on virtual switching in the host and on the H/W based approach (7613132, 7643482, 7613198, 7499463, etc) were filed by me and fellow conspirators from 2004 onwards and awarded from 2009 onwards. Keep in mind that when Crossbow had virtual switching with a H/W classifier running in OpenSolaris, Xen etc were just coming out with S/W based bridging. The two commands, flowadm and dladm, allow users to create flows and S/W or H/W based virtual NICs that can be assigned to virtual machines. This is the Server Based Approach; it ships in a mainstream OS and is pretty widely deployed.
  • A similar approach has been adopted by our fellow company Nicira in the form of their NVP architecture. They enhanced the offering by allowing an OpenFlow based orchestrator to control the virtual switching in the host, although their focus has primarily been on the virtualization side and not so much on the application flows side.
  • Another of our sister and partner companies, Big Switch Networks, has taken a hybrid approach of orchestrating any OpenFlow capable device, which can be a switch or a virtual switch inside a hypervisor. Since they are still in partial stealth, it would not be my place to talk about more details.
  • Obviously, every existing network vendor claims to be working on SDN and OpenFlow. But by definition, SDN requires programmability and an Operating System to run your programs on. Most of the existing network vendors lack the know-how or the ability to do this. They have rich bank balances, and if they can acquire the right companies and leave them alone, they can potentially bridge the chasm (although it is going to be painful).
And then there is the effort of yours truly at Pluribus Networks. It’s a well kept secret that we are building Server-Switches that run our distributed Netvisor™, which has massive flow capabilities and would be ideal for all the people developing stuff in the SDN space. But then we are in stealth mode, and there is much more to us which we will get around to discussing in the coming days :-) .

Network 2.0: Network Virtualization without Limits

So the theme of the day is Network Virtualization, software defined networks (SDN), and taking virtualization to its logical conclusion, i.e. server, storage and network in a giant resource pool that can be allocated/assigned any which way. But that is easier said than done. Server and Storage virtualization were a bit simpler, since we were dealing with only a single OS that needed to provide the right abstraction layer. The H/W resource pool (disk, cpu, network, memory, etc) was managed by that single OS, so provisioning it between various virtual machines or storage pools was simpler. The network, by definition, is useful only when multiple devices are connected, and treating them as a single resource pool is much harder. A virtual network has to deal with not just links, bandwidth, latency and queues but also higher level functionality like routing, load balancing, firewalling, DNS, DHCP, VPN, and numerous other services. And we haven’t yet talked about how all this will have to hook up with virtual machines and virtual storage pools in a manner that is easy to implement. Now, you might argue that every component is already virtualized, so why do we really need network virtualization? That is true, but the pieces still don’t add up to a complete virtual network. Think of someone waiting for dinner who is instead served raw potatoes, onions, tomatoes, eggs, frozen meat (and the condiments) and shown the stove to make his own main course!

The real reason why network virtualization is a tough problem to solve is two essential requirements of switching: very high performance and ultra-low latency. These force all the switching functionality into a highly complicated ASIC, which does all the hard work of shuffling 1.2 terabits per second of data at sub-microsecond latencies and hence doesn’t need much software on top. The embedded OS controlling the switch is mostly used just to configure the switch chip, through a CLI (command line interface) that allows the administrator to control and configure each component on the switch but almost nothing else. So when we started playing with some of the prototype next generation boxes that our friends at Fulcrum and Broadcom gave us, we kept asking ourselves whether a real OS running the chip could be empowered to do something more useful, and so achieve the elusive goal of complete network virtualization. We even asked our friends whether there was some way for them to put a full fledged OS on top of these chips (being the OS person I have been for most of my life :-) ).

And that was when I realized that to solve the network virtualization problem, we really need an OS on the chip that understands resource pools and virtualization. But a single switch by itself is not very interesting, so we need an OS that controls all the switches. Hmmmm – one OS that controls them all (borrowing from LOTR, which reminds me to ask Peter Jackson whatever happened to the prequel!!). So before we can even start building anything more complicated, we need to build a network hypervisor that has semantics similar to a tightly coupled cluster but controls a collection of switches and scales from one instance to a hundred plus instances.

Having pioneered virtual switching and resource control in the server OS (Solaris, to be specific: project Crossbow, which I started in 2003, got integrated in OpenSolaris in 2008), I eventually set out to do the same for larger networks in the form of Pluribus Networks Inc, applying the hard lessons learned from enterprise customers. This is what we at Pluribus call “Network 2.0, or Network Virtualization without Limits”.

The Network OS is finally coming to life and is able to treat the network as one giant resource pool. A note of caution though: please don’t confuse the Network OS with a typical management layer that manages a collection of devices. We still need a management layer to configure and manage the OS, but the policy enforcement, congestion control and resource management across all devices is done by the OS. It is the same as a server cluster, which doesn’t get rid of the management layer but gives the management layer something that is more manageable.
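As a rough illustration of what "one resource pool" means to a programmer, the toy Python sketch below installs a flow once on a fabric object and reads back one total, instead of visiting each switch in turn. The `Switch` and `Fabric` classes, the switch names, and the "web" flow are all hypothetical and chosen only for this example. It says nothing about how Netvisor is actually built, and it ignores the hard parts a real network hypervisor has to solve, such as cluster membership, consistency, and failure handling.

```python
from typing import Dict, List


class Switch:
    """One switch with per-flow packet counters (a toy stand-in)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.counters: Dict[str, int] = {}

    def install(self, flow: str) -> None:
        # Start counting a flow; keep the existing count if already installed.
        self.counters.setdefault(flow, 0)

    def count(self, flow: str, packets: int) -> None:
        # Only flows that have been installed are counted.
        if flow in self.counters:
            self.counters[flow] += packets


class Fabric:
    """Presents many switches as one pool: install once, read one total."""

    def __init__(self, switches: List[Switch]) -> None:
        self.switches = switches

    def install(self, flow: str) -> None:
        # One call programs every switch in the fabric.
        for sw in self.switches:
            sw.install(flow)

    def total(self, flow: str) -> int:
        # One call aggregates the counters from every switch.
        return sum(sw.counters.get(flow, 0) for sw in self.switches)


def demo_fabric() -> int:
    leaf1, leaf2 = Switch("leaf1"), Switch("leaf2")
    fabric = Fabric([leaf1, leaf2])
    fabric.install("web")
    leaf1.count("web", 30)   # traffic seen on the first switch
    leaf2.count("web", 12)   # traffic seen on the second switch
    return fabric.total("web")
```

The point of the sketch is the shape of the interface: the caller talks to `Fabric`, not to individual devices, which is the difference between an OS that owns the resources and a management layer that loops over boxes.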

This post was originally published by Sunay on his personal blog.