Monday, January 12, 2009

Data Center Ethernet: A Solution For Unified Fabric

There's a lot of buzz these days in the network world around the "unified fabric". To put it simply, unified fabric refers to collapsing all the various connectivity mechanisms and technologies onto a single transport medium. Traditionally, a large data center environment may have several distinct segments, each connecting different technologies or serving a different functional purpose:

  • Front End Network. This is basically the primary public-facing network, the one connecting to the Internet or a large corporate network.
  • Storage Network. This is the network connecting all the servers to the various storage arrays. You'll primarily see Fibre Channel technology here, with specialized SAN fabric switches used for connectivity.
  • Back End Network. This is generally the network connecting all the servers to the various backup solutions providing backup/restore services to the environment.
  • Management Network. This is a back end network of sorts but has a primary focus of providing management services to the environment. A management network typically provides the transport used by the management entity or NOC to gain remote access to the environment.
  • InfiniBand Network. This network uses InfiniBand technology for high performance clustered computing.
Unified fabric refers to the possibility of collapsing all of these various networks onto a single network using more advanced technologies, fewer physical connections and higher speeds. The following diagram shows a simple server connected to all the various networks and then what the view would look like in the unified fabric model:
It would surely make life much simpler from an administrative standpoint, and there are several obvious cost advantages to making this a reality (fewer NICs, HBAs, cables, fiber runs, switch types, etc.). Cisco's Data Center Ethernet promises to help do just that.

Ethernet is already the predominant choice for connecting network resources in the data center. It's ubiquitous, understood by engineers and developers around the globe, and has stood the test of time against several challenger technologies. Cisco has basically built upon what we already know as Ethernet: Data Center Ethernet is a collection of technologies designed to improve and expand the role of Ethernet networking in the data center. These technologies include:

  • Priority-based Flow Control (PFC). PFC is an enhancement to the pause mechanism within Ethernet. The standard Ethernet pause option stops all traffic on the entire link to control traffic flow. PFC creates eight virtual links on each physical link and allows any one of them to be paused independently of the others. When I was at Cisco a few weeks ago we covered this in detail. As a joke I asked them if they went with 8 channels because Ethernet cabling has 8 separate wires (I know, networking jokes are quite complicated). They laughed and said that was in fact where the idea came from: they originally intended to send each virtual link down a different physical wire. It was funny for all of us anyway.
  • Enhanced Transmission Selection (ETS). PFC creates eight distinct virtual links on a physical link, and it can be advantageous to define different traffic classes within those virtual links. Traffic in the same PFC class can be grouped together yet still treated differently within the group: ETS provides prioritized processing based on bandwidth allocation, low latency, or best effort, resulting in per-group traffic class allocation. A toy sketch just after this list illustrates the PFC and ETS ideas together.
  • Data Center Bridging Exchange Protocol (DCBX). DCBX is a discovery and capability exchange protocol used by Cisco DCE switches to discover peers in a Data Center Ethernet network and exchange configuration information with them. A simplified sketch of that kind of exchange appears at the end of this post.
  • Congestion Notification. Traffic management that pushes congestion to the edge of the network by instructing rate limiters to shape the traffic causing the congestion.
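To make the PFC and ETS ideas a bit more concrete, here's a toy simulation in Python. This is my own rough sketch, not Cisco's implementation, and the priority groupings and bandwidth percentages are invented for illustration. It models eight priorities on one link, lets a priority be paused independently of the rest, and divides transmit opportunities among priority groups by configured bandwidth share:

from collections import deque

NUM_PRIORITIES = 8
FRAME_BYTES = 1500

# Hypothetical ETS groups: {name: (priorities in the group, bandwidth share)}.
ETS_GROUPS = {
    "storage": ((3,), 0.50),        # e.g. storage traffic carried on priority 3
    "lan":     ((0, 1, 2), 0.30),
    "bulk":    ((4, 5, 6, 7), 0.20),
}

def simulate(frames_per_priority, paused, rounds=1000):
    """Run `rounds` transmit opportunities on one link.

    PFC: priorities in `paused` never transmit, while the others keep flowing.
    ETS: each group accrues credit in proportion to its bandwidth share and the
    eligible group with the most credit sends next (a crude deficit scheduler).
    """
    queues = {p: deque(["frame"] * frames_per_priority) for p in range(NUM_PRIORITIES)}
    credit = {name: 0.0 for name in ETS_GROUPS}
    sent = {name: 0 for name in ETS_GROUPS}

    for _ in range(rounds):
        for name, (_, share) in ETS_GROUPS.items():
            credit[name] += share * FRAME_BYTES

        best = None
        for name, (priorities, _) in ETS_GROUPS.items():
            ready = [p for p in priorities if p not in paused and queues[p]]
            if ready and (best is None or credit[name] > credit[best[0]]):
                best = (name, ready[0])
        if best is None:
            break                      # everything is empty or paused
        name, prio = best
        queues[prio].popleft()
        credit[name] -= FRAME_BYTES
        sent[name] += 1

    return sent                        # frames transmitted per group

print("no pauses:        ", simulate(500, paused=set()))
print("priority 3 paused:", simulate(500, paused={3}))

With nothing paused, the groups roughly split the link 50/30/20; pause priority 3 and the storage group stops sending while LAN and bulk traffic keep moving, which is exactly the behavior the per-priority pause is meant to give you.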
There's a lot more to it, but those are the foundational differences that make this a very interesting technology for the future data center. Ethernet is the obvious choice for a single, converged fabric that can support multiple traffic types, and Cisco Data Center Ethernet offers the flexibility to choose what to run over a consolidated interface, link, and switch fabric, and when to make that move. We should be seeing a lot more of this as the push towards the unified fabric continues.
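As a postscript, here is an equally rough sketch of the kind of exchange DCBX performs. Real DCBX rides on LLDP TLVs and has more moving parts; the field names and the simple "a willing port adopts the other side's settings" rule below are a deliberate simplification for illustration, not the actual protocol state machine:

from dataclasses import dataclass

@dataclass
class DcbConfig:
    pfc_priorities: frozenset   # priorities with PFC enabled
    ets_shares: dict            # e.g. {"storage": 50, "lan": 30, "bulk": 20}
    willing: bool               # willing to accept the peer's configuration?

def resolve(local, remote):
    """Pick the operational config for the local port after an exchange.

    If the local side is willing and the peer is not, adopt the peer's
    settings; otherwise keep the local configuration.
    """
    if local.willing and not remote.willing:
        return DcbConfig(remote.pfc_priorities, dict(remote.ets_shares), local.willing)
    return local

access = DcbConfig(frozenset({3}), {"storage": 40, "lan": 60}, willing=True)
core = DcbConfig(frozenset({3, 4}), {"storage": 50, "lan": 30, "bulk": 20}, willing=False)
print("access port will run:", resolve(access, core))

The practical upshot is that an access switch can simply defer to the core's DCE settings and end up with a consistent PFC/ETS configuration across the fabric without anyone touching it by hand.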

Monday, January 5, 2009

Cisco's Virtual Switching System (VSS)

I first learned about Cisco's Virtual Switching System (VSS) back at the 2008 Cisco Networkers conference in Orlando. Now that it's officially in production and there are several successful case studies, I figured it was the appropriate time to write about it.

VSS brings a much greater depth of virtualization to the switching layer and actually helps solve many of the challenges network engineers face when building out large switching environments. In a nutshell, VSS allows an engineer to take two Cisco Catalyst 6500 series switching platforms and "virtualize" or collapse them so they appear to be one switch. Here is a basic diagram of how your Layer 2 environment would normally appear in a highly redundant data center switching architecture:



VSS basically allows you to collapse each of these switch pairs into a virtual switch, so your new high-level architecture would look like this:


From an engineering standpoint, this has several advantages. VSS, combined with a few related technologies, allows us to:

1) Cut down on the number of ports and links serving in a passive-only capacity. In the past, one set of the redundant links was automatically placed in a passive (blocking) state by Spanning Tree Protocol (STP). STP was originally designed to prevent loops in the network but is now commonly leaned on as a redundancy mechanism. By using VSS together with another technology called Multichassis EtherChannel, you can get to a state where all links are active and passing traffic. This increases capacity, lets you use ports and links that previously sat idle unless a failover occurred and, for a given amount of usable bandwidth, lets you cut down on the number of ports you need. Active/passive architectures only use 50% of available capacity, adding considerable expense to the project.

2) Manage fewer network elements. With VSS we get one control plane for each VSS cluster, so the pair appears as one switch. One switch to manage instead of two has its obvious benefits.

3) Use all NICs on the servers in the infrastructure. Just as with the combined links between switches, you can use the same technology to combine links on the servers. Multichassis EtherChannel (MEC) allows you to connect a server to two physically separate switches and use both connections in an active/active implementation. This is a pretty big step in that we immediately double the capacity of our servers with the same number of NICs while still providing the highest level of redundancy. The toy sketch after this list shows the basic idea.

4) Build bigger backbones. The performance of a VSS cluster is almost exactly twice that of one standalone 6500 platform. You might think this would be obvious, given that you're using two switches, so of course performance should double. But the fact that Cisco has managed to pull it off across two physically separate chassis that appear as one logical switch is pretty amazing...at least from this engineer's perspective.
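To make the Multichassis EtherChannel point from items 1 and 3 concrete, here's a toy Python sketch of per-flow hashing across a two-link bundle. The hash function and flow tuples are made up for illustration and are not Cisco's load-balancing algorithm; the point is simply that with both links active every flow has a home, and a failed member just means the affected flows rehash onto the survivor:

def pick_link(flow, active_links):
    # Hash a (src_ip, dst_ip, src_port, dst_port) tuple onto one member link.
    return active_links[hash(flow) % len(active_links)]

def distribute(flows, active_links):
    load = {link: 0 for link in active_links}
    for flow in flows:
        load[pick_link(flow, active_links)] += 1
    return load

flows = [("10.1.1.%d" % (i % 250), "10.2.2.10", 30000 + i, 443) for i in range(1000)]

# Classic STP design: the second uplink is blocked, so every flow rides link 1.
print("STP, one link blocked:", distribute(flows, ["nic1"]))

# VSS + MEC: both NICs/uplinks are active members of one logical bundle.
print("VSS/MEC, both active: ", distribute(flows, ["nic1", "nic2"]))

# If a member fails, its flows simply rehash onto the remaining link.
print("after nic2 failure:   ", distribute(flows, ["nic1"]))

Run it and you'll see the thousand flows split roughly evenly across both members when the bundle is healthy, which is where the "all links forwarding" capacity gain comes from.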

I think that's a good summary, but this is some exciting stuff, and it gives all of us engineers who have had our share of STP issues over the years some hope for the future!