Saturday, June 20, 2009

Nexus 2148 Blocks BPDUs

We ran into a small issue this week with the cloud. We can easily overcome it, but it seems Cisco has a few design flaws to work out with the Nexus 2148, or "fabric extender" as they call it.

As you know from previous posts, we've launched our cloud offering and we've had a very good public response to it. We've got about a dozen customers provisioned on it now, and the requirement has emerged (we knew it would) to connect a customer's cloud environment to their dedicated environment. In this particular case, Oracle RAC isn't supported on VMware (go figure). We have a customer that wants to provision all of their front-end web servers in the cloud but use an Oracle RAC implementation on the back end. So basically what we need to do is build a RAC environment in a dedicated space for the customer and have it connect to the customer's cloud infrastructure.

Initially, we were just going to have the customer's RAC environment sit in a dedicated rack somewhere in the data center. We were going to run a couple of cross connects to the cage where our cloud infrastructure sits and then connect them into the Cisco Nexus 2148 fabric extenders, since those provide 48 gigabit copper ports. In the dedicated environment, the customer was going to have a couple of switches for their RAC servers and we'd just connect those switches to the 2148s...all would be good.

Then we found the problem. The 2148 blocks BPDU (bridge protocol data unit) packets, so we can't connect another switch to it. A port that receives a BPDU gets sent into err-disable. The problem is that BPDUs are blocked by default and you can't change this setting. Rats!!! I'm sure Cisco has some great rational explanation for why they did it (like they're trying to get rid of spanning tree or something), but I think it's so you have to take the connection up to the Nexus 5020 or 7000 series and burn your 10G SFP slots. More 10G ports get burned...Cisco makes more money. I know, I know...I may be being harsh on Cisco, and I somewhat understand their rationale, but it makes life a little more difficult.
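To make the behavior concrete, here is a toy Python model of what we saw. It is only a sketch of the observed behavior, not Cisco's implementation: the `HostPort` class, the port name, and the return strings are all made up for illustration. The one real detail is the destination MAC address that standard IEEE spanning-tree BPDUs are sent to.

```python
from dataclasses import dataclass

# Destination MAC used by standard IEEE 802.1D spanning-tree BPDUs.
STP_BPDU_MAC = "01:80:c2:00:00:00"


@dataclass
class HostPort:
    """Toy model of a fabric extender host port (hypothetical, not Cisco code)."""
    name: str
    state: str = "up"

    def receive(self, dst_mac: str) -> str:
        # An err-disabled port stays down until an operator re-enables it.
        if self.state == "err-disabled":
            return "dropped"
        # A BPDU means a switch is attached, so the port shuts itself down.
        # Note there is no flag to turn this check off, matching what we found.
        if dst_mac.lower() == STP_BPDU_MAC:
            self.state = "err-disabled"
            return "dropped"
        return "forwarded"


port = HostPort("Ethernet100/1/1")  # hypothetical port name
print(port.receive("00:11:22:33:44:55"), port.state)  # ordinary server traffic
print(port.receive(STP_BPDU_MAC), port.state)         # a switch sends a BPDU
print(port.receive("00:11:22:33:44:55"), port.state)  # port is now down
```

A server plugged into the port never sends BPDUs, so it works fine. A downstream switch sends one almost immediately, and from then on nothing gets through.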

We ultimately decided to just provision a couple of ToR switches (Cisco 3750 or 4948) to service this need. No big deal, but it would have been nice to be able to use the ports on those 2148s, as we have a lot that aren't going to be used for other purposes.

Note that we did verify you can't disable this feature. Documentation to support this is hard to find, but we fully tested it in the cloud lab, so if you're out there racking your brain trying to figure this out, give it up for now.