28
Aug/10
0

What a mouthful

TechOps Guy: Nate

I’ve thought about this off and on and I better write about it so I can forget about it.

I think Force10 is way too verbose in choosing the phrase to describe their company, it’s quite a mouthful -

Force10 Networks, Inc., a global technology leader that data center, service provider and enterprise customers rely on when the network is their business[..]

I like Force10, I have been watching them for five years now, I just think any phrase you choose to describe your company should be short enough to say it in one (casual) breath.

How about “Force10 Networks Inc., a global networking technology leader”.

Force10’s marketers are very nice folks. I’ve sent them two corrections to their web site over the years (one concerning the number of ports a competitor offers in their products, the other a math error in a graphic showing how much you can save on their products), and they were very kind and responsive (and fixed both problems pretty quickly too). This one I won’t send to them directly since it’s more than a cosmetic change :)

25
Aug/10
1

Moving on up to Number two

TechOps Guy: Nate

Brings a tear to me eye, my favorite switching vendor had a pretty impressive announcement today:

Extreme Networks commanded the #2 revenue position for data center Top-of-Rack switches according to the quarterly Ethernet market share report, behind only Cisco, driven by its industry leading Summit(R) X650, Summit X450 and Summit X480 switches. In the “Top of Rack” switch port shipment category, Extreme Networks increased its port shipments by 194% compared to the same quarter one year ago. This demonstrates continued momentum for the Company in the dynamic and demanding data center Ethernet market.

If you haven’t already seen the X650, X480 and even X450 Series of switches check them out. They do offer several capabilities that no other vendor on the market provides. And they are very affordable.

I have blogged on some of my more favorite topics in the past, with regards to their technology. I’ve been using Extreme stuff for just about 10 years now I think.

[tangent -- begin]

I remember the 2nd switch I bought(this one for my employer), a Summit 48 with an external power supply I think it was in 2001. Bought it off Ebay from what I assume was a crashed dot com or something. Anyways they didn’t include the cable(sold “as is”) to connect the switch to the redundant power supply. So I hunted around trying to find what part to order, couldn’t find anything. So I called support.

The support tech had me recite the serial# of the unit to him, and he said they don’t have a part# for that cable, so they couldn’t sell me one. But he happened to have a few cables laying around so he put one in a fedex pouch and shipped it to me, free. I didn’t have a support contract(and didn’t get a support contract until I made a much larger purchase several years later). But I guess you could say that friendly support engagement certainly played a factor in me keeping tabs on the company and the products going forward, leading up to a million dollar purchase several years later(different company) of more than 3,000 ports.

I used my first switch, also a Summit 48, as my home network switch for a good 5 years before I decided it drew too much power for what I needed (48 port switch running on maybe 5-6 ports total) and was pretty noisy (as are pretty much all switches from that era, I think it was manufactured in ‘98). Got a good deal on a Summit 48si and upgraded to that, used it for another year, and then retired it to a shelf. It drew half the power, and after replacing all of the fans in the unit (original fans too loud) it was quieter, but my network needs shrank even more from ~5-6 systems to ~2-3 (yay VMware), and I wanted to upgrade to gigabit.

From the Summit 48 article above, I thought this was a good indication of how easy their stuff is to use, even more than 10 years ago:

[..]We tested it with and without the QoS enabled. Without the QoS enabled, I began to see glitches in the video. The video halted abruptly at rates over 98 percent. With two commands, I enabled QoS on the Summit switches. Summit48 intelligently discarded the packets with lower priority, preserving the video stream’s quality even at 100 percent utilization.

Eventually recycled my Summit 48, along with an old Cisco switch(which I never used), couple really old Foundry load balancers(never used them either) a couple of years ago. Was too lazy to try to ebay them or put them on craigslist. Still have my 48si, it’s a really nice switch I like it a lot, they still sell it in fact even today. And still release updates(ExtremeWare 7.x) for it. The Summit 48 code base(ExtremeWare 1.x-4.x) was retired probably in 2002, so nothing new released for it for a long time.

[tangent -- end]

So, congratulations Extreme for doing such a great job.

Filed under: Networking
23
Aug/10
0

HP FlexFabric module launched

TechOps Guy: Nate

While they announced it a while back, it seems the HP Virtual Connect FlexFabric Module is available for purchase for $18,500 (web price). Pretty impressive technology, sort of a mix between FCoE and combining a Fibre Channel switch and a 10Gbps Flex10 switch into one. The switch has two ports on it that can (apparently) uplink directly to Fibre Channel at 2/4/8Gbps. I haven’t read too much into it yet but I assume it can uplink directly to a storage array, unlike the previous Fibre Channel Virtual Connect module which had to be connected to a switch first (due to NPIV).

HP Virtual Connect FlexFabric 10Gb/24-port Modules are the simplest, most flexible way to connect virtualized server blades to data or storage networks. VC FlexFabric modules eliminate up to 95% of network sprawl at the server edge with one device that converges traffic inside enclosures and directly connects to external LANs and SANs. Using Flex-10 technology with Fibre Channel over Ethernet and accelerated iSCSI, these modules converge traffic over high speed 10Gb connections to servers with HP FlexFabric Adapters (HP NC551i or HP NC551m Dual Port FlexFabric 10Gb Converged Network Adapters or HP NC553i 10Gb 2-port FlexFabric Converged Network Adapter). Each redundant pair of Virtual Connect FlexFabric modules provide 8 adjustable connections ( six Ethernet and two Fibre Channel, or six Ethernet and 2 iSCSI or eight Ethernet) to dual port10Gb FlexFabric Adapters. VC FlexFabric modules avoid the confusion of traditional and other converged network solutions by eliminating the need for multiple Ethernet and Fibre Channel switches, extension modules, cables and software licenses. Also, Virtual Connect wire-once connection management is built-in enabling server adds, moves and replacement in minutes instead of days or weeks.

[..]

  • 16 x 10Gb Ethernet downlinks to server blade NICs and FlexFabric Adapters
  • Each 10Gb downlink supports up to 3 FlexNICs and 1 FlexHBA or 4 FlexNICs
  • Each FlexHBA can be configured to transport either Fiber Channel over Ethernet/CEE or Accelerated iSCSI protocol.
  • Each FlexNIC and FlexHBA is recognized by the server as a PCI-e physical function device with adjustable speeds from 100Mb to 10Gb in 100Mb increments when connected to a HP NC553i 10Gb 2-port FlexFabric Converged Network Adapter or any Flex-10 NIC and from 1Gb to 10Gb in 100Mb increments when connected to a NC551i Dual Port FlexFabric 10Gb Converged Network Adapter or NC551m Dual Port FlexFabric 10Gb Converged Network Adapter
  • 4 SFP+ external uplink ports configurable as either 10Gb Ethernet or 2/4/8Gb auto-negotiating Fibre Channel connections to external LAN or SAN switches
  • 4 SFP+ external uplink ports configurable as 1/10Gb auto-negotiating Ethernet connected to external LAN switches
  • 8 x 10Gb SR, LR fiber and copper SFP+ uplink ports (4 ports also support 10Gb LRM fiber SFP+)
  • Extended list of direct attach copper cable connections supported
  • 2 x 10Gb shared internal cross connects for redundancy and stacking
  • HBA aggregation on FC configured uplink ports using ANSI T11 standards-based N_Port ID Virtualization (NPIV) technology
  • Allows up to 255 virtual machines running on the same physical server to access separate storage resources
  • Up to 128 VLANs supported per Shared Uplink Set
  • Low latency (1.2 µs Ethernet ports and 1.7 µs Enet/Fibre Channel ports) throughput provides switch-like performance.
  • Line Rate, full-duplex 240Gbps bridging fabric
  • MTU up to 9216 Bytes – Jumbo Frames
  • Configurable up to 8192 MAC addresses and 1000 IGMP groups
  • VLAN Tagging, Pass-Thru and Link Aggregation supported on all uplinks
  • Stack multiple Virtual Connect FlexFabric modules with other VC FlexFabric, VC Flex-10 or VC Ethernet Modules across up to 4 BladeSystem enclosures allowing any server Ethernet port to connect to any Ethernet uplink

Management

  • Pre-configure server I/O configurations prior to server installation for easy deployment
  • Move, add, or change server network connections on the fly without LAN and SAN administrator involvement
  • Supported by Virtual Connect Enterprise Manager (VCEM) v6.2 and higher for centralized connection and workload management for hundreds of Virtual Connect domains. Learn more at: www.hp.com/go/vcem
  • Integrated Virtual Connect Manager included with every module, providing out-of-the-box, secure HTTP and scriptable CLI interfaces for individual Virtual Connect domain configuration and management.
  • Configuration and setup consistent with VC Flex-10 and VC Fibre Channel Modules
  • Monitoring and management via industry standard SNMP v.1 and v.2
  • Role-based security for network and server administration with LDAP compatibility
  • Port error and Rx/Tx data statistics displayed via CLI
  • Port Mirroring on any uplink provides network troubleshooting support with Network Analyzers
  • IGMP Snooping optimizes network traffic and reduces bandwidth for multicast applications such as streaming applications
  • Recognizes and directs Server-Side VLAN tags
  • Transparent device to the LAN Manager and SAN Manager
  • Provisioned storage resource is associated directly to a specific virtual machine – even if the virtual server is re-allocated within the BladeSystem
  • Server-side NPIV removes storage management constraint of a single physical HBA on a server blade
  • Does not add to SAN switch domains or require traditional SAN management
  • Centralized configuration of boot from iSCSI or Fibre Channel network storage via Virtual Connect Manager GUI and CLI
  • Remotely update Virtual Connect firmware on multiple modules using Virtual Connect Support Utility 1.5.0

Options

  • Virtual Connect Enterprise Manager (VCEM), provides a central console to manage network connections and workload mobility for thousands of servers across the datacenter
  • Optional HP 10Gb SFP+ SR, LR, and LRM modules and 10Gb SFP+ Copper cables in 0.5m, 1m, 3m, 5m, and 7m lengths
  • Optional HP 8 Gb SFP+ and 4 Gb SFP optical transceivers
  • Supports all Ethernet NICs and Converged Network adapters for BladeSystem c-Class server blades: HP NC551i 10Gb FlexFabric Converged Network Adapters, HP NC551m 10Gb FlexFabric Converged Network Adapters, 1/10Gb Server NICs including LOM and Mezzanine card options and the latest 10Gb KR NICs
  • Supports use with other VC modules within the same enclosure (VC Flex-10 Ethernet Module, VC 1/10Gb Ethernet Module, VC 4 and 8 Gb Fibre Channel Modules).

So in effect this allows you to cut down on the number of switches per chassis from four to two, which can save quite a bit. HP had a cool graphic showing the amount of cables that are saved even against Cisco UCS but I can’t seem to find it at the moment.

The most recently announced G7 blade servers have the new FlexFabric technology built in(which is also backwards compatible with Flex10).

VCEM seems pretty scalable

Built on the Virtual Connect architecture integrated into every BladeSystem c-Class enclosure, VCEM provides a central console to administer network address assignments, perform group-based configuration management and to rapidly deployment, movement and failover of server connections for 250 Virtual Connect domains (up to 1,000 BladeSystem enclosures and 16,000 blade servers).

With each enclosure consuming roughly 5kW with low voltage memory and power capping, 1,000 enclosures should consume roughly 5 Megawatts? From what I see “experts” say it costs roughly ~$18 million per megawatt for a data center, so one VCEM system can manage a $90 million data center, that’s pretty bad ass. I can’t think of who would need so many blades..
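For anyone who wants to redo that napkin math with their own numbers, here is a minimal sketch. The inputs are just the rough figures from this post (5kW per enclosure, ~$18 million per megawatt), not measured values, and the function name is made up for illustration.

```python
# Rough figures from the post; none of these are measured values.
ENCLOSURES = 1_000            # VCEM maximum per the HP quote above
KW_PER_ENCLOSURE = 5          # rough draw per enclosure with power capping
COST_PER_MW_USD = 18_000_000  # rough "expert" build cost per megawatt


def datacenter_estimate(enclosures=ENCLOSURES,
                        kw_each=KW_PER_ENCLOSURE,
                        cost_per_mw=COST_PER_MW_USD):
    """Return (total megawatts, estimated build cost in USD)."""
    total_mw = enclosures * kw_each / 1000
    return total_mw, total_mw * cost_per_mw


total_mw, build_cost = datacenter_estimate()
print(f"{total_mw:.1f} MW, ${build_cost / 1e6:.0f} million")
```

With the numbers above this comes out to 5 megawatts and $90 million, which is where the figure in the paragraph comes from.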

If I were building a new system today I would probably get this new module, but I’d have to think hard about sticking with the regular Fibre Channel module to allow the technology to bake a bit more for storage.

The module is built based on Qlogic technology.

26
Apr/10
0

40GbE for $1,000 per port

TechOps Guy: Nate

It seems it wasn’t too long ago that 10GbE broke the $1,000/port price barrier. Now it seems we have reached it with 40GbE as well: my own personal favorite networking company, Extreme Networks, today announced the availability of an expansion module for the X650 and X480 stackable switches that adds 40GbE support. Top of rack line rate 10GbE just got more feasible.

LAS VEGAS, NV, Apr 26, 2010 (MARKETWIRE via COMTEX News Network) — Extreme Networks, Inc. (NASDAQ: EXTR) today announced highly scalable 40 Gigabit Ethernet (GbE) network solutions at Interop Las Vegas. The VIM3-40G4X adds four 40 GbE connections to the award-winning Summit(R) X650 Top-of-Rack stackable switches for $3,995, or less than $1,000 per port. The new module is fully compatible with the existing Summit X650 and Summit X480 stackable switches, preserving customers’ investments while providing a smooth upgrade to greatly increased scalability of both virtualized and non-virtualized data centers.

[..]

Utilizing Ixia’s IxYukon and IxNetwork test solutions, Extreme Networks demonstrates wire-speed 40Gbps performance and can process 60 million packets per second (120Mpps full duplex) of data center traffic between ToR and EoR switches.
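The numbers in the press release are easy to sanity check. The sketch below assumes the 60 million packets per second figure refers to minimum-size 64 byte Ethernet frames on a single 40Gbps link (the release doesn't say so explicitly), and uses the standard 20 bytes of preamble and inter-frame gap per frame. The function name is made up for illustration.

```python
# Standard Ethernet per-frame overhead on the wire:
# 7 bytes preamble + 1 byte start-of-frame delimiter + 12 bytes inter-frame gap.
PREAMBLE_AND_GAP_BYTES = 20


def max_packets_per_second(link_bps, frame_bytes=64):
    """Theoretical wire-speed packet rate for a given frame size."""
    bits_on_wire = (frame_bytes + PREAMBLE_AND_GAP_BYTES) * 8
    return link_bps / bits_on_wire


pps_40g = max_packets_per_second(40e9)  # assumption: 64 byte frames
price_per_port = 3995 / 4               # list price of the VIM3-40G4X over 4 ports

print(f"{pps_40g / 1e6:.1f} Mpps per direction, ${price_per_port:.2f} per port")
```

That works out to roughly 59.5 million packets per second in each direction, which rounds to the 60 million quoted, and just under $1,000 per port.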

19
Apr/10
1

Arista ignites networks with groundbreaking 10GbE performance

TechOps Guy: Nate

In a word: Wow

Just read an article from our friends at The Register on a new 384-port chassis 10GbE switch that Arista is launching. From a hardware perspective the numbers are just jaw dropping.

A base Arista 7500 costs $140,000, and a fully configured machine with all 384 ports and other bells and whistles runs to $460,800, or $1,200 per port. This machine will draw 5,072 watts of juice and take up a little more than quarter of a rack.

Compare this to a Cisco Nexus 7010 setup to get 384 wirespeed ports and deliver the same 5.76 Bpps of L3 throughput, and you need to get 18 of the units at a cost of $13.7m. Such a configuration will draw 160 kilowatts and take up 378 rack units of space – nine full racks. Arista can do the 384 ports in 1/34th the space and 1/30th the price.
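To check the ratios quoted above, here is a quick sketch using only the figures from The Register's article. The dictionary layout and helper name are mine, and the rack space comparison is left out because the article doesn't give an exact rack unit count for the Arista.

```python
# Figures as quoted in the article above.
arista = {"price_usd": 460_800, "watts": 5_072, "ports": 384}
cisco = {"price_usd": 13_700_000, "watts": 160_000, "ports": 384}


def per_port(system, key):
    """Divide one of the system totals by its port count."""
    return system[key] / system["ports"]


price_ratio = cisco["price_usd"] / arista["price_usd"]
power_ratio = cisco["watts"] / arista["watts"]

print(f"Arista: ${per_port(arista, 'price_usd'):,.0f} and "
      f"{per_port(arista, 'watts'):.1f} W per port")
print(f"Cisco:  ${per_port(cisco, 'price_usd'):,.0f} and "
      f"{per_port(cisco, 'watts'):.1f} W per port")
print(f"Cisco costs {price_ratio:.1f}x as much and draws {power_ratio:.1f}x the power")
```

The price ratio lands just under 30x, matching the "1/30th the price" claim, and the power ratio comes out a bit over 31x.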

I love the innovation that comes from these smaller players, really inspiring.

Filed under: Networking, News
8
Apr/10
1

What can you accomplish in two microseconds?

TechOps Guy: Nate

An interesting post on the Datacenter Knowledge site covers the growth in low latency data centers. The two things that were pretty shocking to me at the end were:

“I still find it amazing,” said McPartland. “A blink of an eye is 300 milliseconds. That’s an eternity in this business.”

How much of an eternity: “You can do a heck of a lot in 2 microseconds,” said Kaplan.

Interesting the latency requirements these fast stock traders are looking for, reminded me of a network upgrade the NYSE did deploying some Juniper stuff a while back as reported by The Register:

With the NewYork Stock Exchange down on Wall Street being about ten miles away from the data center in New Jersey, the delay between Wall Street and the systems behind the NYSE is about 105 microseconds. This is not a big deal for some trading companies, but means millions of dollars for others.

[..]

NYSE Technologies, which is the part of the company that actually hooks people into the NYSE and Euronext exchanges, has rolled out a market data system based on the Vantage 8500 switches. The system offers latencies per core switch in the range of 25 microseconds for one million messages per second on messages that are 200 bytes in size.
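To get a feel for these time scales, here is a small sketch. It assumes a straight ten mile fiber run and a refractive index of about 1.47 for the fiber (both are my assumptions, not from the article), so it only gives a lower bound on the propagation delay between Wall Street and New Jersey.

```python
SPEED_OF_LIGHT_M_S = 299_792_458
FIBER_REFRACTIVE_INDEX = 1.47   # assumption: typical single-mode fiber
METERS_PER_MILE = 1609.344


def fiber_delay_us(miles, index=FIBER_REFRACTIVE_INDEX):
    """One-way propagation delay in microseconds over a straight fiber run."""
    speed_in_fiber = SPEED_OF_LIGHT_M_S / index
    return miles * METERS_PER_MILE / speed_in_fiber * 1e6


one_way_us = fiber_delay_us(10)

# How many 2 microsecond windows fit into a 300 millisecond blink of an eye.
blink_ratio = 300_000 // 2

print(f"10 miles of fiber: ~{one_way_us:.0f} microseconds one way")
print(f"A blink is {blink_ratio:,} times longer than 2 microseconds")
```

Under those assumptions the fiber alone accounts for roughly 79 microseconds, so presumably the rest of the quoted ~105 microseconds comes from a longer real-world path and the gear in between. And a blink of an eye is 150,000 of those 2 microsecond windows.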

The Vantage 8500 switch seems pretty scalable, claiming to have non blocking scalability of 10GbE for up to 3,400 servers, announced last year.

Arista Networks somewhat recently launched an initiative aimed at this market segment as well.

Since the Juniper announcement, Force10 announced that the NYSE has chosen their gear for the next generation data centers at the NYSE. The Juniper switching gear so far hasn’t looked all that great compared to the competition, so I’d be curious how the deployment of Force10 stuff relates to the earlier deployment of Juniper stuff:

SAN JOSE, Calif., November 9, 2009 – Force10 Networks, Inc., the global technology leader that data center, service provider and enterprise customers rely on when the network is their business, today announced that the NYSE Euronext has selected its high-performance 10 Gigabit Ethernet (10 GbE) core and access switches to power the management network in their next-generation data centers in the greater New Jersey and London metro areas.

Force10 of course has been one of the early innovators and leaders in 10GbE port density and raw throughput(at least on paper, I’ve never used their stuff personally though have heard good things). On a related note it wasn’t long ago that they filed for an IPO, I wish them the best, as Force10 really is an innovative company and I’ve admired their technology for several years now.

(how do I remember all of these news articles?)

Filed under: Networking
17
Mar/10
1

Frightened

TechOps Guy: Nate

Frightened. That was the word that first came to my mind when I read this article from our friends at The Register.

The report also says that 60 per cent of Google’s traffic is now delivered directly to consumer networks. In addition to building out a network of roughly 36 data centers and co-locating in more than 60 public exchanges, the company has spent the past year deploying its Google Global Cache (GGC) servers inside consumer networks across the globe. Labovitz says that according to Arbor’s anecdotal conversations, more than half of all consumer providers in North American and Europe now have at least one rack of Google’s cache servers.

Honestly, I am speechless beyond the word frightened, you may want to refer to an earlier blog post “Lesser of two Evils” for more details.

9
Mar/10
7

Yawn..

TechOps Guy: Nate

I was just watching some of my daily morning dose of CNBC and they had all these headlines about how Cisco was going to make some earth shattering announcement (“Change the internet forever”), and then the announcement hit: some new CRS-1 router that claimed 12x faster performance than the competition. So naturally I was curious. Bob Pisani on the floor of the NYSE was saying how amazing it was that the router could download the Library of Congress in 1 second (he probably didn’t understand the router would have no place to put it).

If I want a high end router that means I’m a service provider and in that case my personal preference would be for Foundry Networks (now Brocade). Juniper makes good stuff too of course though honestly I am not nearly as versed in their technology. Granted I’ll probably never work for such a company as those companies are really big and I prefer small companies.

But in any case I wanted to illustrate (another) point. According to Cisco’s own site, their fastest single chassis system has a mere 4.48 terabits of switching capacity. This is called the CRS-3, which I don’t even see listed as a product on their site, perhaps it’s yet to come. The biggest, baddest product they have on their site right now is a 16-slot CRS-1. This, according to their own site, has a total switching capacity of a paltry 1.2Tbps, and even worse a per-slot capacity of 40Gbps (hello 2003).

So take a look at Foundry Networks (the Brocade name makes me shudder, I have never liked them) and their NetIron XMR series. From their documentation the “total switching fabric” ranges from 960 gigabits on the low end to 7.68 terabits on the high end. Switch forwarding capacity ranges from 400 gigabits to 3.2 terabits. This comes out to 120 gigabits of full duplex switch fabric per slot (same across all models). While I haven’t been able to determine precisely how long XMR has been on the market I have found evidence that it is at least nearly 3 years old.

To put it in another perspective, in a 48U rack with the new CRS-3 you can get 4.48 terabits of switching fabric (1 chassis is 48U). With Foundry in the same rack you can get one XMR32k and one XMR16k (combined size 47U) for a total of 11.52 terabits of switching fabric. More than double the fabric in the same space, from a product that is 3 years old. And as you can imagine in the world of IT, 3 years is a fairly significant amount of time.
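Here is the arithmetic behind that rack comparison as a small sketch. It assumes the slot counts implied by the model names (32 slots for the XMR32k, 16 for the XMR16k, 4 for the smallest model), which I haven't confirmed against the datasheet, and counts the 120 gigabits of full duplex fabric per slot as 240 gigabits total.

```python
# 120 Gbps full duplex per slot, counted in both directions.
FABRIC_PER_SLOT_GBPS = 240
CRS3_FABRIC_GBPS = 4_480  # single-chassis figure from Cisco's site


def xmr_fabric_gbps(slots):
    """Total switching fabric for an XMR chassis with the given slot count."""
    return slots * FABRIC_PER_SLOT_GBPS


# Assumption: slot counts taken from the model names.
combined_gbps = xmr_fabric_gbps(32) + xmr_fabric_gbps(16)
ratio = combined_gbps / CRS3_FABRIC_GBPS

print(f"XMR32k + XMR16k: {combined_gbps / 1000:.2f} Tbps, "
      f"{ratio:.2f}x the CRS-3")
```

The same per-slot figure also reproduces the 960 gigabit low end (4 slots) and the 7.68 terabit high end (32 slots), and the two-chassis rack comes out at about 2.6 times the CRS-3.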

And while I’m here and talking about Foundry and Brocade take a look at this from Brocade, it’s funny it’s like something I would write. Compares the Brocade Director switches vs Cisco (“Numbers don’t lie”). One of my favorite quotes:

To ensure accuracy, Brocade hired an independent electrician to test both the Brocade 48000 and the Cisco MDS 9513 and found that the 120 port Cisco configuration actually draws 1347 watts, 45% higher than Cisco’s claim of 931 watts. In fact, an empty 9513 draws more electrical current (5.6 amps) than a fully-populated 384 port Brocade 48000 (5.2 amps). Below is Brocade’s test data. Where are Cisco’s verified results?

Another

With 33% more bandwidth per slot (64Gb vs 48Gb), three times as much overall bandwidth (1.5Tb vs 0.5 Tb) and a third the power draw, the Brocade 48000 is a more scalable building block, regardless of the scale, functionality or lifetime of the fabric. Holistically or not, Brocade can match the “advanced functionality” that Cisco claims, all while using far less power and for a much [?? I think whoever wrote it was in a hurry]

That’s just too funny.

Filed under: Networking
5
Mar/10
8

The Smooth F5 Big-IP LTM Upgrade That Wasn’t

TechOps Guy: Tycen

A few weeks ago I attended an F5/VMware/Dell luncheon (where Dell failed to show up, something about a prelim ship date of 3 weeks out). After the event I talked to a couple of F5 engineers and asked them about upgrading our Big-IP LTM 3600’s from 9.4.7 to their latest 10.1.0. We have a redundant pair of 3600’s in active/standby mode. According to them, it was as easy as upgrading the standby node, failing over, and then upgrading the other node. We have a pretty basic config, not a lot of nodes/pools/virtual servers and no add-on modules (at this time). We used the default partitioning. Easy as pie.

I followed this F5 guide which for me basically boiled down to these steps:

  1. # mkdir /shared/images
  2. copy ISO to /shared/images
  3. # cd /shared/images
  4. # im - This copies over the image2disk utility, and then presents a status message, which lets you know that the im command is no longer supported, and tells you how to proceed
  5. # image2disk --instslot=HD1.2 --format=volumes
  6. # switchboot -b HD1.2
  7. reboot

The trouble started with step 5 above. It gave me an error that I needed to re-activate my keys. Not a big problem, but still made me nervous since I had a narrow window to do this upgrade in. But, re-activation was easy through the web interface (System > License > Re-activate).

The next issue was more scary – after I re-activated I re-issued the command in step 5 and the Big-IP reboots automatically (no mention of this in the upgrade doc linked to above). And it takes FOREVER to reboot. I’m sure it’s doing a lot of really tricky stuff (reformatting and upgrading), but still it’s an anxious wait. For me it was about 12 minutes (the linked upgrade guide says between 3 and 7 minutes). I was just about to put my shoes on and head to the datacenter (30 miles away) when the pings started responding.

This is where things got ugly. When the newly upgraded node came back online, it took over and became the ACTIVE node! I was just barely getting logged into it when my internal monitoring reported that the load balancer had failed over. And that wouldn’t have been too bad because of course I had done a config sync before I started this whole process, except that the now active node couldn’t load its config (more on that below). It was sitting there with a blank config (it had the correct self IPs and HA config) and users were getting nothing, not even a maintenance page. So, I forced it to standby so the other node could at least serve a maintenance page while I figured out why it wasn’t loading the config.

The bigip.conf file was there and looked intact. I can’t remember now what pointed me in the right direction (maybe while doing a b load), but I finally figured out that it was missing some class files in /var/class. I had previously used Jason Rahm’s maintenance page generator script which creates some class files used for hosting a maintenance page. Apparently the upgrade wiped out those files and the config wouldn’t load without them. (sidenote: the iRule generated by that script isn’t compatible with 10.x – but there is a new version of the script – v2 – that detects what code you’re running and builds the iRule accordingly – I have yet to use it to generate a new maintenance page and iRule). I rsync’d the class files from the other Big-IP and that allowed the config to load. I was then able to fail back to the Big-IP with the 10.x code and it seems to be working fine. Now I just need to update the other node and pray it doesn’t try to take over after reboot.

The first node I updated was set as the preferred active node (System > High Availability > Redundancy), so maybe that’s why it took over after the upgrade/reboot. But, that would be a bug in my opinion since the other node was healthy and active. Setting this to “None” might have kept the unwanted failover from happening, but I’m not going to downgrade and find out.

Another (minor) annoying thing was that the SSH authorized_keys were wiped out, so some monitoring scripts I had set up didn’t work until I added the monitoring host’s key back in to the authorized_keys file.
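In hindsight, one way to catch wiped files like these would be to rsync a copy of the interesting directories to another host before the upgrade and again afterwards, then compare the two copies. Below is a minimal sketch of that comparison. The function names and directory layout are hypothetical, and it is meant to run on a management host with Python 3, not on the Big-IP itself.

```python
from pathlib import Path


def relative_files(root):
    """Return the set of file paths under root, relative to root."""
    root = Path(root)
    return {p.relative_to(root).as_posix()
            for p in root.rglob("*") if p.is_file()}


def missing_after_upgrade(before_dir, after_dir):
    """List files present in the pre-upgrade copy but absent afterwards."""
    return sorted(relative_files(before_dir) - relative_files(after_dir))
```

Calling missing_after_upgrade() on the two copies returns a sorted list of relative paths, so anything under var/class or an authorized_keys file that disappeared would show up right away instead of after a failed config load.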

One final thing, I did not need to do step 6. Running the switchboot command w/o any arguments shows that HD1.2 is the default and only boot image. And, as I detailed above, the reboot in step 7 was done for me – whether I was ready for it or not.

All in all, it was not a smooth upgrade. But, I’m sure there are a lot worse things that could have happened. And, hey, at least now 10.x has vim!

1
Mar/10
2

The future of networking in hypervisors – not so bright

TechOps Guy: Nate

UPDATED Some networking companies see that they are losing control of the data center networks when it comes to blades and virtualization. One has reacted by making their own blades, others have come up with strategies and are collaborating on standards to try to take back the network by moving the traffic back into the switching gear. Yet another has licensed their OS to have another company make blade switches on their behalf.

Where at least part of the industry wants to go is move the local switching out of the hypervisor and back into the Ethernet switches. Now this makes sense for the industry, because they are losing their grip on the network when it comes to virtualization. But this is going backwards in my opinion. Several years ago we had big chassis switches with centralized switch fabrics where(I believe, kind of going out on a limb here) if port 1 on blade 1 wanted to talk to port 2, then it had to go back to the centralized fabric before port 2 would see the traffic. That’s a lot of distance to travel. Fast forward a few years and now almost every vendor is advertising local switching. Which eliminates this trip. Makes things faster, and more scalable.

Another similar evolution in switching design was moving from backplane systems to midplane systems. I only learned about some of the specifics recently, prior to that I really had no idea what the difference was between a backplane and a midplane. But apparently the idea behind a midplane is to drive significantly higher throughput on the system by putting the switching fabric closer to the line cards. An inch here, an inch there could mean hundreds of gigabits of lost throughput or increased complexity/line noise etc in order to achieve those high throughput numbers. But again, the idea is moving the fabric closer to what needs it, in order to increase performance. You can see examples of midplane systems in blades with the HP c7000 chassis, or in switches in the Extreme Black Diamond 20808 (page 7). Both of them have things that plug into both the front and the back. I thought that was mainly due to space constraints on the front, but it turns out it seems more about minimizing the distance of connectivity between the fabric on the back and the thing using the fabric on the front. Also note that the fabric modules on the rear are horizontal while the blades on the front are vertical, I think this allows the modules to further reduce the physical distance between the fabric and the device at the other end by directly covering more slots, less distance to travel on the midplane.

Moving the switching out of the hypervisor, if VM #1 wants to talk to VM #2, having that go outside of the server and make a U-turn and come right back into it is stupid. Really stupid. It’s the industry grasping at straws trying to maintain control when they should be innovating. It goes against the two evolutions in switching designs I outlined above.

What I’ve been wanting to see myself is to integrate the switch into the server. Have an X GbE chip that has the switching fabric built into it. Most modern network operating systems are pretty modular and portable (a lot of them seem to be based on Linux or BSD). I say integrate it onto the blade for best performance, maybe use the distributed switch framework (or come up with some other more platform independent way to improve management). The situation will only get worse in coming years, with VM servers potentially having hundreds of cores and TBs of memory at their disposal; you’re practically at the point now where you can fit an entire rack of traditional servers onto one hypervisor.

I know that for example Extreme uses Broadcom in most all of their systems, and Broadcom is what most server manufacturers use as their network adapters, even HP’s Flex10 seems to be based on Broadcom? How hard can it be for Broadcom to make such a chip(set) so that companies like Extreme (or whomever else might use Broadcom in their switches) could program it with their own stuff to make it a mini switch?

From the Broadcom press release above (2008):

To date, Broadcom is the only silicon vendor with all of the networking components (controller, switch and physical layer devices) necessary to build a complete end-to-end 10GbE data center. This complete portfolio of 10GbE network infrastructure solutions enables OEM partners to enhance their next generation servers and data centers.

Maybe what I want makes too much sense and that’s why it’s not happening, or maybe I’m just crazy.

UPDATE - I just wanted to clarify my position here, what I’m looking for is essentially to offload the layer 2 switching functionality from the hypervisor to a chip on the server itself. Whether it’s a special 10GbE adapter that has switching fabric or a dedicated add-on card which only has the switching fabric. Not interested in offloading layer 3 stuff, that can be handled upstream.  Also interested in integrating things like ACLs, sFlow, QoS, rate limiting and perhaps port mirroring.
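To make the layer 2 part concrete, here is a toy model of the kind of forwarding such a chip would do: learn which port each source MAC lives on, then forward by destination MAC, flooding only when the destination is unknown. This is a sketch for illustration only. The class and port names are made up, and it ignores VLANs, aging, ACLs and everything else a real switch chip handles.

```python
class LearningSwitch:
    """Toy layer 2 switch: learns source MACs, forwards by destination MAC."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.mac_table = {}  # MAC address -> port it was last seen on

    def handle_frame(self, src_mac, dst_mac, in_port):
        """Return the list of ports the frame should be sent out of."""
        self.mac_table[src_mac] = in_port           # learn the sender
        out_port = self.mac_table.get(dst_mac)
        if out_port is None:
            # Unknown destination: flood to every port except the ingress.
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            return []                               # already on that segment
        return [out_port]


switch = LearningSwitch(["vm1", "vm2", "uplink"])
first = switch.handle_frame("aa:aa", "bb:bb", "vm1")   # bb:bb not learned yet
reply = switch.handle_frame("bb:bb", "aa:aa", "vm2")   # aa:aa is known
later = switch.handle_frame("aa:aa", "bb:bb", "vm1")   # both are known
print(first, reply, later)
```

Once both VMs have been seen, traffic between them goes straight from one VM port to the other and never touches the uplink, which is the whole point of keeping the switching local to the server.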