TechOpsGuys.com Diggin' technology every day

March 9, 2010

Yawn..

Filed under: Networking — Tags: — Nate @ 9:42 am

I was just watching some of my daily morning dose of CNBC and they had all these headlines about how Cisco was going to make some earth-shattering announcement (“change the internet forever”), and then the announcement hit: some new CRS router that claimed 12x faster performance than the competition. So naturally I was curious. Robert Paisano on the floor of the NYSE was saying how amazing it was that the router could download the Library of Congress in 1 second (he probably didn’t understand the router would have no place to put it).

If I want a high end router that means I’m a service provider, and in that case my personal preference would be for Foundry Networks (now Brocade). Juniper makes good stuff too of course, though honestly I am not nearly as well versed in their technology. Granted, I’ll probably never work for such a company, since those companies are really big and I prefer small companies.

But in any case I wanted to illustrate (another) point. According to Cisco’s own site, their fastest single-chassis system has a mere 4.48 terabits of switching capacity. This is the CRS-3, which I don’t even see listed as a product on their site; perhaps it’s yet to come. The biggest, baddest product they have on their site right now is the 16-slot CRS-1, which according to their own site has a total switching capacity of a paltry 1.2Tbps, and even worse a per-slot capacity of 40Gbps (hello 2003).

So take a look at Foundry Networks (the Brocade name makes me shudder, I have never liked them) and their NetIron XMR series. From their documentation, the “total switching fabric” ranges from 960 gigabits on the low end to 7.68 terabits on the high end, and switch forwarding capacity ranges from 400 gigabits to 3.2 terabits. That comes out to 120 gigabits of full duplex switch fabric per slot (same across all models). While I haven’t been able to determine precisely how long the XMR has been on the market, I have found evidence that it is nearly 3 years old.

To put it in another perspective, in a 48U rack with the new CRS-3 you can get 4.48 terabits of switching fabric (one chassis is 48U). With Foundry in the same rack you can get one XMR32k and one XMR16k (combined size 47U) for a total of 11.52 terabits of switching fabric. More than double the fabric in the same space, from a product that is 3 years old. And as you can imagine, in the world of IT 3 years is a fairly significant amount of time.
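Here’s the back-of-the-envelope math behind that claim, as a quick sanity check. Note the 3.84 Tbps figure for the XMR16k is implied by the 11.52 Tbps combined total rather than stated in the range I quoted above:

# Quick sanity check on the rack-density comparison above.
crs3_fabric_tbps = 4.48          # CRS-3, one 48U chassis
xmr32k_fabric_tbps = 7.68        # NetIron XMR high end
xmr16k_fabric_tbps = 3.84        # implied: 11.52 - 7.68

foundry_rack_tbps = xmr32k_fabric_tbps + xmr16k_fabric_tbps   # 11.52 Tbps in 47U
print(foundry_rack_tbps / crs3_fabric_tbps)                   # ~2.57x the fabric per rack

# Per-slot check: 120 Gbps full duplex = 240 Gbps raw per slot
print(xmr32k_fabric_tbps * 1000 / 32)                         # 240.0 Gbps per slot on 32 slots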

And while I’m here talking about Foundry and Brocade, take a look at this from Brocade; it’s funny, it’s like something I would write. It compares the Brocade Director switches vs Cisco (“Numbers don’t lie”). One of my favorite quotes:

To ensure accuracy, Brocade hired an independent electrician to test both the Brocade 48000 and the Cisco MDS 9513 and found that the 120 port Cisco configuration actually draws 1347 watts, 45% higher than Cisco’s claim of 931 watts. In fact, an empty 9513 draws more electrical current (5.6 amps) than a fully-populated 384 port Brocade 48000 (5.2 amps). Below is Brocade’s test data. Where are Cisco’s verified results?

Another:

With 33% more bandwidth per slot (64Gb vs 48Gb), three times as much overall bandwidth (1.5Tb vs 0.5 Tb) and a third the power draw, the Brocade 48000 is a more scalable building block, regardless of the scale, functionality or lifetime of the fabric. Holistically or not, Brocade can match the “advanced functionality” that Cisco claims, all while using far less power and for a much [?? I think whoever wrote it was in a hurry]

That’s just too funny.
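Incidentally, the arithmetic in those quoted claims does check out, using only the figures given above:

# Verifying the figures quoted from Brocade's comparison.
measured_watts, claimed_watts = 1347, 931
print((measured_watts - claimed_watts) / claimed_watts)   # ~0.447, i.e. ~45% higher than claimed

per_slot_brocade, per_slot_cisco = 64, 48                 # Gb per slot
print(per_slot_brocade / per_slot_cisco - 1)              # ~0.33, i.e. 33% more bandwidth per slot

total_brocade, total_cisco = 1.5, 0.5                     # Tb overall bandwidth
print(total_brocade / total_cisco)                        # 3x overall bandwidth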

March 4, 2010

Dell/Denali Servers/Storage luncheon March 25th

Filed under: Events,Storage — Nate @ 11:06 am

Been a while since I posted an event, but if you’re looking for new servers/storage for your Exchange setup this event may be a good excuse to get away from work for a while.

Choose the Right Storage Solution for your Microsoft Exchange Environment
Thursday March 25th, 2010
11:30am – 1:30pm
El Gaucho
City Center Plaza
450 108th Ave NE
Bellevue, WA 98004

Join us for a complimentary technical seminar and learn how the Dell EqualLogic PS Series storage solution and Microsoft Exchange, deployed on Dell PowerEdge servers can deliver[..]

Myself, I don’t expect to learn anything, and 3PAR storage can run Exchange for a large number of users (from these numbers you could extrapolate a max of 192,000 mailboxes on a single storage system, each with a heavy I/O profile), so I’m not really in the market for some EqualLogic storage. BUT I like to get away, especially if it’s local. I do find it curious that the event is specifically about Exchange, that is, the mindset of dedicating storage to a particular application, when the industry trend seems to be leaning towards storage that is shared amongst many applications. Given that Microsoft doesn’t appear to be an event sponsor, the framing seems even more curious.

Thought this was interesting as well: Microsoft recommends RAID 1 for Exchange, but (from one of the links above)..

Internal tests performed by 3PAR show that using RAID 5 (7+1)—i.e., seven data blocks per parity block—demonstrated that the same simulated Exchange workload used for Exchange 2007 ESRP testing had disk latencies that were higher than RAID 1 but well within Microsoft’s recommendations[..]

Going from RAID 1+0 to RAID 5+0 (7+1) is a pretty dramatic shift, showing how fast their “Fast” RAID is, and of course if you find out you laid data out incorrectly you can fix it on the fly. I wonder what Dell will say about their stuff.
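To put that shift in perspective, here’s the capacity math (nothing 3PAR-specific, just the RAID overhead):

# Usable capacity fraction of each layout.
raid10_usable = 1 / 2        # mirrored: half the raw capacity is usable
raid5_7p1_usable = 7 / 8     # seven data blocks per parity block

print(raid10_usable, raid5_7p1_usable)      # 0.5 vs 0.875

# Raw disks needed for the same usable capacity:
print(raid5_7p1_usable / raid10_usable)     # RAID 1+0 needs 1.75x the raw disks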

March 2, 2010

Avere front ending Isilon

Filed under: Storage — Tags: , , , — Nate @ 1:21 pm

UPDATED

How do all these cool people find our blog? A friendly fellow from Isilon commented that the article from The Register apparently isn’t accurate: Avere is front ending NetApp gear, not Isilon. But in any case I have been thinking about Avere and the Symantec stuff off and on recently anyways.. END UPDATE

A really interesting article over at The Register about how Sony has deployed Avere clusters to front end their Isilon (and perhaps other) gear. A good quote:

The thing that grabs your attention here is that Avere is being used to accelerate some of the best scale-out NAS on the planet, not bog standard filers with limited scalability.

Avere certainly has some good performance metrics (pay attention to the IOPS per physical disk), and more recently they introduced a model that runs on top of SSDs; I haven’t seen any performance results for it yet but I’m sure it’s a significant boost. As The Register mentions in their article, if this technology really is good enough for this purpose it has the potential (of course) to be extremely disruptive in the industry, wreaking havoc with many of the remaining (and very quickly dwindling) smaller scale-out NAS vendors. Kind of funny, really, seeing how Isilon spun the news.

From Avere’s site, in talking about comparing Spec SFS results:

A comparison of these results and the number of disks required shows that Avere used dramatically fewer disks. BlueArc used 292 disks to achieve 146,076 ops/sec with 3.34 ms ORT. Exanet used 592 disks to achieve 119,550 ops/sec with 2.07ms ORT (overall response time). HP used 584 disks to achieve 134,689 ops/sec and 2.53 ms ORT. Huawei Symantec used 960 disks to achieve 176,728 ops/sec with 1.67ms ORT. NetApp used 324 disks to achieve 120,011 ops/sec with 1.95ms ORT. By contrast, Avere used only 79 drives to achieve 131,591 ops/sec with 1.38ms ORT. Doing a little math, Avere achieves 3.3, 8.2, 7.2, 9.0, and 4.5 times more ops/sec per disk used than the other vendors.
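Those ratios are easy to reproduce from the numbers in the quote:

# Reproducing the ops/sec-per-disk ratios from the SPEC SFS figures quoted above.
results = {
    "BlueArc":          (146076, 292),
    "Exanet":           (119550, 592),
    "HP":               (134689, 584),
    "Huawei Symantec":  (176728, 960),
    "NetApp":           (120011, 324),
}
avere_ops_per_disk = 131591 / 79    # ~1665 ops/sec per disk

for vendor, (ops, disks) in results.items():
    ratio = avere_ops_per_disk / (ops / disks)
    print(f"{vendor}: {ratio:.1f}x")   # 3.3, 8.2, 7.2, 9.0, 4.5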

All of which got me thinking again: Symantec last year released their FileStore product, and my friends over at 3PAR were asking me if I was interested in it. To date I have not been, because the only performance numbers released so far have not looked very efficient. And it’s still a new product, so who knows how well it works in the real world, though Symantec does have a history with file systems through the Veritas File System (VxFS).

Unfortunately there isn’t much technical info on the FileStore product on their web site.

Built to run on commodity servers and most storage arrays, FileStore is an incredibly simple-to-install soft appliance. This combination of low-cost hardware, “pay as you grow” scalability and easy administration give FileStore a significant cost advantage over specialized appliances. With support for both SAN and iSCSI storage, FileStore delivers the performance needed for the most demanding applications.

It claims N-way active-active or active-passive clustering, up to 16 nodes in a cluster, up to 2PB of storage, and 200 million files per file system, which for most people is more than enough. I don’t know how it is licensed, though, or how well it scales on a single node; could it run on the aforementioned 48-all-round system?

Where does 3PAR fit into this? Well, Symantec was the first company (so far the only one that I know of) to integrate Thin Reclamation into their file system, and it integrates really well with 3PAR arrays at least. When files are deleted, the file system passes a SCSI command back to the array so that the reclamation I/O never hits the spindles; the array transparently re-maps the blocks to make them available for use again.

3PAR Thin Reclamation for Veritas Storage Foundation keeps storage volumes thin over time by allowing granular, automated, non-disruptive space reclamation within the InServ array. This is accomplished by communicating deleted block information to the InServ using the Thin Reclamation API. Upon receiving this information, the InServ autonomically frees this allocated but unused storage space. The thin reclamation capabilities provide environments using Veritas Storage Foundation by Symantec an easy way to keep their thin volumes thin over time, especially in situations where a large number of writes and deletes occur.
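Conceptually it works something like the toy model below. This is just my sketch of the idea, not 3PAR’s or Symantec’s actual API:

# Toy model of thin reclamation: the file system reports freed blocks to the
# array, and the array returns the backing space to its free pool without any
# disk I/O. Purely illustrative.

class ThinArray:
    def __init__(self, free_chunklets):
        self.free = set(range(free_chunklets))   # unallocated backing space
        self.mapping = {}                         # volume block -> chunklet

    def write(self, block):
        # Allocate backing space on first write (thin provisioning).
        if block not in self.mapping:
            self.mapping[block] = self.free.pop()

    def reclaim(self, blocks):
        # Called when the file system says these blocks are no longer in use.
        for block in blocks:
            chunklet = self.mapping.pop(block, None)
            if chunklet is not None:
                self.free.add(chunklet)           # space freed, no disk writes issued

array = ThinArray(free_chunklets=1000)
for b in range(100):
    array.write(b)          # file system writes some files
array.reclaim(range(50))    # files deleted; half the space goes back to the pool
print(len(array.free))      # 950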

But I was thinking that you could front end one of these FileStore clusters with an Avere cluster and get some pretty flexible, high performing storage.

Something I’d like to explore myself at some point.

March 1, 2010

The future of networking in hypervisors – not so bright

Filed under: Networking,Virtualization — Nate @ 10:15 pm

UPDATED Some networking companies see that they are losing control of data center networks when it comes to blades and virtualization. One has reacted by making its own blades, others have come up with strategies and collaborated on standards to try to take back the network by moving the traffic back into the switching gear. Yet another has licensed its OS to have another company make blade switches on its behalf.

Where at least part of the industry wants to go is to move the local switching out of the hypervisor and back into the Ethernet switches. Now this makes sense for the industry, because they are losing their grip on the network when it comes to virtualization. But in my opinion this is going backwards. Several years ago we had big chassis switches with centralized switch fabrics where (I believe, kind of going out on a limb here) if port 1 on a blade wanted to talk to port 2 on the same blade, the traffic had to go back to the centralized fabric before port 2 would see it. That’s a lot of distance to travel. Fast forward a few years and now almost every vendor is advertising local switching, which eliminates this trip and makes things faster and more scalable.

Another similar evolution in switching design was moving from backplane systems to midplane systems. I only learned about some of the specifics recently; prior to that I really had no idea what the difference was between a backplane and a midplane. But apparently the idea behind a midplane is to drive significantly higher throughput by putting the switching fabric closer to the line cards. An inch here, an inch there could mean hundreds of gigabits of lost throughput, or increased complexity and line noise, in order to achieve those high throughput numbers. But again, the idea is moving the fabric closer to what needs it, in order to increase performance. You can see examples of midplane systems in blades with the HP c7000 chassis, or in switches with the Extreme Black Diamond 20808 (page 7). Both of them have things that plug into both the front and the back. I thought that was mainly due to space constraints on the front, but it turns out it’s more about minimizing the distance between the fabric on the back and the thing using the fabric on the front. Also note that the fabric modules on the rear are horizontal while the blades on the front are vertical; I think this allows the modules to further reduce the physical distance between the fabric and the device at the other end by directly covering more slots, so there is less distance to travel on the midplane.

If you move the switching out of the hypervisor, then when VM #1 wants to talk to VM #2 on the same host, the traffic has to go outside the server, make a U-turn, and come right back in. That’s stupid. Really stupid. It’s the industry grasping at straws trying to maintain control when they should be innovating, and it goes against the two evolutions in switching design I outlined above.
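A crude way to see the cost is to count how many times intra-host VM-to-VM traffic crosses the server’s uplink under each model. The numbers below are hypothetical, just to show the shape of it:

# Toy comparison: hypervisor-local switching vs. hairpinning through the
# external switch. All figures are made up for illustration.

UPLINK_GBPS = 10            # assumed server uplink speed

def uplink_load(intra_host_gbps, hairpin):
    # With hairpinning, every intra-host VM-to-VM byte crosses the uplink
    # twice (out to the switch and back in). With local switching it never
    # leaves the server.
    return intra_host_gbps * 2 if hairpin else 0

vm_to_vm_traffic = 4        # Gbps of VM-to-VM traffic staying on the same host

print(uplink_load(vm_to_vm_traffic, hairpin=False))        # 0 Gbps of uplink consumed
consumed = uplink_load(vm_to_vm_traffic, hairpin=True)
print(consumed, consumed / UPLINK_GBPS)                    # 8 Gbps, i.e. 80% of the uplink gone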

What I’ve been wanting to see myself is the switch integrated into the server: an X GbE chip that has the switching fabric built into it. Most modern network operating systems are pretty modular and portable (a lot of them seem to be based on Linux or BSD). I say integrate it onto the blade for best performance, and maybe use the distributed switch framework (or come up with some other, more platform-independent way to improve management). The situation will only get worse in the coming years; with VM servers potentially having hundreds of cores and TBs of memory at their disposal, you’re practically at the point now where you can fit an entire rack of traditional servers onto one hypervisor.

I know that Extreme, for example, uses Broadcom in almost all of their systems, and Broadcom is what most server manufacturers use for their network adapters; even HP’s Flex10 seems to be based on Broadcom. How hard can it be for Broadcom to make such a chip(set) so that companies like Extreme (or whoever else might use Broadcom in their switches) could program it with their own stuff and turn it into a mini switch?

From the Broadcom press release above (2008):

To date, Broadcom is the only silicon vendor with all of the networking components (controller, switch and physical layer devices) necessary to build a complete end-to-end 10GbE data center. This complete portfolio of 10GbE network infrastructure solutions enables OEM partners to enhance their next generation servers and data centers.

Maybe what I want makes too much sense and that’s why it’s not happening, or maybe I’m just crazy.

UPDATE – I just wanted to clarify my position here: what I’m looking for is essentially to offload the layer 2 switching functionality from the hypervisor to a chip on the server itself, whether that’s a special 10GbE adapter that has switching fabric or a dedicated add-on card that only has the switching fabric. I’m not interested in offloading layer 3 stuff; that can be handled upstream. I am also interested in integrating things like ACLs, sFlow, QoS, rate limiting and perhaps port mirroring.
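To make that concrete, the layer 2 piece I’d want offloaded boils down to something like the toy model below (MAC learning, local forwarding and a basic ACL check). Obviously a real implementation would live in silicon, not Python; this is purely illustrative:

# Toy model of the layer 2 functionality I'd want offloaded to the NIC:
# MAC learning, local forwarding, and a simple ACL check.

class MiniSwitch:
    def __init__(self, uplink_port, blocked_macs=None):
        self.mac_table = {}                       # MAC -> local port (learned)
        self.uplink = uplink_port                 # anything unknown goes upstream
        self.blocked = set(blocked_macs or [])    # trivial ACL: drop by source MAC

    def forward(self, src_mac, dst_mac, in_port):
        if src_mac in self.blocked:
            return None                           # ACL drop
        self.mac_table[src_mac] = in_port         # learn where the source lives
        # A known local destination stays inside the server; otherwise use the uplink.
        return self.mac_table.get(dst_mac, self.uplink)

sw = MiniSwitch(uplink_port="uplink", blocked_macs=["de:ad:be:ef:00:01"])
sw.forward("aa:aa:aa:aa:aa:01", "bb:bb:bb:bb:bb:02", in_port="vm1")  # unknown -> uplink
sw.forward("bb:bb:bb:bb:bb:02", "aa:aa:aa:aa:aa:01", in_port="vm2")  # known -> vm1
print(sw.forward("aa:aa:aa:aa:aa:01", "bb:bb:bb:bb:bb:02", "vm1"))   # vm2, stays local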

ProCurve: Not my favorite

Filed under: Networking,Virtualization — Nate @ 10:06 pm

I gotta find something new to talk about, after this..

I was thinking this evening about the UCS/HP network shootout post I put up over the weekend, and thought maybe I came across too strong in favor of HP’s networking gear.

As all three of you know, HP is not my favorite networking vendor. Not even my second favorite, or even my third.

But they do have some cool technology with this Virtual Connect stuff. I only wish blade interfaces were more standardized.
