TechOpsGuys.com Diggin' technology every day

22 Jun 2012

NetApp Cluster SPC-1

Sorry for the off topic posts recently - here is something a little more on topic.

I don't write about NetApp much, mainly because I believe they have some pretty decent technology - they aren't a Pillar or an EqualLogic - though sometimes I poke fun. BTW, did you hear about that senior Oracle guy that got canned recently and the comments he made about Sun? Oh my, was that funny. I can only imagine what he thought of Pillar. Then there are the folks saying Oracle is heavily discounting software so they can sell hardware at list price, thus propping up the revenues - the net result is that Oracle software folks hate Sun. Not a good situation to be in. I don't know why Oracle couldn't just have been happy owning BEA's JRockit JVM and let Sun wither away.

Anyways...

NetApp tried to make some big news recently when they released their newest OS, ONTAP 8.1.1. For such a minor version-number change (8.1 -> 8.1.1) they sure did try to raise a big fuss about it. Shortly after 8.1 came out I came across a NetApp guy's blog that was touting the release quite heavily. I was interested in some of the finer points and tried to ask some fair technical questions - I like to know the details. Despite being a 3PAR person I tried really hard to be polite and balanced, and the blogger was very thoughtful, informed and responsive, and gave a great reply to my questions.

Anyways, I'm still sort of unclear on what is really new in 8.1.1 vs 8.1 - it sounds to me like some minor changes on the technical side with some new marketing slapped on top. I think the new hybrid aggregates are perhaps specifically new to 8.1.1 (along with some new form of ONTAP that can run in a VM for small sites). Maybe 8.1 by itself didn't make a big enough splash, or maybe 8.1.1 is what 8.1 was supposed to be (I think I saw someone mention that 8.1 was perhaps more of a release candidate). The SpecSFS results posted by NetApp for their clusters are certainly pretty impressive from a raw performance standpoint - they illustrate excellent scalability up to 24 nodes.

But the SpecSFS results don't tell the whole story - partly because things like cost are not disclosed in the results, and partly because they don't illustrate the main weakness of the system: it's not a single file system, and it's not automatically balanced from either a load or a space perspective.

But I won't harp on that much - this post is about their recent SPC-1 results, which I just stumbled upon. These are the first real SPC-1 results NetApp has posted in almost four years - you sort of have to wonder what took them so long. They did release some SPC-1E results a while back, but those are purely targeted at energy measurements. For me at least, energy usage is probably not even in the top 5 things I look for when I want some new storage. The only time I really care about energy usage is if the installation is really, really small - I mean the whole site being less than one rack. Energy efficiency is nice, but there are a lot of things higher on my priority list.

This SPC-1 result from them is built on a 6-node cluster with 3TB of flash cache and 288GB of data cache spread across the controllers, and only 432 disks - 144 x 450GB per pair of controllers, protected with RAID DP. The cost given is $1.67M for the setup. They say it is list pricing - not being a customer of theirs I'm not sure if it's apples to apples compared to other setups, since some folks show discounted pricing and some show list. I would think it would heavily benefit the tester to illustrate the typical price a customer would pay for the configuration.

  • 250,039 IOPS  @ 3.35ms latency  ($6.69 per SPC-1 IOP)
  • 69.8TB Usable capacity ($23,947 per Usable TB)
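To put the spindle count in perspective, here's a quick back-of-the-envelope sketch using only the figures quoted above (the "typical 15K spindle" baseline is a general rule of thumb, not from the disclosure):

```python
# Back-of-the-envelope math on the published NetApp SPC-1 figures.
# Inputs are the numbers quoted above; nothing here comes from the
# full disclosure report itself.

SPC1_IOPS = 250_039     # reported SPC-1 IOPS
DISKS = 432             # total drives (144 x 450GB per controller pair)
CONTROLLER_PAIRS = 3    # 6-node cluster

iops_per_disk = SPC1_IOPS / DISKS
iops_per_pair = SPC1_IOPS / CONTROLLER_PAIRS

print(f"{iops_per_disk:.0f} IOPS per disk")           # ~579
print(f"{iops_per_pair:,.0f} IOPS per controller pair")
```

A 15K RPM drive is typically good for somewhere around 200 random IOPS on its own, so ~579 IOPS per disk gives a rough sense of how much work the flash cache (and the write layout) is offloading from the spindles.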

Certainly a very respectable I/O number and really amazing latency - I think this is the first real SPC-1 result that is flash accelerated (as opposed to being entirely flash).

What got me thinking, though, was the utilization. I ragged on what could probably be considered a tier 3 or 4 storage company a while back for just inching by the SPC-1 minimum efficiency requirements: the maximum unused storage cannot exceed 45%, and that company was at 44.77%.

Where's NetApp with this? Honestly higher than I thought, especially considering RAID DP - they are at 43.20% unused storage. I mean really - would it not make more sense to simply use RAID 10 and get the extra performance? I understand that NetApp doesn't support RAID 10, but it just seems a crying shame to have such low utilization of the spindles. I really would have expected the flash cache to allow them to drive utilization up. But I suppose they decided to inch out more performance at the cost of usable capacity. I'd honestly be fascinated to see results if they drove the unused storage ratio down to, say, 20%.
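For reference, the unused storage ratio is simply unused capacity over physical capacity, with the 45% ceiling mentioned above. A minimal sketch - the capacity figures below are hypothetical round numbers chosen to land on NetApp's 43.20%, not values from the disclosure:

```python
# SPC-1 caps the unused storage ratio at 45% of physical capacity.
# Hypothetical capacities for illustration only.

MAX_UNUSED_RATIO = 0.45

def unused_storage_ratio(physical_tb: float, used_tb: float) -> float:
    """Fraction of physical capacity not accounted for as addressable,
    protection, or other used storage."""
    return (physical_tb - used_tb) / physical_tb

ratio = unused_storage_ratio(physical_tb=100.0, used_tb=56.8)
assert ratio <= MAX_UNUSED_RATIO, "config would fail SPC-1 submission rules"
print(f"unused storage ratio: {ratio:.2%}")  # 43.20%
```

Note this is SPC's own capacity accounting, which counts parity/protection space as used - it's not the same thing as usable-versus-raw.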

The flash cache certainly does a nice job of accelerating reads and letting the spindles run more writes as a result. Chuck over at EMC wrote an interesting post where he picked apart the latest NetApp release. What I found interesting from an outsider's perspective is how much of this new NetApp technology feels bolted on rather than integrated. They seem unable to adapt the core of their OS to this (now old) scale-out Spinnaker stuff, even after all the years that have elapsed. From a high-level perspective the new announcements really do sound pretty cool, but once I got to know more about what's on the inside, I became less enthusiastic. There's some really neat stuff there, but at the same time some pretty dreadful shortcomings still remain in the system (see the NetApp blog posts above for info).

The plus side, though, is that at least parts of NetApp are becoming more up front about where they target their technology. Some of the posts I have seen recently, both in comments on The Register and on the NetApp blog above, have been really excellent. They are honest in acknowledging that NetApp can't be everything to everyone and can't be the best in all markets - there isn't one storage design to rule them all. As EMC's Chuck said: compromise. All storage systems have some degree of compromise in them; NetApp always seems to have compromised less on features and more on design. That honesty is nice to see from a company like them.

I met with a system engineer of theirs about a year ago now when I had a bunch of questions to ask and I was tired of getting pitched nothing but dedupe. This guy from NetApp came out and we had a great talk for what must've been 90 minutes. Not once was the word dedupe used and I learned a whole lot more about the innards of the platform. It was the first honest discussion I had had with a NetApp rep in all the years I had dealt with them off and on.

At the end of the day I still wasn't interested in using the storage, but felt that hey - if some day I really feel a need to combine the best storage hardware with what many argue is the best storage software (management headaches aside, e.g. no true active-active, automagically load-balanced clustering), I can - just go buy a V-series and slap it in front of a 3PAR. I did it once before (really only because there was no other option at the time), and I could do it again. I don't plan to (at least not at the company I'm at now), but the option is there. Just as long as I don't have to deal with the NetApp team in the Northwest and their dirty, underhanded, threatening tactics. I'm in the Bay Area now, so that shouldn't be hard. The one surprising thing I heard from the reps here is that they still can't do evaluations, which just seems strange to me. The guy told me that if a deal hinged on an evaluation he wouldn't know what to do.

3PAR of course has no such flash cache technology shipping today - something I've brought up with the head of HP storage before. I have been wanting them to release something like it for some time now (more specifically, something like EMC's FAST Cache - EMC has made some really interesting investments in flash over recent years, but as with NetApp, at least for me the other compromises involved in using an EMC platform don't make me want to use it over a 3PAR, even though they have this flash technology). I am going to be visiting 3PAR HQ soon and I'm sure I will learn a lot of cool things that I won't be able to talk about for some time to come.

TechOps Guy: Nate

Comments (2)
  1. Hi Nate, Dimitris from NetApp here.

    A couple of things…

    1. Regarding the SPC-1E publication with the NetApp 3270 system (article here: http://bit.ly/hs4GMt) – the SPC-1E benchmark is the EXACT SAME workload as the SPC-1 benchmark, with the addition of energy calculations. That is all. The 3270 system did well over 500 IOPS/disk in that benchmark at an 84% full utilization (used vs usable). That is far beyond RAID10 performance at much higher usable space. That was the first SPC-1 workload that was flash-accelerated (aside from what WAFL can do for writes).

    2. For an analysis of the current result check here (incl. the comments): http://bit.ly/Mp4uu0

    Again, well over 500 IOPS/drive and with extremely low latency. RAID10 (from anyone’s box) can’t do this stuff, Nate… at any utilization number, even with short-stroking. So there’s no point for NetApp to be using R10. RAID-DP (and RAID6) have better reliability than R10 anyway.

    As to why the price is list… it’s because we always show list pricing. In my analysis I adjusted the table so you can compare list vs list. Let’s just say our discounts are no worse than the competition’s.

    As with the older 3270 result, WAFL can do over 500 IOPS/disk in SPC-1 at up to 85% full. I wish we’d filled it up more since, so far, that’s the only negative thing the detractors have to say (even though most of the other vendors have similar utilization numbers and R10, but anyway – nobody seems to be focusing on the fact that we utterly crushed the rest in the important IOPS/latency battle).

    Regarding evaluations – NetApp will do them, at least in my area (IL). More frequently it will be a Right of Return – meaning that, if the gear fulfills your pre-defined test criteria, you buy it. If you just want to test functionality, we have a simulator…

    Finally – there is no perfect system. No matter what you buy you have to give up something. Does the NetApp design mean some things are not as “nice” as on certain other systems? Maybe. But if I can win the battle on 90% of the features an enterprise will need, and my competition only has 40%, I’m still ahead.

    I beat performance-wise boxes that have cooler-sounding architectures all the time. I’ve beat VMAX and VSP for SQL at a PoC using the customer’s own benchmark. Dynamic active-active matrices sound great but if I can deliver more oomph with my design, and the DB is working 40% faster with WAFL, who cares in the end…

    Some advice: be careful of being a “3Par person” as you say. Be a technologist instead. Storage arrays are just tools. Some tools are better for some very specialized jobs than others. Certain other tools have a lot more uses.

    Either way, the idea is to get the tools that help your company be more productive and flexible.

    I’ve worked at companies that pretty much HAD to buy gear from certain vendors.

    That was no fun.

    It’s storage, not religion.

    Thx

    D

  2. Per 1 – yeah I can see your point – I just viewed the configuration as a little too small and wanted to see something bigger.

    I have been historically biased against NetApp, primarily because my experiences with the people at the company, and with the products in general, haven’t been positive. Though the folks I dealt with last year were quite a bit better. My last NetApp box was a 3170 V-series and it was just frustrating to manage/balance compared to the Exanet it was trying to replace. In the end the company tossed out the 3170 because NetApp threatened them (again, terrible sales folks).

    I do view myself as a technologist, I write about a wide variety of things here from networking, to storage to servers to virtualization to cloud to whatever. I can’t get exposure to every technology on the planet but I write about what I am exposed to.

    I’d get bored fast if I stuck with just storage :)

    thanks for the comment!

