TechOpsGuys.com Diggin' technology every day

August 27, 2010

What should HP do with 3PAR

Filed under: Storage — Tags: — Nate @ 7:36 am

Assuming HP gets them, which I am optimistic will occur, this is what I think HP should do:

* Phase out the current USP-based XP line with the 800-series of 3PAR systems, currently the T800
* Phase out the EVA Cluster with the enterprise 400-series of 3PAR systems, currently the T400
* Phase out the EVA 6400 and 8400 with the mid-range 400-series of 3PAR systems, currently the F400
* Phase out the 3PAR F200 and replace it with the EVA 4400-series

I’m sure this is all pretty obvious but it gives me something to write about 🙂

Why the changes to the EVA offerings and dropping of the F200 from 3PAR? To me, it all comes down to 3PAR’s quad-node (and larger) architecture and the ability to offer Persistent Cache.

3PAR Persistent Cache is a resiliency feature designed to gracefully handle component failures by eliminating the substantial performance penalties associated with “write-through” mode. Supported on all quad-node and larger InServ arrays, Persistent Cache leverages the InServ’s unique Mesh-Active design to preserve write-caching by rapidly re-mirroring cache to the other nodes in the cluster in the event of a controller node failure.

(click on images for larger version)

Persistent Cache allows service providers to operate at higher levels of utilization because they know they can maintain high performance even when a controller fails (or if two or three controllers fail in a 6- or 8-node T800, as long as they are the right nodes!). One of my former employers has a bunch of NetApp gear, and I’m told they run it pretty much entirely active/passive so as to protect performance in the event a controller fails. I’m sure that is a fairly common setup.

This is also useful during software upgrades, where the controllers have to be rebooted, or hardware upgrades (adding more FC ports or whatever).
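
To illustrate the basic idea (this is only a toy model under my own assumptions, not 3PAR's actual algorithm): with only two controllers, losing one leaves no surviving peer to mirror dirty write cache to, so the survivor has to drop into write-through mode; with four or more nodes, the cache can be re-mirrored to another surviving node and write-back caching stays on.

```python
# Toy model of why Persistent Cache needs a quad-node (or larger) array.
# Purely illustrative -- not 3PAR's actual implementation -- and it
# ignores which specific nodes fail ("as long as they are the right
# nodes", as noted above).

def cache_mode_after_failures(total_nodes, failed_nodes):
    """Dirty write cache must be mirrored on two live nodes; if fewer
    than two survive, the array has to fall back to write-through."""
    survivors = total_nodes - failed_nodes
    return "write-back (cache re-mirrored)" if survivors >= 2 else "write-through"

for total, failed in [(2, 1), (4, 1), (8, 2)]:
    print(f"{total}-node array, {failed} failed -> {cache_mode_after_failures(total, failed)}")
```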

Another reason is the ease of use around configuring multi site replication, and the ability to do synchronous long distance replication on the mid range systems.

3PAR® is the first storage vendor to offer autonomic disaster recovery (DR) configuration that enables you to set up and test your entire DR environment—including multi-site, multi-mode replication using both mid-range and high-end arrays—in just minutes.

[..]

Synchronous Long Distance replication combines the best of both worlds by offering the data integrity of synchronous mode disaster recovery and the extended distances (including cross-continental reach) possible with asynchronous replication. Remote Copy makes all of this possible without the complexity or professional services required by the monolithic vendors that offer multi-target disaster recovery products, and at half the cost or less.

I can understand why 3PAR came up with the F200: it is a bit cheaper, and the only difference is the chassis the nodes go in; the nodes themselves, and everything else, are the same. So to me it’s a no-brainer to spend the extra (what, 10-15%?) up front and get the capability to go to four controllers even if you don’t need it yet. It takes an extra 4U of rack space. If you really want to be cheap, go with the small two-node EVA.

I find it kind of funny that on the main page for the EVA, the EVA-4000’s “Ideal for” blurb is blank.


August 26, 2010

Thank you HP

Filed under: News,Storage — Tags: — Nate @ 4:05 pm

That’s more like it. HP knows what they are doing; they just boosted their offer to $27/share for 3PAR.

SEATTLE (AP) — Hewlett-Packard Co. has again raised its bid for 3Par Inc. above an offer from rival Dell Inc., suggesting that the little-known data-storage maker could be worth more with one of the PC companies’ marketing muscle behind it.

The latest offer from HP for $27 per share in cash, or about $1.69 billion, is nearly three times what 3Par had been trading at before Dell made the first bid last week.

Bring it on. Did I ever mention 3PAR went IPO on my birthday? Coincidence, yeah I know, but maybe it was a sign. If I recall right they were supposed to IPO one day earlier but something delayed it by a day. I never did buy any stock (as I mentioned before, I don’t buy stocks or bonds).

This is a joke, right?

Filed under: News,Storage — Tags: — Nate @ 7:55 am


So today, right after the jobless claims were released, Dell came out and increased their bid for 3PAR to $24.30, thirty cents above HP’s offer, which itself was $6.00 above Dell’s original offer.

Even now, hours later, I can’t help but laugh. This is a good example of what kind of company Dell is. Why are they wasting everyone’s time with a mere 1% increase in their bid?

A survey recently done by Reuters came up with an estimated $29 final price for 3PAR.

[..] That’s why some analysts say traditional metrics aren’t sufficient in assessing the value of 3PAR — a small company with unique technology that could grow exponentially with the massive salesforces of either Dell or HP.

This morning on Squawk Box folks were saying the next step is for HP to bid up again and get 3PAR to eliminate the price matching clause with Dell to level the playing field.

I keep seeing people ask who needs 3PAR more. I think it’s clear Dell needs them more; Dell has nothing right now. But I’m sure Dell would do a lot to screw up the 3PAR technology over time, so HP is the better fit: a more innovative company with more market leadership and, of course, a lot more resources from pretty much every angle.

August 24, 2010

EMC and IBM’s Thick chunks for automagic storage tiering

Filed under: Storage,Virtualization — Tags: , , , , — Nate @ 12:59 pm

If you recall not long ago IBM released some SPC-1 numbers with their automagic storage tiering technology Easy Tier. It was noted that they are using 1GB blocks of data to move between the tiers. To me that seemed like a lot.

Well EMC announced the availability of FAST v2 (aka sub volume automagic storage tiering) and they too are using 1GB blocks of data to move between tiers according to our friends at The Register.

Still seems like a lot. I was pretty happy when 3PAR said they use 128MB blocks, which is half the size of their chunklets. When I first heard of this sub-LUN tiering I thought to myself that you might want a block size as small as, I don’t know, 8-16MB. At the time 128MB still seemed kind of big (that was before I had learned of IBM’s 1GB size).

Just think of how much time it takes to read 1GB of data off a SATA disk (since the big target for automagic storage tiering seems to be SATA + SSD).
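
As a back-of-the-envelope illustration (the throughput figures below are rough assumptions of mine, not vendor specs), here is what moving a single tiering chunk off a SATA drive might look like at different chunk sizes:

```python
# Rough estimate of how long it takes to migrate one tiering "chunk"
# off a SATA drive.  Throughput numbers are assumptions for the sketch,
# not measured or vendor-published figures.

CHUNK_SIZES_MB = {"1GB (EMC/IBM)": 1024, "128MB (3PAR)": 128, "16MB": 16}

# A 7200 RPM SATA drive might do ~100 MB/s streaming, but a drive that
# is also serving random production I/O may only spare a fraction of that.
for label, effective_mbps in (("idle, streaming", 100), ("busy, ~10% spare", 10)):
    print(f"--- drive {label} ---")
    for name, size_mb in CHUNK_SIZES_MB.items():
        print(f"  {name:14s}: ~{size_mb / effective_mbps:6.1f} seconds per chunk")
```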

Anyone know what size Compellent uses for automagic storage tiering?

August 23, 2010

HP FlexFabric module launched

Filed under: Datacenter,Networking,Storage,Virtualization — Tags: , , , , — Nate @ 5:03 pm

While they announced it a while back, it seems the HP Virtual Connect FlexFabric module is now available for purchase for $18,500 (web price). Pretty impressive technology, sort of a mix between FCoE and combining a Fibre Channel switch and a 10Gbps Flex-10 switch into one. The module has two ports on it that can (apparently) uplink directly to 2/4/8Gbps Fibre Channel. I haven’t read too much into it yet but I assume it can uplink directly to a storage array, unlike the previous Fibre Channel Virtual Connect module which had to be connected to a switch first (due to NPIV).

HP Virtual Connect FlexFabric 10Gb/24-port Modules are the simplest, most flexible way to connect virtualized server blades to data or storage networks. VC FlexFabric modules eliminate up to 95% of network sprawl at the server edge with one device that converges traffic inside enclosures and directly connects to external LANs and SANs. Using Flex-10 technology with Fibre Channel over Ethernet and accelerated iSCSI, these modules converge traffic over high speed 10Gb connections to servers with HP FlexFabric Adapters (HP NC551i or HP NC551m Dual Port FlexFabric 10Gb Converged Network Adapters or HP NC553i 10Gb 2-port FlexFabric Converged Network Adapter). Each redundant pair of Virtual Connect FlexFabric modules provide 8 adjustable connections (six Ethernet and two Fibre Channel, or six Ethernet and two iSCSI, or eight Ethernet) to dual-port 10Gb FlexFabric Adapters. VC FlexFabric modules avoid the confusion of traditional and other converged network solutions by eliminating the need for multiple Ethernet and Fibre Channel switches, extension modules, cables and software licenses. Also, Virtual Connect wire-once connection management is built-in enabling server adds, moves and replacement in minutes instead of days or weeks.

[..]

  • 16 x 10Gb Ethernet downlinks to server blade NICs and FlexFabric Adapters
  • Each 10Gb downlink supports up to 3 FlexNICs and 1 FlexHBA or 4 FlexNICs
  • Each FlexHBA can be configured to transport either Fiber Channel over Ethernet/CEE or Accelerated iSCSI protocol.
  • Each FlexNIC and FlexHBA is recognized by the server as a PCI-e physical function device with adjustable speeds from 100Mb to 10Gb in 100Mb increments when connected to a HP NC553i 10Gb 2-port FlexFabric Converged Network Adapter or any Flex-10 NIC and from 1Gb to 10Gb in 100Mb increments when connected to a NC551i Dual Port FlexFabric 10Gb Converged Network Adapter or NC551m Dual Port FlexFabric 10Gb Converged Network Adapter
  • 4 SFP+ external uplink ports configurable as either 10Gb Ethernet or 2/4/8Gb auto-negotiating Fibre Channel connections to external LAN or SAN switches
  • 4 SFP+ external uplink ports configurable as 1/10Gb auto-negotiating Ethernet connected to external LAN switches
  • 8 x 10Gb SR, LR fiber and copper SFP+ uplink ports (4 ports also support 10Gb LRM fiber SFP+)
  • Extended list of direct attach copper cable connections supported
  • 2 x 10Gb shared internal cross connects for redundancy and stacking
  • HBA aggregation on FC configured uplink ports using ANSI T11 standards-based N_Port ID Virtualization (NPIV) technology
  • Allows up to 255 virtual machines running on the same physical server to access separate storage resources
  • Up to 128 VLANs supported per Shared Uplink Set
  • Low latency (1.2 µs Ethernet ports and 1.7 µs Enet/Fibre Channel ports) throughput provides switch-like performance.
  • Line Rate, full-duplex 240Gbps bridging fabric
  • MTU up to 9216 Bytes – Jumbo Frames
  • Configurable up to 8192 MAC addresses and 1000 IGMP groups
  • VLAN Tagging, Pass-Thru and Link Aggregation supported on all uplinks
  • Stack multiple Virtual Connect FlexFabric modules with other VC FlexFabric, VC Flex-10 or VC Ethernet Modules across up to 4 BladeSystem enclosures allowing any server Ethernet port to connect to any Ethernet uplink

Management

  • Pre-configure server I/O configurations prior to server installation for easy deployment
  • Move, add, or change server network connections on the fly without LAN and SAN administrator involvement
  • Supported by Virtual Connect Enterprise Manager (VCEM) v6.2 and higher for centralized connection and workload management for hundreds of Virtual Connect domains. Learn more at: www.hp.com/go/vcem
  • Integrated Virtual Connect Manager included with every module, providing out-of-the-box, secure HTTP and scriptable CLI interfaces for individual Virtual Connect domain configuration and management.
  • Configuration and setup consistent with VC Flex-10 and VC Fibre Channel Modules
  • Monitoring and management via industry standard SNMP v.1 and v.2 Role-based security for network and server administration with LDAP compatibility
  • Port error and Rx/Tx data statistics displayed via CLI
  • Port Mirroring on any uplink provides network troubleshooting support with Network Analyzers
  • IGMP Snooping optimizes network traffic and reduces bandwidth for multicast applications such as streaming applications
  • Recognizes and directs Server-Side VLAN tags
  • Transparent device to the LAN Manager and SAN Manager
  • Provisioned storage resource is associated directly to a specific virtual machine – even if the virtual server is re-allocated within the BladeSystem
  • Server-side NPIV removes storage management constraint of a single physical HBA on a server blade Does not add to SAN switch domains or require traditional SAN management
  • Centralized configuration of boot from iSCSI or Fibre Channel network storage via Virtual Connect Manager GUI and CLI
  • Remotely update Virtual Connect firmware on multiple modules using Virtual Connect Support Utility 1.5.0

Options

  • Virtual Connect Enterprise Manager (VCEM), provides a central console to manage network connections and workload mobility for thousands of servers across the datacenter
  • Optional HP 10Gb SFP+ SR, LR, and LRM modules and 10Gb SFP+ Copper cables in 0.5m, 1m, 3m, 5m, and 7m lengths
  • Optional HP 8 Gb SFP+ and 4 Gb SFP optical transceivers
  • Supports all Ethernet NICs and Converged Network adapters for BladeSystem c-Class server blades: HP NC551i 10Gb FlexFabric Converged Network Adapters, HP NC551m 10Gb FlexFabric Converged Network Adapters, 1/10Gb Server NICs including LOM and Mezzanine card options and the latest 10Gb KR NICs
  • Supports use with other VC modules within the same enclosure (VC Flex-10 Ethernet Module, VC 1/10Gb Ethernet Module, VC 4 and 8 Gb Fibre Channel Modules).

So in effect this allows you to cut down on the number of switches per chassis from four to two, which can save quite a bit. HP had a cool graphic showing the amount of cables that are saved even against Cisco UCS but I can’t seem to find it at the moment.
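
Since each 10Gb downlink gets carved into FlexNICs plus an optional FlexHBA in 100Mb increments (per the spec sheet quoted above), validating a carve-up is basically arithmetic. Here is a hypothetical sketch of such a check; the function and example values are my own illustration, not an HP utility:

```python
# Hypothetical sanity check for carving a 10Gb FlexFabric downlink into
# FlexNICs/FlexHBA.  Rules are taken from the spec sheet quoted above
# (up to 4 physical functions per downlink, one of which may be a
# FlexHBA, speeds adjustable in 100Mb increments); the code itself is
# only an illustration.

def validate_downlink(partitions_mb):
    """partitions_mb: per-function speeds in Mb/s (FlexNICs and/or one FlexHBA)."""
    if not 1 <= len(partitions_mb) <= 4:
        raise ValueError("each downlink supports 1-4 physical functions")
    if any(p % 100 or not 100 <= p <= 10_000 for p in partitions_mb):
        raise ValueError("each speed must be 100Mb-10Gb in 100Mb increments")
    if sum(partitions_mb) > 10_000:
        raise ValueError("partitions exceed the 10Gb downlink")
    return True

# Example: three FlexNICs (4Gb, 2Gb, 1Gb) plus a 3Gb FlexHBA for FCoE/iSCSI.
print(validate_downlink([4000, 2000, 1000, 3000]))
```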

The most recently announced G7 blade servers have the new FlexFabric technology built in (which is also backwards compatible with Flex10).

VCEM seems pretty scalable

Built on the Virtual Connect architecture integrated into every BladeSystem c-Class enclosure, VCEM provides a central console to administer network address assignments, perform group-based configuration management and rapidly deploy, move and fail over server connections for 250 Virtual Connect domains (up to 1,000 BladeSystem enclosures and 16,000 blade servers).

With each enclosure consuming roughly 5kW with low-voltage memory and power capping, 1,000 enclosures should consume roughly 5 megawatts. From what I see, “experts” say it costs roughly $18 million per megawatt to build a data center, so one VCEM system can manage a $90 million data center. That’s pretty bad ass. I can’t think of who would need so many blades..
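
For what it’s worth, the back-of-the-envelope math (the 5kW per enclosure and ~$18M per megawatt figures are the rough numbers quoted above, not authoritative ones):

```python
# Back-of-the-envelope math using the rough figures quoted above.
enclosures = 1000
kw_per_enclosure = 5
cost_per_mw = 18_000_000

total_mw = enclosures * kw_per_enclosure / 1000        # 5.0 MW
print(f"~{total_mw:.0f} MW -> ~${total_mw * cost_per_mw / 1e6:.0f}M of data center")
```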

If I were building a new system today I would probably get this new module, but I would have to think hard about sticking with the regular Fibre Channel module to allow the technology to bake a bit more on the storage side.

The module is built based on Qlogic technology.

HP to the rescue

Filed under: Datacenter,Events,News,Storage — Tags: , , , , — Nate @ 6:03 am

Knock knock.. HP is kicking down your back door 3PAR..

Well, that’s more like it: HP offered $1.6 billion to acquire 3PAR this morning, topping Dell’s offer by 33%. Perhaps the 3cV solution can finally be fully backed by HP. More info from The Register here. And more info on what this could mean to HP and 3PAR products from the same source here.

3PAR’s website is having serious issues; this obviously has spawned a ton of interest in the company. I get intermittent blank pages and connection-refused messages.

I didn’t wake my rep up for this one.

The 3cV solution was announced about three years ago –

Elements of the 3cV solution include:

  • 3PAR InServ® Storage Servers—highly virtualized, tiered-storage arrays built for utility computing. Organizations creating virtualized IT infrastructures for workload consolidation use InServ arrays to reduce the cost of allocated storage capacity, storage administration, and SAN infrastructure.
  • HP BladeSystem c-Class Server Blades—the leading blade server infrastructure on the market for datacenters of all sizes. HP BladeSystem c-Class server blades minimize energy and space requirements and increase administrative productivity through advantages in I/O virtualization, powering and cooling, and manageability.
  • VMware vSphere—the leading virtualization platform for industry-standard servers. VMware vSphere helps customers reduce capital and operating expenses, improve agility, ensure business continuity, strengthen security, and go green.

While I could not find the image that depicts the 3cV solution (not sure how long it’s been gone), here is more info on it for posterity.

The Advantages of 3cV
3cV offers combined benefits that enable customers to manage and scale their server and storage environments simply, allowing them to halve server, storage and operational costs while lowering the environmental impact of the datacenter.

  • Reduces storage and server costs by 50%—The inherently modular architectures of the HP BladeSystem c-Class and the 3PAR InServ Storage Server—coupled with the increased utilization provided by VMware Infrastructure and 3PAR Thin Provisioning—allow 3cV customers to do more with less capital expenditure. As a result, customers are able to reduce overall storage and server costs by 50% or more. High levels of availability and disaster recovery can also be affordably extended to more applications through VMware Infrastructure and 3PAR thin copy technologies.
  • Cuts operational costs by 50% and increases business agility—With 3cV, customers are able to provision and change server and storage resources on demand. By using VMware Infrastructure’s capabilities for rapid server provisioning and the dynamic optimization provided by VMware VMotion and Distributed Resource Scheduler (DRS), HP Virtual Connect and Insight Control management software, and 3PAR Rapid Provisioning and Dynamic Optimization, customers are able to provision and re-provision physical servers, virtual hosts, and virtual arrays with tailored storage services in a matter of minutes, not days. These same technologies also improve operational simplicity, allowing overall server and storage administrative efficiency to increase by 3x or more.
  • Lowers environmental impact—With 3cV, customers are able to cut floor space and power requirements dramatically. Server floor space is minimized through server consolidation enabled by VMware Infrastructure (up to 70% savings) and HP BladeSystem density (up to 50% savings). Additional server power requirements are cut by 30% or more through the unique virtual power management capabilities of HP Thermal Logic technology. Storage floor space is reduced by the 3PAR InServ Storage Server, which delivers twice the capacity per floor tile as compared to alternatives. In addition, 3PAR thin technologies, Fast RAID 5, and wide striping allow customers to power and cool as much as 75% less disk capacity for a given project without sacrificing performance.
  • Delivers security through virtualization, not dedicated hardware silos—Whereas traditional datacenter architectures force tradeoffs between high resource utilization and the need for secure segregation of application resources for disparate user groups, 3cV resolves these competing needs through advanced virtualization. For instance, just as VMware Infrastructure securely isolates virtual machines on shared servers, 3PAR Virtual Domains provides secure “virtual arrays” for private, autonomous storage provisioning from a single, massively-parallel InServ Storage Server.

Though due to the recent stack wars it’s been hard for 3PAR to partner with HP to promote this solution, since I’m sure HP would rather push their own full stack. Well, hopefully now they can, and the best of both worlds, technology-wise, can come together.

More details from 3PAR’s VMware products site.

From HP’s offer letter

We propose to increase our offer to acquire all of 3PAR outstanding common stock to $24.00 per share in cash. This offer represents a 33.3% premium to Dell’s offer price and is a “Superior Proposal” as defined in your merger agreement with Dell. HP’s proposal is not subject to any financing contingency. HP’s Board of Directors has approved this proposal, which is not subject to any additional internal approvals. If approved by your Board of Directors, we expect the transaction would close by the end of the calendar year.

In addition to the compelling value offered by our proposal, there are unparalleled strategic benefits to be gained by combining these two organizations. HP is uniquely positioned to capitalize on 3PAR’s next-generation storage technology by utilizing our global reach and superior routes to market to deliver 3PAR’s products to customers around the world. Together, we will accelerate our ability to offer unmatched levels of performance, efficiency and scalability to customers deploying cloud or scale-out environments, helping drive new growth for both companies.
As a Silicon Valley-based company, we share 3PAR’s passion for innovation.
[..]

We understand that you will first need to communicate this proposal and your Board’s determinations to Dell, but we are prepared to execute the merger agreement immediately following your termination of the Dell merger agreement.

Music to my ears.

[tangent — begin]

My father worked for HP in the early days, back when they were even more innovative than they are today; he recalled their first $50M revenue year. He retired from HP in the early 90s after something like 25-30 years.

I attended my freshman year at Palo Alto Senior High School, and one of my classmates/friends (actually I don’t think I shared any classes with him, now that I think about it) was Ben Hewlett, grandson of one of the founders of HP. Along with a couple of other friends, Ryan and Jon, we played a bunch of RPGs (I think the main one was Twilight 2000, something one of my other friends, Brian, introduced me to in 8th grade).

I remember asking Ben one day why he took Japanese as his second-language course when it was significantly more difficult than Spanish (which was the easy route, and probably still is). I don’t think I’ll ever forget his answer. He said “because my father says it’s the business language of the future..”

How times have changed.. Now it seems everyone is busy teaching their children Chinese. I’m happy knowing English, and a touch of bash and perl.

I never managed to keep in touch with my friends from Palo Alto; after one short year there I moved back to Thailand for two more years of high school.

[tangent — end]

HP could do some cool stuff with 3PAR; they have much better technology overall. I have no doubt HP has their eyes on their HDS partnership, and the possibility of replacing their XP line with 3PAR technology in the future has got to be pretty enticing. HDS hasn’t done a whole lot recently, and I read not long ago that regardless of what HP says, they don’t have much (if any) input into the HDS product line.

The HP USP-V OEM relationship is with Hitachi SSG. The Sun USP-V reseller deal was struck with HDS. Mikkelsen said: “HP became a USP-V OEM in 2004 when the USP-V was already done. HP had no input to the design and, despite what they say, very little input since.” HP has been a Hitachi OEM since 1999.

Another interesting tidbit of information from the same article:

It [HDS] cannot explain why it created the USP-V – because it didn’t, Hitachi SSG did, in Japan, and its deepest thinking and reasons for doing so are literally lost in translation.

The loss of HP as an OEM customer of HDS, so soon after losing Sun as an OEM customer, would be a really serious blow to HDS (one person I know claimed it accounts for ~50% of their business), who seem to have a difficult time selling stuff in western countries; I’ve read it’s mostly because of their culture. Similarly, it seems Fujitsu has issues selling stuff in the U.S. at least; they seem to have some good storage products but not much attention is paid to them outside of Asia (and maybe Europe). Will HDS end up like Fujitsu as a result of HP buying 3PAR? Not right away for sure, but longer term they stand to lose a ton of market share in my opinion.

And with the USP getting a little stale (rumor has it they are near to announcing a technology refresh for it), it would be good timing for HP to get 3PAR, to cash in on the upgrade cycle by getting customers to go with the T class arrays instead of the updated USP whenever possible.

I read on an HP blog earlier in the year an interesting comment –

The 3PAR is drastically less expensive than an XP, but is an active/active concurrent design, can scale up to 8 clustered controllers, is highly virtualized, customers can self-install and self-maintain, and it requires no professional services. It’s on par with the XP in terms of raw performance, but has the ease of use of the EVA. Like the XP, the 3PAR can be carved up into virtual domains so that service providers or multi-tenant arrays can have delegated administration.

I still think 3PAR is worth more, and should stay independent, but given the current situation I would much rather have them in the arms of HP than Dell.

Obviously those analysts that said Dell paid too much for 3PAR were wrong and didn’t understand the value of the 3PAR technology. HP does, otherwise they wouldn’t be offering 33% more cash.

After the collapse of so many of 3PAR’s NAS partners over the past couple of years, the possibility of having Ibrix available again for a longer term solution is pretty good. Dell bought Exanet’s IP earlier in the year. LSI owns Onstor, HP bought Polyserve and Ibrix. Really just about no “open” NAS players left. Isilon seems to be among the biggest NAS players left but of course their technology is tightly integrated into their disk drive systems, same with Panasas.

Maybe that recent legal investigation into the board at 3PAR had some merit after all.

Dell should take their $billion and shove it in Pillar’s (or was it Compellent? I forgot) face, so the CEO there can make his dream of being a billion-dollar storage company come true, if only for a short time.

I’m not a stock holder or anything, I don’t buy stocks(or bonds).

August 16, 2010

Trying not to think about it

Filed under: News,Storage — Tags: , — Nate @ 6:54 am

Hell just got a little colder. It seems 3PAR was bought by Dell for ~$1.15 billion this morning (the news is so fresh as of this posting that the official 3PAR press release isn’t up yet, just a blank page). I woke my rep up and asked him what happened; he wasn’t aware that it had gone down. They did a good job of keeping it quiet.

It’s not like 3PAR was in any trouble; they had no debt, the highest margins in the industry, and good sales. They haven’t been making much profit, mainly because they are hiring so many new people to grow the company. In my area, since I started using 3PAR they’ve gone from 1 sales rep and 1 SE to 3 sales reps and 2 SEs, and they’ve really expanded overseas and such. I would have expected them to hold out for a few more billion; a little over $1 billion seems far too cheap.

I have read several complaints about how Equallogic has gone downhill since Dell bought them (from original Equallogic users; I’ve never used that stuff myself, so I don’t know whether or not they are accurate), and I fear the same may happen to 3PAR. But it will take a little while for it to start.

I think the only hope 3PAR has at this point is if Dell keeps them independent for as long as possible. Outside of their DCS division Dell really shows they have no ability to innovate.

I wonder what Marc Farley thinks; as a former Equallogic/Dell employee he’s now at 3PAR, and Dell came and found him again..

Maybe I’ll get lucky and this will just turn out to be a bad dream, some evil hacker out there manipulating the stock price by planting news.

Do me one favor Dell: stay the hell away from Extreme Networks! Brocade has bought Foundry, and HP has bought 3COM. I was told by a Citrix guy that Juniper tried to buy Extreme shortly after they bought Netscreen, instead of making their own switches; from what I recall he said Juniper bought Netscreen for $500M, which was way over-inflated, and Extreme demanded $1 billion at the time. There are not many other enterprise/service-provider independent Ethernet companies still around. There is Force10; Dell could go buy them, and it would be a lot cheaper too.

I suppose more than anything else, Dell buying 3PAR is Dell admitting the Equallogic technology doesn’t hold a candle to 3PAR technology, ok maybe a candle but not much more than that!

It may be 6:51AM but I think I need a drink.

August 13, 2010

Do you really need RAID 6

Filed under: Storage — Tags: , , , , , — Nate @ 11:34 pm

I’ve touched on this topic before but I don’t think I’ve ever done a dedicated entry on it. I came across a blog post from Marc Farley, which got my mind thinking on the topic again. He talks about a leaked document from EMC trying to educate their sales force to fight 3PAR in the field. One of the issues raised is 3PAR’s lack of RAID 6 (never mind the fact that this is no longer true; release 2.3.1 introduced RAID 6 (aka RAID DP) in early January 2010).

RAID 6 from 3PAR’s perspective was for the most part just a check box, because there are customers out there with hard requirements who disregard the underlying technology and won’t even entertain the prospect unless it meets their criteria.

What 3PAR did in their early days was really pretty cool: the way they virtualize the disks in the system, which in turn distributes the RAID across many, many disks. On larger arrays you can have well over 100,000 RAID arrays on the system. This provides a few advantages (a rough rebuild-time sketch follows the list):

  • Evenly distributes I/O across every available spindle
  • Parity is distributed across every available spindle – no dedicated parity disks
  • No dedicated hot spare spindles
  • Provides a many:many relationship for RAID rebuilds
    • Which gives the benefit of near zero impact to system performance while the RAID is rebuilt
    • Also increases rebuild performance by orders of magnitude (depending on # of disks)
  • Only data that has been written to the disk is rebuilt
  • Since there are no spare spindles, only spare “space” on each disk, in the event you suffer multiple disk failures before the failed disks are swapped (say 10 disks fail over a period of a month and for whatever reason you do not have the drives replaced right away), the system will automatically allocate more “spare” space as long as there is available space on the system. On a traditional array you may find yourself low on, or even out of, hot spares after multiple disks fail, which will make you much more nervous and anxious to replace those disks than if it were a 3PAR system (or similarly designed system).
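
Here is the rough rebuild-time sketch mentioned above. All the figures (drive size, written fraction, per-disk rebuild throughput) are assumptions I picked for illustration, not measurements from 3PAR or anyone else; the point is only that a many:many rebuild scales with the number of surviving disks, while a dedicated hot spare caps the rebuild at what one drive can absorb.

```python
# Rough comparison of rebuild time: dedicated hot spare vs. distributed
# ("many:many") rebuild.  All figures are assumptions for the sketch.

DISK_TB = 1.0                 # raw size of the failed disk
WRITTEN_FRACTION = 0.5        # distributed systems only rebuild written data
PER_DISK_REBUILD_MBPS = 30    # throughput each disk can spare while serving I/O

def hours(tb, mbps):
    return tb * 1024 * 1024 / mbps / 3600

# Traditional: one hot spare absorbs the entire rebuild.
print(f"dedicated spare : ~{hours(DISK_TB, PER_DISK_REBUILD_MBPS):.1f} hours")

# Distributed: spare space on N surviving disks shares the work,
# and only the data actually written to the failed disk is rebuilt.
for survivors in (30, 100, 300):
    t = hours(DISK_TB * WRITTEN_FRACTION, PER_DISK_REBUILD_MBPS * survivors)
    print(f"{survivors:3d} disks sharing: ~{t * 60:.1f} minutes")
```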

So do you need RAID 6?

To my knowledge the first person to raise this question was Robin from Storage Mojo, who a bit over three years ago wrote a blog post talking about how RAID 5 will have lost its usefulness in 2009. I have been following Robin for a few years (online anyways); he seems like a real smart guy and I won’t try to dispute the math. And I can certainly see how traditional RAID arrays with large SATA disks running RAID 5 are in quite a pickle, especially if there is a large data:parity ratio.

In the same article he speculates on when RAID 6 will become as “useless” as RAID 5.

I think what it all really comes down to is a couple of things:

  • How fast can your storage system rebuild from a failed disk
    • For distributed RAID this is determined by the number of disks participating in the RAID arrays and the amount of load on the system, because when a disk fails one RAID array doesn’t go into degraded mode, potentially hundreds of them do, which then triggers all of the remaining disks to help in the rebuild.
    • For 3PAR systems at least this is determined by how much data has actually been written to the disk.
  • What is the likelihood that a 2nd disk will fail (in the case of RAID 5) or two more disks (RAID 6) fail during this time?

3PAR is not alone with the distributed RAID. As I have mentioned before, others that I know of that have similar technology are at least: Compellent, Xiotech and IBM XIV. I bet there are others as well.

From what I understand of Xiotech’s technology, I don’t *think* their RAID arrays can span ISE enclosures; I think they are limited to a single enclosure (by contrast, I believe a LUN can span enclosures). So, for example, if there are 30 disks in the enclosure and a disk fails, the maximum number of disks that can participate in the rebuild is 30. Though in reality I think the number is less given how they RAID based on disk heads; the number of individual RAID arrays is far fewer vs 3PAR’s chunklet-based RAID.

I’ve never managed to get in-depth info on Compellent’s or IBM XIV’s designs with regard to specifics around how RAID arrays are constructed, though I haven’t tried any harder than looking at what is publicly available on their web sites.

Distributed RAID really changes the game in my opinion as far as RAID 5’s effective life span (same goes for RAID 6 of course).

Robin posted a more recent entry several months ago about the effectiveness of RAID 6, and besides one of the responders being me, there was another person who replied with a story that made me both laugh and feel sorry for the guy: a horrific experience with RAID 6 on Solaris ZFS with Sun hardware –

Depending on your Recovery Time Objectives, RAID6 and other dual-parity schemes (e.g. ZFS RAIDZ2) are dead today. We know from hard experience.

Try 3 weeks to recover from a dual-drive failure on 8x 500GB ZFS RAIDZ2 array.

It goes like this:
– 2 drives fail
– Swap 2 drives (no hot spares on this array), start rebuild
– Rebuild-while-operating took over one week. How much longer, we don’t know because …
– 2 more drives failed 1 week into the rebuild.
– Start restore from several week old LTO-4 backup tapes. The tapes recorded during rebuild were all corrupted.
– One week later, tape restore is finished.
– Total downtime, including weekends and holidays – about 3 weeks (we’re not a 24xforever shop).

Shipped chassis and drives back to vendor – No Trouble Found!

For any system that takes longer than, say, 48 hours to rebuild, you probably do want that extra level of protection in there, whether it is dual parity or maybe even triple parity (something I believe ZFS offers now?).

Add to that disk enclosure/chassis/cage (3PAR term) availability, which means you can lose an entire shelf of disks without disruption; on their S/T-class systems 40 disks can go down and you’re still okay (protection against a shelf failing is the default configuration and is handled automatically – it can be disabled upon request of the user since it does limit your RAID options based on the number of shelves you have). So not only do you need to suffer a double disk failure, but that 2nd disk has to fail:

  • In a DIFFERENT drive chassis than the original disk failure
  • Happens to be a disk that has portions of RAID set(s) that were also located on the original disk that failed

But if you can recover from a disk failure in say 4 hours even on a 2TB disk with RAID 5, do you really need RAID 6? I don’t know what the math might look like but would be willing to bet that a system that takes 3 days to rebuild a RAID 6 volume has about as much of a chance of suffering a triple disk failure as a system that takes 4 hours (or less) to rebuild a RAID 5 array suffering a double disk failure.
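
Here is a crude way to put numbers on that bet. The annual failure rate, the rebuild windows, and the assumption of independent failures are all mine, so treat this as an order-of-magnitude sketch rather than a reliability model:

```python
# Crude order-of-magnitude estimate of hitting an additional disk
# failure before a rebuild completes.  Assumes independent failures and
# a made-up 3% annual failure rate; real failures are often correlated.

AFR = 0.03                      # assumed annual failure rate per disk
HOURS_PER_YEAR = 24 * 365

def p_any_failure(disks, window_hours):
    """Probability that at least one of `disks` drives fails in the window."""
    p_one = AFR * window_hours / HOURS_PER_YEAR
    return 1 - (1 - p_one) ** disks

disks_remaining = 299           # say a ~300-disk array just lost one drive

# RAID 5 with a fast distributed rebuild: one more failure is fatal.
p_raid5 = p_any_failure(disks_remaining, 4)

# RAID 6 with a slow rebuild: two more failures inside the window
# (a rough estimate, multiplying the two single-failure probabilities).
p_raid6 = p_any_failure(disks_remaining, 72) * p_any_failure(disks_remaining - 1, 72)

print(f"RAID 5, 4h rebuild : ~{p_raid5:.2%} chance of a fatal 2nd failure")
print(f"RAID 6, 72h rebuild: ~{p_raid6:.2%} chance of fatal 2nd+3rd failures")
```

With those assumptions the two scenarios land in roughly the same ballpark, and on a distributed-RAID system only the fraction of second failures that actually intersect the degraded RAID sets (and land in a different cage) would be fatal, which tilts things further in favor of the fast-rebuild case.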

Think about the probability of the two bullet points above on how a 2nd drive must fail in order to cause data loss, combine that with the fast rebuild of distributed RAID, and consider whether or not you really need RAID 6. Do you want to take the I/O hit? Sure, it is an easy extra layer of protection, but you might be protecting yourself against something that is about as likely to happen as a natural disaster taking out your data center.

I mentioned to my 3PAR rep a couple of weeks ago that, in theory, RAID 6 with “cage level availability” has the potential to protect against two shelves of disks failing (so you could lose up to 80 disks on the big arrays) without impact. I don’t know if 3PAR went this far in engineering their RAID 6 – I’ve never seen it mentioned, so I suspect not – but I don’t think there is anything that would stop them from being able to offer this level of protection, at least with RAID 6 6+2.

Myself, I speculate that on a decently sized 3PAR system (say 200-300 disks) SATA disks probably have to get to 5-8TB in size before I would really think hard about RAID 6. That won’t stop their reps from officially recommending RAID 6 with 2TB disks though.

I can certainly understand the population at large coming to the conclusion that RAID 5 is no longer useful, because probably 99.999% of the RAID arrays out there (stand alone arrays as well as arrays in servers) are not running on distributed RAID technology. So they don’t realize that another way to better protect your data is to make sure the degraded RAID arrays are rebuilt (much) faster, lowering the chance of additional disk failures occurring at the worst possible time.

It’s nice that they offer the option, let the end user decide whether or not to take advantage of it.

June 19, 2010

40 Million IOPS in two racks

Filed under: Storage — Tags: , , — Nate @ 7:43 am

Fusion IO does it again, another astonishing level of performance in such an efficient design, from the case study:

LLNL used Fusion’s ioMemory technology to create the world’s highest performance storage array. Using Fusion’s ioSANs and ioDrive Duos, the cluster achieves an unprecedented 40,800,000 IOPS and 320GB/s aggregate bandwidth.
Incredibly, Fusion’s ioMemory allowed LLNL to accomplish this feat in just two racks of appliances– something that would take a comparable hard disk-based solution over 43 racks. In fact, it would take over 100 of the SPC-1 benchmark’s leading all-flash vendor systems combined to match the performance, at a cost of over $300 million.

At 40 million IOPS and ~250 IOPS per 15K RPM disk, you’re talking roughly 160,000 disk drives.
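
The arithmetic, using the case-study numbers (the ~250 IOPS figure is just the usual rule of thumb for a 15K RPM drive):

```python
# Spindle-equivalence arithmetic from the case study numbers above.
iops_total = 40_800_000
iops_per_15k_disk = 250          # rough rule of thumb for a 15K RPM drive
print(f"~{iops_total / iops_per_15k_disk:,.0f} 15K RPM disks to match on IOPS alone")
```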

Not all flash is created equal, of course; many people don’t understand that. They just see “ooh, this one is cheap, this one is not,” without having any clue (shocker).

It’s just flat-out irresponsible to ignore such an industry-changing technology, especially for workloads that deal with small (sub-TB) amounts of data.

June 15, 2010

Storage Benchmarks

Filed under: Storage — Tags: , — Nate @ 11:27 pm

There are two main storage benchmarks I pay attention to:

Of course benchmarks are far from perfect, but they can provide a good starting point when determining what type of system you need to look towards based on your performance needs. Bringing in everything under the sun to test in house is a lot of work, much of it can be avoided by getting some reasonable expectations up front. Both of these benchmarks do a pretty good job. And it’s really nice to have the database of performance results for easy comparison. There’s tons of other benchmarks that can be used but very few have a good set of results you can check against.

SPC-1 is better as a process primarily because it forces the vendor to disclose the cost of the configuration plus 3 years of support. They could improve on this further by requiring the vendor to provide updated pricing for the configuration for 3 years; while the performance of the configuration should not change (given the same components), the price certainly should decline over time, so the cost aspects become harder to compare the further apart the tests are (duh).

SPC-1 also forces full disclosure of everything required to configure the system, down to the CLI commands to configure the storage. You can get a good idea on how simple or complex the system is by looking at this information.

SPC-1 doesn’t have a specific disclosure field for cost per usable TB, but it is easy enough to extrapolate from the other numbers in the reports; it would be nice if this was called out (well, nice for some vendors, not so much for others). Cost per usable TB can really make systems that rely on short-stroking for performance stand out like a sore thumb. Other metrics that would be interesting are Watts per IOP and Watts per usable TB. The SPC-1E test does report Watts per IOP, though I have a hard time telling whether or not the power usage is taken at max load; it seems to indicate power usage was calculated at 80% load.
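
Extrapolating those extra metrics from a full-disclosure report is straightforward. A hypothetical sketch of the derivation (the sample inputs below are invented for illustration, not taken from any actual SPC-1 result):

```python
# Deriving the metrics discussed above from numbers an SPC-1 full
# disclosure report already contains.  The sample inputs are invented
# for illustration -- plug in figures from a real report.

def derived_metrics(total_price_usd, spc1_iops, usable_tb, watts_at_load=None):
    m = {
        "$/SPC-1 IOPS": total_price_usd / spc1_iops,
        "$/usable TB":  total_price_usd / usable_tb,
    }
    if watts_at_load is not None:
        m["Watts/1000 IOPS"] = watts_at_load / (spc1_iops / 1000)
        m["Watts/usable TB"] = watts_at_load / usable_tb
    return m

sample = derived_metrics(total_price_usd=1_500_000, spc1_iops=100_000,
                         usable_tb=50, watts_at_load=8_000)
for metric, value in sample.items():
    print(f"{metric:16s}: {value:,.2f}")
```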

Storage performance is by no means the only aspect you need to consider when getting a new array, but it usually is in at least the top 5.

I made a few graphs for some SPC-1 numbers. Note the cost numbers need to be taken with a few grains of salt, of course, depending on how far apart in time the systems were tested – the USP, for example, was tested in 2007. But you can see trends at least.

The majority of the systems tested used 146GB 15k RPM disks.

Somewhat odd configurations:

  • From IBM, the easy tier config is the only one that uses SSD+SATA, and the other IBM system is using their SAN Volume Controller in a clustered configuration with two storage arrays behind it (two thousand spindles).
  • Fujitsu is short stroking 300GB disks.
  • NetApp is using RAID 6 (aside from IBM’s SSDs, everyone else is RAID 1)

I’ve never been good with spreadsheets or anything so I’m sure I could make these better, but they’ll do for now.

Links to results:
