TechOpsGuys.com Diggin' technology every day

September 1, 2015

HP 3PAR Case study with my organization

Filed under: Storage — Tags: , — Nate @ 7:39 am
Stella & Dot HP 3PAR Case Study

Stella & Dot HP 3PAR Case Study (click to download)

Stella & Dot relies on HP 3PAR StoreServ Storage

“Highly reliable HP 4-node storage architecture supports over 30,000 e-commerce Independent Business Owners worldwide”

This is also available on HP’s case study website http://www.hp.com/go/success

I have had this blog since July 2009, and I don’t believe ever once have I mentioned any of my employers names. This will be an exception to that record.

HP came to us last year when we were in the market for their 3PAR 7450-all flash system. There was some people in management over there that really liked our company’s brand and I’m told practically everyone in 3PAR is aware of me. So they wanted to do a case study with the company I work for on our usage of 3PAR. I have participated in one, or maybe two 3PAR case studies in the past, prior to HP acquisition. The last one was in 2010 on a 3PAR T400. That particular company I was with had a policy where nobody with less than a VP title could be quoted. So my boss’s boss got all the credit even though of course I did all of the magic. Coincidentally I left that company just a few months later(for a completely different reason).

This company is different, I’ve had an extremely supportive management for my entire four years at this company and the newest management that joined in late 2013/early 2014 has been even more supportive. They really wanted me to get as much credit as I could get for all the hard work I do. So it’s my name all over the case study not theirs.  It’s people like this that more than anything keep me happy in my current role, I don’t see myself going anywhere anytime soon (added bonus is the company is stable and I believe will have no trouble surviving the next tech crash without an issue since we aren’t tech oriented).

Anyway, the experience with HP in making the case study was quite good. They are very patient, we said we needed time to work with the new system before we did the case study. They told us take all the time we want, no rush. About 8 months into using the new 7450 they reached out again and we agreed to start the process.

I spent a couple hours on the phone with them, and exchanged a few emails. They converted a lot of my technical jargon into marketing jargon (I don’t actually talk like that!), though the words seemed reasonable to me. The only thing that is kind of a mistake in the article is we don’t leverage any of the 3PAR “Application Suites”. I mentioned this to them saying they can remove those references if they wish, I didn’t care either way. At the end they also make reference to a support event I had with 3PAR five years ago which was at the previous company, and they credited HP for it when technically it was well before the acquisition(told them that as well, though seems reasonable to credit HP to some extent for that since I’d wager the same staff that performed those actions worked at HP for a while anyways, or maybe they are still there).

I would wager that my feedback into the benefits I see with 3PAR are probably not typical among HP customers. The HP Solutions Architect assigned to our account has told me on several occasions that he believes I operate my 3PAR systems better than most any other customer he’s seen, that made me feel good even though I already felt I operate my systems pretty well!

On that note, our SaaS monitoring service Logic Monitor is working with me to try to formalize my custom 3PAR monitoring I wrote (which gathers about 12,000 data points a minute from our three arrays) into something more customers can leverage, and if they can get that done then I hope I can get HP to endorse their service for monitoring 3PAR because it really works well in general, and better than any other tool I’ve seen or used for my 3PAR monitoring needs at least.

3PAR 8000 & 20450

I’m pretty excited about the new 8000-series and the new 20450 (4-node 20k series) that came out a few days ago. I would say really excited but given 3PAR’s common architecture, the newer form factors were already expected by me. I am quite happy that HP released an 8440 with the exact same specs as the 8450 (meaning 3X more data cache than the 8400 and more, faster CPU cores), also the 4-node 20450 has the same cache and CPU allocations per-node that the all-flash 8-node 20850 has. This means you can get these systems and not be “forced” into an all flash configuration, since the 8450 and the 20850 systems are “marketing limited” to be all flash(to make people like Gartner happy).

June 1, 2015

3PAR Gen5: True flash scale storage

Filed under: Storage — Tags: , — Nate @ 6:14 am

(I’m in Vegas, waiting for HP Discover to start(got here Friday afternoon), this time around I had my own company pay my way, so HP isn’t providing me with the trip. Since I haven’t been blogging much the past year I didn’t feel good about asking HP to cover me as a blogger.)

UPDATED – 6/4/2015

When flash first started becoming a force to be reckoned with a few years ago in enterprise storage it was clear to me controller performance wasn’t up to the task of being able to exploit the performance potential of the medium. The ratio of controllers to SSDs was just way out of whack.

In my opinion this has been largely addressed in the 3PAR Generation 5 systems that are being released today.

3PAR 20k system in the flesh

3PAR 20k system in the flesh

NEW HP 3PAR 20800

I believe this replaces the 10800/V800 (2-8 controllers) system, I believe it also replaces the 10400/V400(2-4 controllers) system as well.

  • 2-8 controllers, max 96 x 2.5Ghz CPU cores(12/controller) and 16 Generation 5 ASICs (2/controller)
  • 224GB of RAM cache per controller (1.8TB total – 768GB control / 1,024GB data)
  • Up to 32TB of Flash read cache
  • Up to 6PB of raw capacity (SSD+HDD)
  • Up to 15PB of usable capacity w/deduplication
  • 12Gb SAS back end, 16Gb FC (Max 160 ports) / 10Gb iSCSI (Max 80 ports) front end
  • 10 Gigabit ethernet replication port per controller
  • 2.5 Million Read IOPS under 1 millisecond
  • Up to 75 Gigabytes/second throughput (read), up to 30 Gigabytes/second throughput (write)

NEW HP 3PAR 20850

This augments the existing 7450 with a high end 8-controller capable all flash offering, similar to the 20800 but with more adrenaline.

  • 2-8 controllers, max 128 x 2.5Ghz CPU cores(16/controller), and 16 Generation 5 ASICs (2/controller)
  • 448GB of RAM cache per controller (3.6TB total – 1,536GB control – 2,048GB data)
  • Up to 4PB of raw capacity (SSD only)
  • Up to 10PB of usable capacity w/deduplication
  • 12Gb SAS back end, 16Gb FC (Max 160 ports) / 10Gb iSCSI (Max 80 ports) front end
  • 10 Gigabit ethernet replication port per controller
  • 3.2 Million Read IOPS under 1 millisecond
  • 75 Gigabytes/second throughput (read), 30 Gigabytes/second throughput (write)

Even though flash is really, really fast, 3PAR leverages large amounts of cache to optimize writes to the back end media, this not only improves performance but extends SSD life. If I recall correctly nearly 100% of the cache on the all SSD-systems is write cache, reads are really cheap so very little caching is done on reads.

It’s only going to get faster

One of the bits of info I learned is that the claimed throughput by HP is a software limitation at this point. The hardware is capable of much more and performance will improve as the software matures to leverage the new capabilities of the Generation 5 ASIC (code named “Harrier 2”).

Magazine sleds are no more

Since the launch of the early 3PAR high end units more than a decade ago 3PAR had leveraged custom drive magazines which allowed them to scale to 40×3.5″ drives in a 4U enclosure. That is quite dense, though they did not have similar ultra density for 2.5″ drives. I was told that nearline drives represent around 10% of disks sold on 3PAR so they decided to do away with these high density enclosures and go with 2U enclosures without drive magazines. This allows them to scale to 48×2.5″ drives in 4U (vs 40 2.5″ in 4U before), but only 24×3.5″ drives in 4U (vs 40 before). Since probably greater than 80% of the drives they ship now are 2.5″ that is probably not a big deal. But as a 3PAR historian(perhaps) I found it interesting.

Along similar lines, the new 20k series systems are fully compatible with 3rd party racks. The previous 10k series as far as I know the 4-node variant was 3rd party compatible but 8-node required a custom HP rack.

Inner guts of a 3PAR 20k-series controller

With dual internal SATA SSDs for the operating system(I assume some sort of mirroring going on), eight memory slots for data cache(max 32GB per slot) – these are directly connected to the pair of ASICs(under the black heatsinks). Another six memory slots for the control cache(operating system, meta data etc) also max 32GB per slot those are controlled by the Intel processors under the giant heat sinks.

3PAR 20k series controller

3PAR 20k series controller

How much does it cost?

I’m told the new 20k series is surprisingly cost effective in the market. The price point is significantly lower than earlier generation high end 3PAR systems. It’s priced higher than the all flash 7450 for example but not significantly more, entry level pricing is said to be in the $100,000 range, and I heard a number tossed around that was lower than that. I would assume that the blanket thin provisioning licensing model of the 7000 series extends to the new 20k series, but I am not certain.

It is only available in a 8-controller capable system at this time, so requires at least 16U of space for the controllers since they are connected over the same sort of backplane as earlier models. Maybe in the future HP will release a smaller 2-4 controller capable system or they may leave that to whatever replaces the 7450. I hope they come out with a smaller model because the port scalability of the 7000-series(and F and E classes before them) is my #1 complaint on that platform, having only one PCIe expansion slot/controller is not sufficient.

NEW HP 3PAR 3.84TB cMLC SSD

HP says that this is the new Sandisk Optimus MAX 4TB drive, which is targeted at read intensive applications. If a customer decides to use this drive in non read intensive(and non 3PAR) then the capacity will drop to 3.2TB. So with 3PAR’s adaptive sparing they are able to gain 600GB of capacity on the drive while simultaneously supporting any workload without sacrificing anything.

This is double the size of the 1.92TB drive that was released last year. It will be available on at least the 20k and the 7450, most likely all other Gen4 3PAR platforms as well.

HP says this drops the effective cost of flash to $1.50/GB usable.

This new SSD comes with the same 5 year unconditional warranty that other 3PAR SSDs already enjoy.

I specifically mention the 7450 here because this new SSD effectively doubles the raw capacity of the system to 920TB of raw flash today (vs 460TB before). How many all flash systems scale to nearly a petabyte of raw flash?

NEW HP 3PAR Persistent Checksum

With the Generation 4 systems 3PAR had end to end T10 data integrity checking within the array itself from the HBAs to the ASICs, to the back end ports and disks/SSDs. Today they are extending that to the host HBAs, and fibre channel switches as well (not sure if this extends to iSCSI connections or not).

The Generation 5 ASIC has a new line rate SHA1 engine which replaces the line rate CRC engine in Generation 4 for even better data protection. I am not certain if persistent checksum is Generation 5 specific(given they are extending it beyond the array I really would expect it to be possible in Generation 4 as well).

NEW HP 3PAR Asynchronous Streaming Replication

I first heard about this almost two years ago at HP Storage Tech Day, but today it’s finally here. HP adds another method of replication to the existing sets they already had:

  • Synchronous replication – 0 data loss (strict latency limits)
  • Synchronous long distance replication (requires 3 arrays) – 0 data loss (latency limits between two of the three arrays)
  • Asynchronous replication – as low as 5 minutes of data loss (less strict latency limits)
  • Asynchronous streaming replication – as low as 1 second of data loss (less strict latency limits)

HP compares this to EMC’s SRDF async replication which has as low as a 15 seconds of data loss, vs 3PAR with as low as 1 second.

If for some reason more data comes into the system than the replication link can handle, the 3PAR will automatically go into asynchronous replication mode until the replication is caught up then switch back to asynchronous streaming.

This new feature is available on all Gen4 and Gen5 systems.

NEW Co-ordinated Snapshots

3PAR has long had the ability to snapshot multiple volumes on a single system simultaneously, and it’s always been really easy to use. Now they have extended this to be able to snapshot across multiple arrays simultaneously and make them application aware (in the case of VMware initially, Exchange, SQL Server and Oracle to follow).

This new feature is available on all Gen4 and Gen5 systems.

HP 3PAR Storage Federation

Up to 60PB of usable capacity and 10 Million IOPS with zero overhead

3PAR Federation

3PAR Federation

HP has talked about Storage Federation in the past, today with the new systems of course the capacity knobs have been turned up a lot, they’ve made it easier to use than earlier versions of the software, though don’t yet have completely automatic load balancing between arrays yet.

This federation is possible between all Gen4 and Gen5 systems.

Benefits from ASIC Acceleration

3PAR has always use in house custom ASICs on their systems and these are no different

The ASICs within each HP 3PAR StoreServ 20850 and 20800 Storage controller node serve as the high-performance engines that move data between three I/O buses, a four memory-bank data cache, and seven high-speed links to the other controller nodes over the full-mesh backplane. These ASICs perform RAID parity calculations on the data cache and inline zero-detection to support the system’s data compaction technologies. CRC Logical Block Guard used by T10-DIF is automatically calculated by the HBAs to validate data stored on drives with no additional CPU overhead. An HP 3PAR StoreServ 20800 Storage system with eight controller nodes has 16 ASICs totaling 224 GB/s of peak interconnect bandwidth.

3PAR 208x0 Controller Architecture

3PAR 208×0 Controller Architecture

3par-mixedworkload

NEW Online data import from HDS arrays

You are now able to do online import of data volumes from Hitachi arrays in addition to the EMC VMAX, CX4, VNX, and HP EVA systems.

The competition

HP touts the scalability of usable and raw flash capacity of these new systems + the new 3.84TB SSD against their competition:

  • Consolidate thirty Pure Storage //m70 storage systems onto a single 3PAR 20850 (with 87% less power/cooling/space) ***
  • Consolidate eight XtremeIO storage systems onto a single 3PAR 20850 (with 62% less power/cooling/space)
  • Consolidate three EMC VMAX 400K storage systems onto a single 3PAR 20850 (with 85% less power/cooling/space)

HP also touts their throughput numbers (75GB/second) are between two and ten times faster than the competition. The 7450 came in at only 5.5GB/second, so this is quite a step up.

*** HP revised their presentation last minute their original claims were against the Pure 450, which was replaced by the m70 on the same day of the 3PAR announcement. The numbers here are from memory from a couple of days ago they may not be completely accurate.

Fastest growing in the market

HP touted again 3PAR was the fastest growing all flash in the market last year. They also said they have sold more than 1,000 all flash systems in the first half which is more than Pure Storage sold in all of last year. In other talks with 3PAR folks specifically on market share they say they are #1 in midrange in Europe and #2 in Americas, with solid growth across the board consistently for many quarters now. 3PAR is still #5 in the all flash market, part of that is likely due to compression(see below), but I have no doubt this new generation of systems will have a big impact on the market.

Still to come

Compression remains a road map item, they are working on it, but obviously not ready for release today. Also this marks probably the first 3PAR hardware released in more than a decade that wasn’t accompanied by SPC-1 results. HP says SPC-1 is coming, and it’s likely they will do their first SPC-2 (throughput) test on the new systems as well.

Bottom line

HP continues to show that it’s 3PAR architecture is fully capable of embracing the all flash era and has a long life left in it. Not only are you getting the maturity of the enterprise proven 3PAR systems (over a decade at this point), but you are not having to compromise on almost anything else related to all flash(compression being the last holdout).

September 17, 2014

NetApp Flash ray ships… with one controller

Filed under: Storage — Tags: , , — Nate @ 10:55 am

Well I suppose it is finally out, or at least in a “limited” way. NetApp apparently is releasing their ground-up rewrite all Flash product Flash Ray, based on a new “MARS” operating system (not related to Ontap).

When I first heard about MARS I heard some promising things, I suppose all of those things were just part of the vision, obviously not where the product is today on launch day. NetApp has been carefully walking back expectations all year. Which turned out to be a smart move, but it seems they didn’t go far enough.

To me it is obvious that they felt severe market pressures and could no longer risk not going to market without their next gen platform available. It’s also obvious that Ontap doesn’t cut it for flash or they wouldn’t of built Flash Ray to begin with.

But shipping a system that only supports a single controller I don’t care if it’s a controlled release or not – giving any customer such a system under any circumstance other than alpha-quality testing just seems absurd.

The “vision” they have is still a good one, on paper anyway — I’m really curious how long it takes them to execute on that vision — given the time it took to integrate the Spinmaker stuff into Ontap. Will it take several years?

In the meantime while your waiting for this vision to come out I wonder what NetApp will offer to get people to want to use this product vs any one of the competing solutions out there. Perhaps by the time this vision is complete this first or second generation of systems will be obsolete anyway.

Current FlashRay system seems to ship with less than 10TB of usable flash (in one system).

On a side note there was some chatter recently about a upcoming EMC XtremIO software update that apparently requires total data loss (or backup & restore) to perform. I suppose that is a sign that the platform is 1) not mature and 2) not designed right(not fully virtualized).

I told 3PAR management back at HP Discover – three years ago they could of counted me as among the people who did not believe 3PAR architecture would be able to adapt to this new era of all flash. I really didn’t have confidence at that time. What they’ve managed to accomplish over the past two years though has just blown me away, and gives me confidence their architecture has many years of life left to it. The main bit missing still is compression – though that is coming.

My new all flash array is of course a 7450 – to start with 4 controllers and ~27TB raw flash (16×1.92TB SSDs), a pair of disk shelves so I can go to as much as ~180TB raw flash (in 8U) without adding any shelves (before compression/dedupe of course). Cost per GB is obviously low(relative to their competition), performance is high(~105k IOPS @ 90% write in RAID 10 @ sub 1ms latency – roughly 20 fold faster than our existing 3PAR F200 with 80x15k RPM in RAID 5 — yes my workloads are over 90% write from a storage perspective), and they have the mature, battle hardened 3PAR OS (used to be named InformOS) running on it.

June 9, 2014

3PAR June 2014: The 7450 AFA keeps getting better

Filed under: Storage — Tags: , , — Nate @ 3:16 pm

(I don’t know if I need one of those disclaimer things at the top here that says HP paid for my hotel and stuff in Vegas for Discover because I learned about this before I got here and was going to write about it anyway, but in any case know that..)

All about Flash

The 3PAR announcements at HP Discover this week are all about HP 3PAR’s all flash array the 7450, which was announced at last year’s Discover event in Las Vegas. HP has tried hard to convince the world that the 3PAR architecture is competitive even in the new world of all flash. Several of the other big players in storage – EMC, NetApp, and IBM have all either acquired companies specialized in all flash or in the case of NetApp they acquired and have been simultaneously building a new system(apparently called Flash Ray which folks think will be released later in the year).

Dell and HDS, like HP have not decided to do that, instead relying on in house technology for all flash use cases. Of course there have been a ton of all flash startups, all trying to be market disruptors.

So first a brief recap of what HP has done with 3PAR to-date to optimize for all flash workloads:

  • Faster CPUs, doubling of the data cache (7400 vs 7450)
  • Sophisticated monitoring and alerting with SSD wear leveling (alert at 90% of max endurance, force fail the SSD at 95% max endurance)
  • Adaptive Read cache – only read what you need, does not attempt to read ahead because the penalty for going back to the SSD is so small, and this optimizes bandwidth utilization
  • Adaptive write cache – only write what you require, if 4kB of a 16kB page is written then the array only writes 4kB, which reduces wear on the SSD who typically has a shorter life span than that of spinning rust.
  • Autonomic cache offload – more sophisticated cache flushing algorithms (this particular one has benefits for disk-based 3PARs as well)
  • Multi tenant I/O processing – multi threaded cache flushing, and supports both large(typically sequential) and small I/O sizes(typically random) simultaneously in an efficient manor – separates the large I/Os into more efficient small I/Os for the SSDs to handle.
  • Adaptive sparing – basically allows them to unlock hidden storage capacity (upwards of 20%) on each SSD to use for data storage without compromising anything.
  • Optimize the 7xxx platform by leveraging PCI Express’ Message Signal Interrupts which allowed the system to reach a staggering 900,000 IOPS at 0.7 millisecond response times (caveat that is a 100% read workload)

I learned at HP Storage tech day last year that among the features 3PAR was working on was:

  • In line deduplication for file and block
  • Compression
  • File+Object services running directly on 3PAR controllers

There were no specifics given at the time.

Well part of that wait is over.

Hardware-accelerated deduplication

In what I believe is an industry exclusive, somehow 3PAR has managed to find some spare silicon in their now 3-year old Gen4 ASIC to give complete CPU-offloaded inline deduplication for transactional workloads on their 7450 all flash array.

They say that the software will return typically a 4:1 to 10:1 data reduction levels. This is not meant to compete against HP StoreOnce which offers much higher levels of data reduction, this is for transaction processing (which StoreOnce cannot do) and primarily to reduce the cost of operating an all flash system.

It has been interesting to see 3PAR evolve, as a customer of theirs for almost eight years now. I remember when NetApp came out and touted deduplication for transactional workloads and 3PAR didn’t believe in the concept due to the performance hit you would(and they did) take.

Now they have line rate(I believe) hardware deduplication so that argument no longer applies. The caveat, at least for this announcement is this feature is limited to the 7450. There is nothing technically that prevents it from getting to their other Gen4 systems whether it is the 7200, 7400, the 10400, and 10800. But support for those is not mentioned yet, I imagine 3PAR is beating their drum to the drones out there who might be discounting 3PAR still because they have a unified architecture between AFA and hybrid flash/disk and disk-only systems(like mine).

One of 3PAR’s main claims to fame is that you can crank up a lot of their features and they do not impact system performance because most of it is performed by the ASIC, it is nice to see that they have been able to continue this trend, and while it obviously wasn’t introduced on day one with the Gen4 ASIC, it does not require customers wait, or upgrade their existing systems (whenever the Gen5 comes out I’d wager December 2015) to the next generation ASIC to get this functionality.

The deduplication operates using fixed page sizes that are 16kB each, which is a standard 3PAR page size for many operations like provisioning.

For 3PAR customers note that this technology is based on Common Provisioning Groups(CPG). So data within a CPG can be deduplicated. If you opt for a single CPG on your system and put all of your volumes on it, then that effectively makes the deduplication global.

Express Indexing

This is a patented approach which allows 3PAR to use significantly less memory than would be otherwise required to store lookup tables.

Thin Clones

Thin clones are basically the ability to deduplicate VM clones (I imagine this would need hypervisor integration like VAAI)  for faster deployment. So you could probably deploy clones at 5-20x the speed at which you could before.

NetApp here too has been touting a similar approach for a few years on their NFS platform anyway.

Two Terabyte SSDs

Well almost 2TB, coming in at 1.9TB, these are actually 1.6TB cMLC SSDs  but with the aforementioned adaptive sparing it allows 3PAR to bump the usable capacity of the device way up without compromising on any aspect of data protection or availability.

These appear to be Sandisk Lightning Read-intensive SSDs. I found that info here(PDF).

Info on ordering Sandisk SSDs for HP 3PAR

Info on ordering Sandisk SSDs for HP 3PAR

I also quote aforementioned PDF

The 1920GB is available only in the StoreServ 7450 until the end of September 2014.

It will then be available in other models as well October 2014.

New 1.6TB SSD I believe comes from Sandisk

New 1.6TB SSD I believe comes from Sandisk, not the fastest around but certainly a bunch faster than a 15k RPM disk!

These SSDs come with a five year unconditional warranty, which is better than the included warranty on disks on 3PAR(three year). This 5-year warranty is extended to the 480GB and 920GB MLC SSDs as well. Assuming Sandisk is indeed the supplier as they claim the 5-year warranty exceeds the manufacturer’s own 3-year warranty.

These are technically consumer grade however HP touts their sophisticated flash features that make the media effectively more reliable than it otherwise might be in another architecture, and that claim is backed by the new unconditional warranty.

These are much more geared towards reads vs writes, and are significantly lower cost on a per GB basis than all previous SSD offerings from HP 3PAR.

The cost impact of these new SSDs is pretty dramatic, with the per GB list cost dropping from about $26.50 this time last year to about $7.50 this year.

These new SSDs allow for up to 460TB of raw flash on the 7450, which HP claims is seven times more than Pure Storage(which is a massively funded AFA startup), and 12 times more than a four brick EMC ExtremIO system.

With deduplication the 7450 can get upwards of 1.3PB of usable flash capacity in a single system along with 900,000 read I/Os with sub millisecond response times.

Dell Compellent about a year or so ago updated their architecture to leverage what they called read optimized low cost SSDs, and updated their auto tiering software to be aware of the different classes of SSDs. There are no tiering enhancements announced today, in fact I suspect you can’t even license the tiering software on a 3PAR 7450 since there is only one “tier” there.

So what do you get when you combine this hardware accelerated deduplication and high capacity low cost solid state media?

Solid state at less than $2/GB usable

HP says this puts solid state costs roughly in line with that of 15k RPM spinning disk. This is a pretty impressive feat. Not a unique one, there are other players out there that have reached the same milestone, but that is obviously not the only arrow 3PAR has in their arsenal.

That arsenal, is what HP believes is the reason you should go 3PAR for your all flash workloads. Forget about the startups, forget about EMC’s Xtrem IO, forget about NetApp Flash Ray, forget about IBM’s TMS flash systems etc etc.

Six nines of availability, guaranteed

HP is now willing to put their money where their mouth is and sign a contract that guarantees six nines of availability on any 4-node 3PAR system (originally thought it was 7450-specific it is not). That is a very bold statement to make in my opinion. This obviously comes as a result of an architecture that has been refined over the past roughly fifteen years and has some really sophisticated availability features including:

  • Persistent ports – very rapid fail over of host connectivity for all protocols in the event of planned or unplanned controller disruption. They have laser loss detection for fibre channel as well which will fail over the port if the cable is unplugged. This means that hosts do not require MPIO software to deal with storage controller disruptions.
  • Persistent cache – rapid re-mirroring of cache data to another node in the event of planned or unplanned controller disruption. This prevents the system from going into “write through” mode which can otherwise degrade performance dramatically. The bulk of the 128 GB of cache(4-node) on a 7450 is dedicated to writes (specifically optimizing I/O for the back end for the most efficient usage of system resources).
  • The aforementioned media wear monitoring and proactive alerting (for flash anyway)

They have other availability features that span systems(the guarantee does not require any of these):

I would bet that you’ll have to follow very strict guidelines to get HP to sign on the dotted line, no deviation from supported configurations. 3PAR has always been a stickler for what they have been willing to support, for good reason.

Performance guaranteed

HP won’t sign on a dotted line for this, but with the previously released Priority Optimization, customers can guarantee their applications:

  • Performance minimum threshold
  • Performance maximum threshold (rate limiting)
  • Latency target

In a very flexible manor, these capabilities (combined) I believe are still unique in the industry (some folks can do rate limiting alone).

Online import from EMC VNX

This was announced about a month or so ago, but basically HP makes it easy to import data from a EMC VNX without any external appliances, professional services, or performance impact.

This product (which is basically a plugin written to interface with VNX/CX’s SMI-S management interface) went generally available I believe this past Friday. I watched a full demo of it at Discover and it was pretty neat. It does require direct fibre channel connections between the EMC and 3PAR system, and it does require (in the case of Windows anyway that was the demo) two outages on the server side:

  • Outage 1: remove EMC Powerpath  – due to some damage Powerpath leaves behind you must also uninstall and re-install the Microsoft MPIO software using the standard control panel method. These require a restart.
  • Outage 2: Configure Microsoft MPIO to recognize 3PAR (requires restart)

Once the 2nd restart is complete the client system can start using the 3PAR volumes as the data migrates in the background.

So online import may not be the right term for it, since the system does have to go down at least in the case of Windows configuration.

The import process currently supports Windows and Linux. The product came about as a result of I believe the end of life status of the MPX 2000 appliance which HP had been using to migrate data. So they needed something to replace that functionality so they were able to leverage the Peer Motion technology on 3PAR already that was already used for importing data from HP EVA storage and extend it to EMC.  They are evaluating the possibility of extending this to more platforms – I guess VNX/CX was an easy first target given there are a lot of those out there that are old and there isn’t an easier migration path than the EMC to 3PAR import tool (which is significantly easier and less complex apparently than EMC’s options). One of the benefits that HP touts of their approach is it has no impact on host performance as the data goes directly between the arrays.

The downside to this direct approach is the 7000-series of 3PAR arrays are very limited in port counts, especially if you happen to have iSCSI HBAs in them(as did the F and E classes before them). 10000-series has a ton of ports though. I learned last year at HP storage tech day, was HP was looking at possibly shrinking the bigger 10000-series controllers for the next round of mid range systems (rather than making the 7000-series controllers bigger) in an effort to boost expansion capacity. I’d like to see at least 8 FC-host and 2 iSCSI Host per controller on mid range. Currently you get only 2 FC-host if you have a iSCSI HBA installed in a 7000-series controller.

The tool is command line based, there are a few basic commands it does and it interfaces directly with the EMC and 3PAR systems.

The tool is free of charge as far as I know, and while 3PAR likes to tout no professional services required HP says some customers may need assistance (especially at larger scales) planning migrations, if this is the case then HP has services ready to hold your hand through every step of the way.

What I’d like to see from 3PAR still

Read and write SSD caches

I’ve talked about it for years now, but still want to see a read (and write – my workloads are 90%+ write) caching system that leverages high endurance SSDs on 3PAR arrays. HP announced SmartCache for Proliant Gen8 systems I believe about 18 months ago with plans to extend support to 3PAR but that has not yet happened. 3PAR is well aware of my request, so nothing new here. David did mention that they do want to do this still, no official timelines yet. Also it sounded like they will not go forward with the server-side SmartCache integration with 3PAR (I’d rather have the cache in the array anyway and they seem to agree).

David Scott - SVP and GM of all of HP Storage talking with us bloggers in the lounge

David Scott – SVP and GM of all of HP Storage talking with us bloggers in the lounge

3PAR 7450 SPC-1

I’d like to see SPC-1 numbers for the 7450 especially with these new flash media, it ought to provide some pretty compelling cost and performance numbers. You can see some recent performance testing that was done(that wasn’t 100% read) on a four node 7450 on behalf of HP.

Demartek also found that the StoreServ 7450 was not very heavily taxed with a single OLTP database accessing the array. As a result, we proceeded to run a combination of database workloads including two online transaction processing (OLTP) workloads and a data warehouse workload to see how well this storage system would handle a fairly heavy, mixed workload.

HP says the lack of SPC-1 comes down to priorities, it’s a decent amount of work to do the tests and they have had people working on other things, they still intend to do them but not sure when it will happen.

Compression

Would like to see compression support, not sure whether or not that will have to wait for Gen5 ASIC or if there are more rabbits hiding in the Gen4 hat.

Feature parity

Certainly want to see deduplication come to the other Gen4 platforms. HP touts a lot about the competition’s flash systems being silos. I’ll let you in on a little secret – the 3PAR 7450 is a flash silo as well. Not a technical silo, but one imposed on the product by marketing. While the reasons behind it are understandable it is unfortunate that HP feels compelled into limiting the product to appease certain market observers.

Faster CPUs

I was expecting a CPU refresh on the 7450, which was launched with an older generation of processor because HP didn’t want to wait for Intel’s newest chip to launch their new storage platform. I was told last year the 7450 is capable of operating with the newer chip, so it should just be a matter of plugging it in and doing some testing. That is supposed to be one of the benefits of using x86 processors is you don’t need to wait years to upgrade. HP says the Gen4 ASIC is not out of gas, the performance numbers to-date are limited by the CPU cores in the system, so faster CPUs would certainly benefit the system further without much cost.

No compromises

At the end of the day the 3PAR flash story has evolved into one of no compromises. You get the low cost flash, you get the inline hardware accelerated deduplication, you get the high performance with multitenancy and low latency, you get all of that and your not compromising on any other tier 1 capabilities(too many to go into here, you can see past posts for more info). Your getting a proven architecture that has matured over the past decade, a common operating system, the only storage platform that leverages custom ASICs to give uncompromising performance even with the bells & whistles turned on.

The only compromise here is you had to read all of this and I didn’t give you many pretty pictures to look at.

February 21, 2014

NetApp’s latest mid range SPC-1

Filed under: Storage — Tags: , , — Nate @ 4:14 pm

NetApp came out with their latest generation of storage systems recently and they were good citizens and promptly published SPC-1 results for them.

When is clustering, clustering?

NetApp is running the latest Ontap 8.2 in cluster mode I suppose, though there is only a single pair of nodes in the tested cluster. I’ve never really considered this a real cluster, it’s more of a workgroup of systems. Volumes live on a controller (pair) and can be moved around if needed, they probably have some fancy global management thing for the “cluster” but it’s just a collection of storage systems that are only loosely integrated with each other. I like to compare the NetApp style of clustering to a cluster of VMware hosts (where the VMs would be the storage volumes).

This strategy has it’s benefits as well, the main one being less likelihood that the entire cluster could be taken down by a failure(normally I’d consider this failure to be triggered by a software fault). This is the same reason why 3PAR has elected to-date to not go beyond 8-nodes in their cluster, the risk/return is not worth it in their mind. In their latest generation of high end boxes 3PAR decided to double up the ASICs to give them more performance/capacity rather than add more controllers, though technically there is nothing stopping them from extending the cluster further(to my knowledge).

The downside to workgroup style clustering is that optimal performance is significantly harder to obtain.

3PAR clustering is vastly more sophisticated and integrated by comparison. To steal a quote from their architecture document –

The HP 3PAR Architecture was designed to provide cost-effective, single-system scalability through a cache-coherent, multi-node, clustered implementation. This architecture begins with a multi-function node design and, like a modular array, requires just two initial Controller Nodes for redundancy. However, unlike traditional modular arrays, an optimized interconnect is provided between the Controller Nodes to facilitate Mesh-Active processing. With Mesh-Active controllers, volumes are not only active on all controllers, but they are autonomically provisioned and seamlessly load-balanced across all systems resources to deliver high and predictable levels of performance. The interconnect is optimized to deliver low latency, high-bandwidth communication and data movement between Controller Nodes through dedicated, point-to-point links and a low overhead protocol which features rapid inter-node messaging and acknowledgement.

Sounds pretty fancy right? It’s not something that is for high end only. They have extended the same architecture down as low as a $25,000 entry level price point on the 3PAR 7200 (that price may be out of date, it’s from an old slide).

I had the opportunity to ask what seemed to be a NetApp expert on some of the finer details of clustering in Ontap 8.1 (latest version is 8.2) a couple of years ago and he provided some very informative responses.

Anyway on to the results, after reading up on them it was hard for me not to compare them with the now five year old 3PAR F400 results.

Also I want to point out that the 3PAR F400 is End of Life, and is no longer available to purchase as new as of November 2013 (support on existing systems continues for another couple of years).

MetricNetApp
FAS8040
3PAR
F400
Date tested2/19/20144/27/2009
Controller
Count
24
(hey, it's an actual cluster)
SPC-1 IOPS86,07293,050
SPC-1 Usable
Capacity
32,219 GB
(RAID 6)
27,046 GB
(RAID 1)
Raw
Capacity
86,830 GB56,377 GB
SPC-1 Unused
Storage
Ratio
(may not exceed 45%)
41.79%0.03%
Tested storage
configuration
pricing
$495,652$548,432
SPC-1 Cost
per IOP
$5.76$5.89
Disk size and
number
192 x 450GB 10k RPM384 x 146GB 15k RPM
Data Cache64GB data cache
1,024GB Flash cache
24GB data cache

I find the comparison fascinating myself at least. It is certainly hard to compare the pricing, given the 3PAR results are five years old, the 3PAR mid range pricing model has changed significantly with the introduction of the 7000 series in late 2012.  I believe the pricing 3PAR provided SPC-1 was discounted(I can’t find indication either way, I just believe that based on my own 3PAR pricing I got back then) vs NetApp is list(says so in the document). But again, hard to compare pricing given the massive difference in elapsed time between tests.

Unused storage ratio

What is this number and why is there such a big difference? Well this is a SPC-1 metric and they say in the case of NetApp:

Total Unused Capacity (36,288.553 GB) divided by Physical Storage Capacity (86.830.090 GB) and may not exceed 45%.

A unused storage ratio of 42% is fairly typical for NetApp results.

In the case of 3PAR, you have to go to the bigger full disclosure document(72 pages), as the executive summary has evolved more over time and that specific quote is not in the 3PAR side of things.

So for 3PAR F400 SPC says:

The Physical Storage Capacity consisted of 56,377.243 GB distributed over 384 disk drives each with a formatted capacity of 146.816 GB. There was 0.00 GB (0.00%) of Unused Storage within the Physical Storage Capacity. Global Storage Overhead consisted of 199.071 GB (0.35%) of Physical Storage Capacity. There was 61.203 GB (0.11%) of Unused Storage within the Configured Storage Capacity. The Total ASU Capacity utilized 99.97% of the Addressable Storage Capacity resulting in 6.43 GB (0.03%) of Unused Storage within the Addressable Storage Capacity.

3PAR F400 Storage Hierarchy Ratios

3PAR F400 Storage Hierarchy Ratios

The full disclosure document is not (yet) available for NetApp as of 2/21/2014. It most certainly will become available at some point.

The metrics above and beyond the headline numbers is one of the main reasons I like SPC-1.

With so much wasted space on the NetApp side it is confusing to me why they don’t just use RAID 1 (I think the answer is they don’t support it).

Benefits from cache

The NetApp system is able to leverage it’s terabyte of flash cache to accelerate what is otherwise a slower set of 10k RPM disks, which is nice for them.

They also certainly have much faster CPUs, and more than double the data cache (3PAR’s architecture isolates data cache from the operating system, so I am not sure how much memory on the NetApp side is actually used for data cache vs operating system/meta data etc). 3PAR by contrast has their proprietary ASIC which is responsible for most of the magic when it comes to data processing on their systems.

3PAR does not have any flash cache capabilities so they do require (in this comparison) double the spindle count to achieve the same performance results. Obviously in a newer system configuration 3PAR would likely configure a system with SSDs and sub LUN auto tiering to compensate for the lack of a flash based cache. This does not completely completely compensate however, and of course I have been hounding 3PAR and HP for at least four years now to develop some sort of cache technology that leverages flash. They announced SmartCache in December 2012 (host-based SSD caching for Gen8 servers) however 3PAR integration has yet to materialize.

However keep in mind the NetApp flash cache only accelerates reads. If you have a workload like mine which is 90%+ write the flash cache doesn’t help.

Conclusion

NetApp certainly makes good systems, they offer a lot of features, and have respectable performance. The systems are very very flexible and they have a very unified product line up (same software runs across the board).

For me personally after seeing results like this I feel continually reassured that the 3PAR architecture was the right choice for my systems vs  NetApp (or other 3 letter storage companies).  But not everyone’s priorities are the same. I give NetApp props for continuing to support SPC-1 and being public with their numbers. Maybe some day these flashy storage startups will submit SPC-1 results…….not holding my breath though.

December 9, 2013

3PAR: Faster, bigger, better

Filed under: Storage — Tags: — Nate @ 11:53 am

First off, sorry about the lack of posts, there just hasn’t been very much in tech that has inspired me recently. I’m sure part of the reason is my job has been fairly boring for a long time now, so I’m not being exposed to a whole lot of stuff. I don’t mind that trade off for the moment – still a good change of pace compared to past companies. Hopefully 2014 will be a more interesting year.

3PAR still manages to create some exciting news every now and then and they seem to be on a 6-month release cycle now, far more aggressive than they were pre acquisition. Of course now they have far more resources. Their ability to execute really continues to amaze me, whether it is on the sales or on the technology side. I think technical support still needs some work though. In theory that aspect of things should be pretty easy to fix it’s just a matter of spending the $$ to get more good people. All in all though they’ve done a pretty amazing job at scaling 3PAR up, basically they are doing more than 10X the revenue they had before acquisition in just a matter of a few short years.

This all comes from HP Discover – there is a bit more to write about but per usual 3PAR is the main point of interest for myself.

Turbo-charging the 3PAR 7000

Roughly six months ago 3PAR released their all-flash array the 7450. Which was basically a souped up 7400 with faster CPUs, double the memory, optimized software for SSDs and a self imposed restriction that they would only sell it with flash(no spinning rust).

3PAR's Performance and cost improvements for SSDs Dec 2013

3PAR’s Performance and cost improvements for SSDs Dec 2013

At the time they said they were still CPU bound and that their in house ASIC was nowhere near being taxed to the limit. Simultaneously they could not put more (or more powerful) CPUs in the chassis due to cooling restraints in the relatively tiny 2U package that a pair of controllers come in.

Given the fine grained software improvements they released earlier this year I (along with probably most everyone else) was not expecting that much more could be done. You can read in depth details, but highlights included:

  • Adaptive read caching – mostly disabling read caching for SSDs, at the same time disabled prefetching of other blocks. SSDs are so fast that there is little benefit to doing either. Not caching reads to SSDs has a benefit of dedicating more of the cache to writes.
  • Adaptive write caching – with disks 3PAR would write an entire 16kB block to disk because there is no penalty for doing so. With SSDs they are much more selective in only writing the small blocks that changed, they will not write 16kB if only 4kB has changed because there is no penalty with SSDs like there are with disks.
  • Autonomic cache offload – More sophisticated cache management algorithms
  • Multi tenant improvements – Multi threaded cache flushing, breaking up large sequential I/O requests into smaller chunks for the SSDs to ingest at a faster rate. 3PAR has always been about multi tenancy.

Net effect of all of these are more effective IOPS and throughput, more efficiency as well.

With these optimizations, the 7450 was rated at roughly 540,000 IOPS @ 0.6ms read latency (100% read). I guesstimated based on the SPC-1 results from the 7400 that a 7450 could perhaps reach around 410,000 IOPS. Just a guess though..

So imagine my surprise when they come out and say the same system with the same CPUs, memory etc is now performing at a level of 900,000 IOPS with a mere 0.7 milliseconds of latency.

The difference? Better software.

Mid range I/O scalability

Storage
Array
3PAR F200
(2-node)
[EndOfLife]
3PAR
F400
(4-node)
[EndOfLife]
3PAR
7200
(2-node)
3PAR
7400
(4-node)
3PAR
7450
(4-node)
100% Random Read
(Backend, between
disks and controllers)
34,40076,800150,000320,000N/A
100%
Random
Read IOPS
(Front end between
hosts and controllers)
[before
MSI-X
upgrade]
N/AN/A
N/AN/A540,000
100%
Random
Read
(Front end between
hosts and controllers)
[after
MSI-X]
Not possibleNot possibleNot
Yet
Available
Not
Yet
Available
900,000
SPC-1
I/O
~45,000
(guesstimate)
93,050~100,000
(guesstimate)
258,000~700,000
(guesstimate)
100%
Random
Read
Throughput
(Backend, between
disks and controllers)
1.3GB/s2.6GB/s2.5GB/s4.8GB/s5.5GB/s
100%
Random
Read
Throughput
(Front end between
hosts and controllers)
N/AN/AN/AN/AN/A

Stop interrupting me

What allowed 3PAR to reach this level of performance is by leveraging a PCI-express feature called Message Signaled Interrupts, or MSI-X which Wikipedia describes as:

MSI-X (first defined in PCI 3.0) permits a device to allocate up to 2048 interrupts. The single address used by original MSI was found to be restrictive for some architectures. In particular, it made it difficult to target individual interrupts to different processors, which is helpful in some high-speed networking applications. MSI-X allows a larger number of interrupts and gives each one a separate target address and data word. Devices with MSI-X do not necessarily support 2048 interrupts but at least 64 which is double the maximum MSI interrupts.

(tangent time)

I’m not a hardware guy to this depth for sure. But I did immediately recognize MSI-X from a really complicated troubleshooting process I went through several years ago with some Broadcom network chips on Dell R610 servers (though the issue wasn’t Dell specific). It ended up being a bug with how the Broadcom driver was handling(or not) MSI-X (Redhat bug here). It took practically a year of (off and on) troubleshooting before I came across that bug report. The solution was to disable MSI-X via a driver option (which apparently the Dell supplied drivers came with by default, the OS-supplied drivers did not have that disabled by default).

(end tangent)

So some fine grained kernel work improving interrupts gave them a 1.6 fold improvement in performance.

This performance enhancement applies to the SAS-based 3PAR 7000-series only, the 10000-series had equivalent functionality already in place, and the previous generations (F/T platforms) are PCI-X based(and I believe are all in their end of life phases), and this is a PCI Express specific optimization. I think this level of optimization might really only help SSD workloads as they push the controllers to the limit, unlike spinning rust.

This optimization also reduces the latency on the system by 25%, because the CPU is no longer being interrupted nearly as often it can no only do more work but do the work faster too.

Give me more!

There are several capacity improvements here as well.

New SSDs

There are new 480GB and 920GB SSDs available, which takes the 3PAR 4-node 7400/7450 to a max raw capacity of 220TB (up from 96TB) on up to 240 SSDs.

Bigger entry level

The 3PAR 7200’s spindle capacity is being increased by 60% – from 144 drives to 240 drives. The 7200 is equipped with only 8GB of data cache (4GB per controller – it is I believe the first/only 3PAR system with more control cache than data cache), though it still makes a good low cost bulk data platform with support for up to 400TB of raw storage behind two controllers(which is basically the capacity of the previous generation’s 4-node T400 which had 48GB of data cache, 16GB of control cache, 24 CPU cores, 4 ASICs — obviously the T400 had a much higher price point!).

4TB Drives

Not a big shocker here just bigger drives – 4TB Nearline SAS is now supported across the 7k and 10k product lines, bringing the high end 10800 array to support 3.2PB of raw capacity, and the 7400 sporting up to 1.1PB now. These drives are obviously 3.5″ so on the 7000 series you’ll need the 3.5″ drive cages to use them – the 10k line uses 3PAR’s custom enclosures which support both 2.5″ and 3.5″ (though for 2.5″ drives the enclosures are not compact like they are on 7k).

I was told at some point that the 3PAR OS would start requiring RAID 6 on volumes that were on nearline drives at some point – perhaps that point is now(I am not sure). I was also told you would be able to override this at an OS level if you wish, the parallel chunklet architecture recovers from failures far faster than competing architectures. Obviously with the distributed architecture on 3PAR you are not losing any spindles to dedicated spares nor dedicated parity drives.

If you are really paranoid about disk failures you can on a per-volume basis if you wish use quadruple mirroring on a 3PAR system – which means you can lose up to 75% of the disks in the system and still be OK on those volume(s).

3PAR also uses dynamic sparing –  if the default spare reserve space runs out, and you have additional unwritten capacity(3PAR views capacity as portions of drives, not whole drives) the system can sustain even more disk failures without data loss or additional overhead of re-configuration etc.

Like almost all things on 3PAR the settings can be changed on the fly without application impact and without up front planning or significant effort on the part of the customer.

More memory

The 3PAR 10400 has received a memory boost – doubling it’s memory configuration from the original configuration. Basically it seems like they decided it was a better idea to unify the 10800 and 10400 controller configurations, though the data sheet seems to have some typos in it(pending clarification). I believe the numbers are 96GB of cache per controller (64GB data, 32GB control), giving a 4-node system 384GB of memory.

Compare this to the  7400 which has 16GB of cache per controller (8GB data, 8GB control) giving a 4-node system 64GB of memory. The 10400 has six times the cache, and still supports 3rd party cabinets.

Now if they would just double the 7200 and 7400’s memory that would be nice 🙂

Keeps getting better

Multi tenant improvements

Six months ago 3PAR released their storage quality of service software offering called Priority Optimization. As mentioned before 3PAR has always been about multi tenancy, and due to their architecture they have managed to do a better job at it than pretty much anyone else. But it still wasn’t perfect obviously – there was a need for real array based QoS. They delivered on that earlier this year and now have announced some significant improvements on that initial offering.

Brief recap of what their initial release was about – you were able to define both IOP and bandwidth threshold levels for a particular volume(or group of volumes), the system would respond basically in real time to throttle the workload if it exceeded that level. 3PAR has tons of customers that run multi tenant configurations so they went further in being able to define both a customer as well as an application.

Priority Optimization

Priority Optimization

So as you can see from the picture above, the initial release allowed you to specify say 20,000 IOPS for a customer, and be able to over provision IOPS for individual applications that customer uses, allowing for maximum flexibility, efficiency and control at the same time.

So the initial release was all about basically rate limiting workloads on a multi tenant system. I suppose you could argue that there wasn’t a lot of QoS it was more rate limiting.

The new software is more QoS oriented – going beyond rate limiting they now have three new capabilities:

  • Allows you to specify a performance minimum threshold for a given application/customer
  • Allows you to specify a latency target for a given application
  • Using 3PAR’s virtual domains feature(basically carve a 3PAR up into many different virtual arrays for service providers) you can now assign a QoS to a given virtual domain! That is really cool.
3PAR Priority Optimization: true array based QoS

3PAR Priority Optimization: true array based QoS

Like almost everything 3PAR – configuring this is quite simple and does not require professional services.

3PAR Replication: M to N

With the latest software release 3PAR now supports M to N topologies for replication. Before this they supported 1 to 1, as well as synchronous long distance replication

3PAR Synchronous long distance replication: unique in the mid range

3PAR Synchronous long distance replication: unique in the mid range

All configurable via point and click interface no less, no professional services required.

New though is M to N.

Five 3PAR storage arrays in an any-any bi-directional M:N replication scheme

Five 3PAR storage arrays in an any-any bi-directional M:N replication scheme

Need Bigger? How about nine arrays all in some sort of replication party? That’s a lot of arrows.

*NINE* 3PAR storage arrays in an any-any bi-directional M:N replication scheme

*NINE* 3PAR storage arrays in an any-any bi-directional M:N replication scheme

3PAR Multi array replication

3PAR Multi array replication

More scalable replication

On top of the new replication topology they’ve also tripled(or more) the various limits around the maximum number of volumes that can be replication in the various modes. A four node 3PAR can now replicate up to a maximum of 6,000 volumes in asynchronous mode and 2,400 volumes in synchronous mode.

You can also run up to 32 remote copy fibre channel links per system and up to 8 remote copy over IP links per system (RCIP links are dedicated 1GbE ports on each controller).

Peer motion enhancements

Peer motion is 3PAR’s data mobility package which allows you to transparently move volumes between arrays. It came out a few years ago primarily as a means to provide ease of migration/upgrade between 3PAR systems, and was later extended to support EVA->3PAR migrations.  HP’s StoreVirtual platform also does peer motion, though as far as I know it is not yet directly inter-operable with 3PAR. Not sure if it ever will be.

Anyway like most sophisticated things there are always caveats – the most glaring of which in peer motion is they did not support SCSI reservations. Which basically means you couldn’t use peer motion with VMware or other clustering software. With the latest software that limitation has been removed! VMware, Microsoft and Redhat clustering are all supported now.

Persistent port enhancements

Persistent ports is an availability feature 3PAR introduced about a year ago which basically leverages NPIV at the array level – it allows a controller to assume the Fibre Channel WWNs of it’s peer in the event the peer goes offline. This means fail over is much faster, and it removes the dependency of multi pathing software to provide for fault tolerance. That’s not to say that you should not use MPIO software you still should if for nothing else other than better distribution of I/O across multiple HBAs, ports and controllers. But the improved recovery times are a welcome plus.

3PAR Persistent ports - transparent fail over for FC/iSCSI/FCoE without MPIO

3PAR Persistent ports – transparent fail over for FC/iSCSI/FCoE without MPIO

So what’s new here?

  • Added support for FCoE and iSCSI connections
  • Laser loss detection – in the event a port is disconnected persistent ports kick in (don’t need to have a full controller failure)
  • The speed at which the fail over kicks in has been improved

Combine Persistent Ports with 3PAR Persistent cache on a 4-8 controller system and you have some pretty graceful fail over capabilities.

3PAR Persistent Cache mirrors cache from a degraded controller pair to another pair in the cluster automatically.

3PAR Persistent Cache mirrors cache from a degraded controller pair to another pair in the cluster automatically.

3PAR Persistent Cache was released back in 2010 I believe, no updates here, just put the reference here for people that may not know what it is since it is a fairly unique ability to have especially in the mid range.

More Secure

Also being announced is a new set of FIPS 140-2 validated self encrypting drives with sizes ranging from 450GB 10k to 4TB nearline.

3PAR also has a 400GB SSD encrypting drive as well though I don’t see any mention of FIPS validation on that unit.

3PAR arrays can either be encrypted or not encrypted – they do not allow you to mix/match. Also once you enable encryption on a 3PAR array it cannot be disabled.

I imagine you probably aren’t allowed to use Peer Motion to move data from an encrypted to a non encrypted system? Same goes for replication ? I am not sure, I don’t see any obvious clarifications in the docs.

Adaptive Sparing

SSDs, like hard drives all come with a chunk of hidden storage set aside for when blocks wear out or go bad, the disk transparently re-maps from this spare pool. I think SSDs take it to a new level with their wear leveling algorithms.

Anyway, 3PAR’s Adaptive sparing basically allows them to utilize some of the storage from this otherwise hidden pool on the SSDs. The argument is 3PAR is already doing sparing at the sub-disk (chunklet) level, if a chunklet fails then it is reconstructed on the fly – much like a SSD would do to itself if a segment of flash went bad. If too many chunklets fail over time on a disk/SSD the system will pro actively fail the device.

3PAR Adaptive Sparing: gain capacity without sacrificing availbility

3PAR Adaptive Sparing: gain capacity without sacrificing availbility

At the end of the day the customer gets more usable capacity out of the system without sacrificing any availability. Given the chunklet architecture I think this approach is probably going to be a fairly unique capability.

Lower cost SSDs

Take Adaptive sparing, and combine it with the new SSDs that are being released and you get SSD list pricing(on a per GB basis) which is reduced by 50%. I’d really love to see an updated SPC-1 for the 7450 with these new lower cost devices(plus MSI-X enhancements of course!), I’d be surprised if they weren’t working on one already.

Improved accessibility

3PAR came out with their first web services API a year ago. They’ve since improved upon that, as well as adding enhancements for Openstack Havana (3PAR was the reference implementation for Fibre Channel in Openstack).

They’ve also added management/monitoring tools for both IOS and Android (looks at HP Pre3 and cries in his mind).

HP 3PAR Storefront: Monitor your 3PAR from your mobile device

HP 3PAR Storefront: Monitor your 3PAR from your mobile device

Screaming sales

3PAR is continuing to kick butt in the market place with their 7000-series, with El Reg reporting that their mid range products have had 300% year over year increases in sales and they have overtaken IBM and NetApp in market share to be #2 behind EMC (23% vs 17%).

This might upset the ethernet vendors but they also report that fibre channel is the largest and fastest growing storage protocol in the mid range space(at least year over year), I’m sure again largely driven by 3PAR who historically has been a fibre channel system. Fibre channel has 50% market share with 49% year over year growth.

What’s missing

Well the elephant in the room that is still not here is some sort of SSD-based caching. HP went so far as to announce something roughly a year ago with their SmartCache technology for Gen8 systems, though they opted not to mention much on that this time around. It’s something I have hounded 3PAR for the past four years to get going, I’m sure they are working on something……

Also I would like to see them support, or at least explain why they might not support, the Seagate Enterprise Turbo SSHD – which is a hybrid drive providing 32GB of eMLC flash cache in front of what I believe is an otherwise 10k RPM 300-600GB disk with self proclaimed upwards of 3X improvement in random I/O over 15k disks. There’s even a FIPS 140-2 model available. I don’t know what the price point of this drive is but find it hard to believe that it’s not a cost effective alternative to flash tiering when you do not have a flash-based cache to work off of.

Lastly I would like to see some sort of automatic workload load balancing with Peer motion – as far as I know that does not yet exist. Though moving TBs of data around between arrays is not something to be taken lightly anyway!

August 2, 2013

HP Storage Tech Day – bits and pieces

Filed under: Storage — Tags: , , — Nate @ 9:56 am

Travel to HP Storage Tech Day/Nth Generation Symposium was paid for by HP; however, no monetary compensation is expected nor received for the content that is written in this blog.

For my last post on HP Storage tech day, the remaining topics that were only briefly covered at the event.

HP Converged Storage Management

There wasn’t much here other than a promise to build YASMT (Yet another storage management tool), this time it will be really good though. HP sniped at EMC on several occasions for the vapor-ness of ViPR. Though at least that is an announced product with a name. HP has a vision, no finalized name, no product(I’m sure they have something internally) and no dates.

I suppose if your in the Software defined storage camp which is for the separation of data and control plane, this may be HP’s answer to that.

HP Converged Storage Management Strategy

HP Converged Storage Management Strategy

The vision sounds good as always, time will tell if they can pull it off. The track record for products like this is not good. More often than not the tools lower the bar on what is supported to some really basic set of things, and are not able to exploit the more advanced features of the platform under management.

One question I did ask is whether or not they were going to re-write their tools to leverage these new common APIs, and the answer was sort of what I expected – they aren’t. At least short term the tools will use a combination of these new APIs as well as whatever methods they use today. So this implies that only a subset of functionality will be available via the APIs.

In contrast I recall reading something, perhaps a blog post, about how NetApp’s tools use all of their common APIs(I believe end to end API stuff for them is fairly recent). HP may take a couple of years to get to something like that.

HP Openstack Integration

HP is all about the Openstack. They seem to be living and breathing it. This is pretty neat, I think Openstack is a good movement, though the platform still seems some significant work to mature.

I have concerns, short term concerns about HP’s marketing around Openstack and how easy it is to integrate into customer environments. Specifically Openstack is a fast moving target, lacks maturity and at least as recently as earlier this year lacked a decent community of IT users (most of it was centered on developers – probably still is). HP’s response is they are participating deeply within the community (which is good long term), and are being open about everything (also good).

I specifically asked if HP was working with Red Hat to make sure the latest HP contributions (such as 3PAR support, Fibre Channel support) were included in the RH Open Stack. They said no, they are working with the community, and not partners. This is of course good and bad. Good that they are being open, bad in that it may result in some users not getting things for 12-24 months because the distribution of Openstack they chose is too old to support it.

I just hope that Openstack matures enough that it gets a stable set of interfaces. Unlike say the Linux kernel driver interfaces which just annoy the hell out of me(have written about that before). Compatibility people!!!

Openstack Fibre Channel support based on 3PAR

HP wanted to point out that the Fibre Channel support in Openstack was based on 3PAR. It is a generic interface and there are plugins for a few different array types. 3PAR also has iSCSI support for Openstack as of a recent 3PAR software release as well.

StoreVirtual was first Openstack storage platform

Another interesting tidbit is that Storevirtual was the first(?) storage platform to support Openstack. Rackspace used it(maybe still does, not sure), and contributed some stuff to make it better. HP also uses it in their own public cloud(not sure if they mentioned this or not but I heard from a friend who used to work in that group).

HP Storage with Openstack

Today HP integrates with Openstack at the block level on both the StoreVirtual and 3PAR platforms. Work is in progress for StoreAll which will provide file and object storage. Fibre channel support is available on the 3PAR platform only as far as HP goes. StoreVirtual supports Fibre Channel but not with Openstack(yet anyway, I assume support is coming).

This contrasts with the competition, most of whom have no Openstack support and haven’t announced anything to be released anytime soon. HP certainly has a decent lead here, which is nice.

HP Openstack iSCSI/FC driver functionality

All of HP’s storage work with Openstack is based on the Grizzly release which came out around April 2013.

  • Create / Delete / Attach / Detach volumes
  • Create / Delete Snapshots
  • Create volume from snapshot
  • Create cloned volumes
  • Copy image to volume / Copy volume to image (3PAR iSCSI only)

New things coming in Havana release of Openstack from HP Storage

  • Better session management within the HP 3PAR StoreServ Block Storage Drivers
  • Re-use of existing HP 3PAR Host entries
  • Support multiple 3PAR Target ports in HP 3PAR StoreServ Block Storage iSCSI Driver
  • Support Copy Volume To Image & Copy Image To Volume with Fibre Channel Drivers (Brick)
  • Support Quality of Service (QoS) setting in the HP 3PAR StoreServ Block Storage Drivers
  • Support Volume Sets with predefined QoS settings
  • Update the hp3parclient that is part of the Python Standard Library

Fibre channel enhancements for Havana and beyond

Fibre Channel enhancements for Openstack Havana and beyond

Fibre Channel enhancements for Openstack Havana and beyond

Openstack portability

This was not at the Storage Tech Day – but I was at a break out session that talked about HP and Openstack at the conference and one of the key points they hit on was the portable nature of the platform, run it in house, run it in cloud system, run it at service providers and move your workloads between them with the same APIs.

Setting aside a moment the fact that the concept of cloud bursting is a fantasy for 99% of organizations out there(your applications have to be able to cope with it, your not likely going to be able to scale your web farm and burst into a public cloud when those web servers have to hit databases that reside over a WAN connection the latency hit will make for a terrible experience).

Anyway setting that concept aside – you still have a serious problem- short term of compatibility across different Openstack implementations because different vendors are choosing different points to base their systems off of. This is obviously due to the fast moving nature of the platform and when the vendor decides to launch their project.

This should stabilize over time, but I felt the marketing message on this was a nice vision, it just didn’t represent any reality I am aware of today.

HP contrasted this to being locked in to say the vCloud API. I think there are more clouds public and private using vCloud than Openstack at this point. But in any case I believe use cases for the common IT organization to be able to transparently leverage these APIs to burst on any platform- VMware, Openstack, whatever – is still years away from reality.

If you use Openstack’s API, you’re locked into their API anyway. I don’t leverage APIs myself(directly) I am not a developer – so I am not sure how easy it is to move between them. I think the APIs are likely much less of a barrier than the feature set of the underlying cloud in question. Who cares if the API can do X and Y, if the provider’s underlying infrastructure doesn’t yet support that capability.

One use case that could be done today, that HP cited, is running development in a public cloud then pulling that back in house via the APIs. Still that one is not useful either. The amount of work involved in rebuilding such an environment internally should be fairly trivial anyway(the bulk of the work should be in the system configuration area, if your using cloud you should also be using some sort of system management tool, whether it is something like CFEngine, Puppet, Chef, or something similar). That and – this is important in my experience – development environments tend to not be resource intensive. Which makes it great to consolidate them on internal resources (even ones that share with production – I have been doing this since for six years already).

My view on Openstack

At least one person at HP I spoke with believes most stuff will be there by the end of this year but I don’t buy that for a second. I look at things like Red Hat’s own Openstack distribution taking seemingly forever to come out(I believe it’s a few months behind already and I have not seen recent updates on it), and Rackspace abandoning their promise to support 3rd party Open stack clouds earlier this year.

All of what I say is based on what I read — I have no personal experience with Openstack (nor do I plan to get immediate experience, the lack of maturity of the product is keeping me away for now). Based on what I have read, conferences(was at a local Red Hat conference last December where they covered Openstack – that’s when reality really hit me and I learned a good deal about it and honestly lost some enthusiasm in the project) and some folks I have chatted/emailed with Openstack is still a VERY much work in progress, evolving quickly. There’s really no formal support community in place for a stable product, developers are wanting to stay at the leading edge and that’s what they are willing to support. Red Hat is off in one corner trying to stabilize the Folsum release from last year to make a product out of it, HP is in another corner contributing code to the latest versions of Openstack that may or may not be backwards compatible with Red Hat or other implementations.

It’s a mess.. it’s a young project still so it’s sort of to be expected. Though there are a lot of folks making noise about it. The sense I get is if you are serious about running an Open Stack cloud today, as in right now, you best have some decent developers in house to help manage and maintain it. When Red Hat comes out with their product, it may solve a bunch of those issues, but still it’ll be a “1.0”, and there’s always some not insignificant risk to investing in that without a very solid support structure inside your organization (Red Hat will of course provide support but I believe that won’t be enough for most).

That being said it sounds like Openstack has a decent future ahead of it – with such large numbers of industry players adopting support for it, it’s really only a matter of time before it matures and becomes a solid platform for the common IT organization to be able to deploy.

How much time? I’m not sure. My best guesstimate is I hope it can reach that goal within five years. Red Hat, and others should be on versions 3 and perhaps 4 by then. I could see someone such as myself starting to seriously dabble in it in the next 12-16 months.

Understand that I’m setting the bar pretty high here.

Last thoughts on HP Storage Tech Day

I had a good time, and thought it was a great experience. They had very high caliber speakers, were well organized and the venue was awesome as well. I was able to drill them pretty good, the other bloggers seemed to really appreciate that I was able to drive some of the technical conversations. I’m sure some of my questions they would of rather not of answered since the answers weren’t always “yes we’ve been doing that forever..!”, but they were honest and up front about everything. When they could not be, they said so(“can’t talk about that here we need a Nate Disclosure Agreement”).

I haven’t dealt much at all with the other internal groups at HP, but I can say the folks I have dealt with on the storage side have all been AWESOME. Regardless of what I think about whatever storage product they are involved with they are all wonderful people both personally and professionally.

These past few posts have been entirely about what happened on Monday.  There are more bits that happened at the main conference on Tues-Thur, and once I get the slides for those slide decks I’ll be writing more about that, there were some pretty cool speakers. I normally steer far clear of such events, this one was pretty amazing though. I’ll save the details for the next posts.

I want to thank the team at HP, and Ivy Worldwide for organizing/sponsoring this event – it was a day of nothing but storage (and we literally ran out of time at the end, one or two topics had to be skipped). It was pretty cool. This is the first event I’ve ever traveled for, and the only event where there was some level of sponsorship (as mentioned HP covered travel, lodging and food costs).

July 31, 2013

HP Storage Tech Day – StoreAll, StoreVirtual, StoreOnce

Filed under: Storage — Tags: , , , , — Nate @ 5:31 pm

Travel to HP Storage Tech Day/Nth Generation Symposium was paid for by HP; however, no monetary compensation is expected nor received for the content that is written in this blog.

On Monday I attended a private little HP Storage Tech Day event here in Anaheim for a bunch of bloggers. They also streamed it live, and I believe the video is available for download as well.

I wrote a sizable piece on the 3PAR topics which were covered in the morning, here I will try to touch on the other HP Storage topics.

HP StoreAll + Express Query

StoreAll Platform

HP doesn’t seem to talk about this very much, and as time goes on I have understood why. It’s not a general purpose storage system, I suppose it never has been (though I expect Ibrix tried to make it one in their days of being independent). They aren’t going after NetApp or EMC’s enterprise NAS offerings. It’s not a platform you want to run VMs on top of. Not a platform you want to run databases on top of. It may not even be a platform you want to run HPC stuff on top of. It’s built for large scale bulk file and object storage.

They have tested scalability to 1,024 nodes and 16PB within a single name space. The system can scale higher, that’s just the tested configuration. They say it can scale non disruptively and re-distribute existing data across new resources as those resources are added to the system. StoreAll can go ultra dense with near line SAS drives going up to roughly 500 drives in a rack (without any NAS controllers).

It’s also available in a gateway version which can go in front of 3PAR, EVA and XP storage.

They say their main competition is EMC Isilon, which is another scale-out offering.

HP seems to have no interest in publishing performance metrics, including participating in SPECsfs (a benchmark that sorely lacks disclosures). The system has no SSD support at all.

The object store and file store, if I am remembering right, are not shared. So you have to access your data via a consistent means. You can’t have an application upload data to an object store then give a user the ability to browse to that file using CIFS or NFS. To me this would be an important function to serve if your object and file stores are in fact on the same platform.

By contrast, I would expect a scale out object store to do away with the concept of disk-based RAID and go with object level replication instead. Many large scale object stores do this already. I believe I even read in El Reg that HP Labs is working on something similar(nothing around that was mentioned at the event). In StoreAll’s case they are storing your objects in RAID, but denying you the flexibility to access them over file protocols which is unfortunate.

From a competitive standpoint, I am not sure what features HP may offer that are unique from an object storage perspective that would encourage a customer to adopt StoreAll for that purpose. If it were me I would probably take a good hard look at something like Red Hat Storage server(I would only consider RHSS for object storage, nothing else) or other object offerings if I was to build a large scale object store.

Express Query (below) cannot currently run on object stores at this time, it will with a future release though.

Express Query

This was announced a while back, which is what seems to be a SQL database of sorts that is running on the storage controllers, with some hooks into the file system itself. It provides indexes of common attributes as well as gives the user the ability to provide custom attributes to search by. As a result, obviously you don’t have to search the entire file system to find files that match these criteria.

It is exposed as a Restful API which has it’s ups and downs. As an application developer you can take this and tie it into your application. It returns results in JSON format (developer friendly, hostile to users such as myself – remember my motto “if it’s not friends with grep or sed, it’s not friends with me”).

The concept is good, perhaps the implementation could use some more work, as-is it seems very developer oriented. There is a java GUI app which you can run that can help you build and submit a query to the system which is alright. I would like to see a couple more things:

  • A website on the storage system (with some level of authentication – you may want to leave some file results open to the “public” if those are public shares) that allows users to quickly build a query using a somewhat familiar UI approach.
  • A drop in equivalent to the Linux command find. It would only support a subset of functionality but you could get a large portion of that functionality I believe fairly simply with this. The main point being don’t make the users have to make significant alterations to their processes to adopt this feature, it’s not complicated, lower the bar for adoption.

To HP’s credit they have written some sort of plugin to the Windows search application that gives windows users the ability to easily use Express Query(I did not see this in action). Nothing so transparent exists for Linux though.

My main questions though were things HP was unable to answer. I expected more from HP on this front in general. I mean specifically around competitive info. In many cases they seem to rely on the publicly available information on the competition’s platforms – maybe limited to the data that is available on the vendor website. HP is a massive organization with very deep pockets – you may know where I’m going here.  GO BUY THE PRODUCTS! Play with them, learn how they work, test them, benchmark them. Get some real world results out of the systems. You have the resources, you have the $$. I can understand back when 3PAR was a startup company they may not be able to go around buying arrays from the competition to put them through their paces. Big ‘ol HP should be doing that on a regular basis. Maybe they are — if they are — they don’t seem to be admitting that their data is based on that(in fact I’ve seen a few times where they explicitly say the information is only from data sheets etc).

Another approach might be, if HP lacks the man power to run such tests and stuff, to have partners or customers do it for them. Offer to subsidize the cost of some purchase by some customer of some competitive equipment in exchange for complete open access to competitive information as a result of using such a system. Or fully cover the cost.. HP has the money to make it happen. It would be really nice to see..

So in regards to Express Query I had two main questions about performance related to the competition. HP says they view Isilon as the main competition for StoreAll.  A couple of years back Isilon started offering a product(maybe it is standard across the board now I am not sure) where they stored the metadata in SSD. This would dramatically accelerate these sorts of operations, without forcing the user to change their way of doing things. Lowers the bar of adoption. Now price wise it probably costs more, StoreAll does not have any special SSD support whatsoever. But I would be curious as to the performance in general comparing Isilon’s accelerated metadata vs HP Express query.  Obviously Express Query is more flexible with it’s custom meta data fields and tagging etc, so for specific use cases there is no comparison. BUT.. for many things I think both would work well..

Along the same notes – back when I was a BlueArc customer one of their big claims was their FPGA accelerated file system had lightning fast meta data queries. So how does Express Query performance compare to something like that? I don’t know, and got no answer from HP.

Overall

Overall, I’d love it if HP had a more solid product in this area, it feels like whoever I talk to that they have just given up trying to be competitive with an in house enterprise file offering(they do have a file offering based on Windows storage server but I don’t really consider that in the same ballpark since they are just re-packaging someone else’s product). HP StoreAll has it’s use cases and it probably does those fairly well, but it’s not a general purpose file platform, and from the looks of things it’s never going to be one.

Software Defined Storage

Just hearing the phrase Software Defined <anything> makes me shudder. Our friends over at El Reg have started using the term hype gasm when referring to Software Defined Networking. I believe the SDS space is going to be even worse, at least for a while.

(On a side note there was an excellent presentation on SDN at the conference today which I will write about once I have the password to unlock the presentations so I can refresh my memory on what was on the slides – I may not get the password until Friday though)

As of today, the term is not defined at all. Everyone has their own take on it, and that pisses me off as a technical person. From a personal standpoint I am in the camp leaning more towards some separation of data and control planes ala SDN, but time will tell what it ends up being.

I think Software Defined Storage, if it is not some separation of control and data plan instead could just be a subsystem of sorts that provides storage services to anything that needs them. In my opinion it doesn’t matter if it’s from a hardware platform such as 3PAR, or a software platform such as a VSA. The point is you have a pool of storage which can be provisioned in a highly dynamic & highly available manor to whatever requests it. At some point you have to buy hardware – having infrastructure that is purpose built, and shared is obviously a commonly used strategy today. The level of automation and API type stuff varies widely of course.

The point here is I don’t believe the Software side of things means it has to be totally agnostic as to where it runs – it just needs a standard interfaces in which anything can address it(APIs, storage protocols etc). It’s certainly nice to have the ability to run such a storage system entirely as a VM, there are use cases for that for certain. But I don’t want to limit things to just that. So perhaps more focus on the experience the end user gets rather than how you get there. Something like that.

StoreVirtual

HP’s take on it is of course basically storage resources provisioned through VSAs. Key points being:

  • Software only (they also offer a hardware+software integrated package so…)
  • Hypervisor agnostic (right now that includes VMware and Hyper-V so not fully agnostic!)
  • Federated

I have been talking with HP off and on for months now about how confusing I find their messaging around this.

In one corner we have:

3PAR Eliminating boundaries.

3PAR Eliminating boundaries.

In the other corner we have

Software Defined Data Center - Storage

Software Defined Data Center – Storage (3PAR is implied to be Service Refined Storage – Storevirtual is Cost Optimized)

Store Virtual key features

Store Virtual key features

Mixed messages

(thinking from a less technical user’s perspective – specifically thinking back to some of my past managers who thought they knew what they were doing when they really didn’t – I’m looking out for other folks like me in the field who don’t want their bosses to get the wrong message when they see something like this)

What’s the difference between 3PAR’s SLA Optimized storage when value matters, and StoreVirtual Cost Optimized?

Hardware agnostic and federated sounds pretty cool, why do I need 3PAR when I can just scale out with StoreVirtual? Who needs fancy 3PAR Peer Persistence (fancy name for transparent full array high availability) when I have built in disaster recovery on the StoreVirtual platform?

Expensive 3PAR software licensing? StoreVirtual is all inclusive! The software is the same right? I can thin provision, I can snapshot, I can replicate, I can peer motion between StoreVirtual systems. I have disaster recovery, I have iSCSI, I have Fibre channel. I have scale out and I have a fancy shared-nothing design. I have Openstack integration. I have flash support, I have tiering, I have all of this with StoreVirtual. Why did HP buy 3PAR when they already have everything they need for the next generation of storage?

(stepping back into technical role now)

Don’t get me wrong – I do see some value in the StoreVirtual platform! It’s really flexible, easy to deploy, and can do wonders to lower costs in certain circumstances – especially branch office type stuff. If you can deploy 2-3 VM servers at an edge office and leverage HA shared storage without a dedicated array I think that is awesome.

But the message for data center and cloud deployments – do I run StoreVirtual as my primary platform or run 3PAR ?  The messaging is highly confusing.

My idea to HP on StoreVirtual vs. 3PAR

I went back and forth with HP on this and finally, out of nowhere I had a good idea which I gave to them and it sounds like they are going to do something with it.

So my idea was this – give the customer a set of questions, and based on the answers of those questions HP would know which storage system to recommend for that use case. Pretty simple idea. I don’t know why I didn’t come up with it earlier (perhaps because it’s not my job!). But it would go a long way in cleaning up that messaging around which platform to use. Perhaps HP could take the concept even further and update the marketing info to include such scenarios (I don’t know how that might be depicted, assuming it can be done so in a legible way).

When I gave that idea, several people in the room liked it immediately, so that felt good 🙂

 HP StoreOnce

(This segment of the market I am not well versed in at all, so my error rate is likely to go up by a few notches)

HP StoreOnce is of course their disk-based scale-out dedupe system developed by HP Labs. One fairly exciting offering in this area that was recently announced at HP Discover is the StoreOnce VSA. Really nice to see the ability to run StoreOnce as a VM image for small sites.

They spent a bunch of time talking about how flexible the VSA is, how you can vMotion it and Storage vMotion it like it was something special. It’s a VM, it’s obvious you should be able to do those things without a second thought.

StoreOnce is claimed to have a fairly unique capability of being able to replicate between systems without ever re-hydrating the data. They also claim a unique ability to be the first platform to offer real high availability. In a keynote session by David Scott (which I will cover in more depth in a future post once I get copies of those presentations) he mentioned that Data Domain as an example, if a DD controller fails during a backup or a restore operation the operation fails and must be restarted after the controller is repaired.

This surprised me – what’s the purpose of dual controllers if not to provide some level of high availability? Again forgive my ignorance on the topic as this is not an area of the storage industry I have spent much time at all in.

HP StoreOnce however can recover from this without significant downtime. I believe the backup or restore job still must be restarted from scratch, but you don’t have to wait for the controller to be repaired to continue with work.

HP has claimed for at least the past year to 18 months now that their performance far surpasses everyone else by a large margin, they continued those claims this week. I believe I read at one point from their competition that the claims were not honest in that I believe the performance claims was from a clustered StoreOnce system which has multiple de-dupe domains(meaning no global dedupe on the system as a whole), and it was more like testing multiple systems in parallel against a single data domain system(with global dedupe). I think there were some other caveats as well but I don’t recall specifics (this is from stuff I want to say I read 18-20 months ago).

In any case, the product offering seems pretty decent, is experiencing a good amount of growth and looks to be a solid offering in the space. Much more competitive in the space than StoreAll is, probably not quite as competitive as 3PAR, perhaps a fairly close 2nd as far as strength of product offering.

Next up, converged storage management and Openstack with HP. Both of these topics are very light relative to the rest of the material, I am going to go back to the show room floor to see if I can dig up more info.

 

July 30, 2013

HP Storage Tech Day – 3PAR

Filed under: Events,Storage — Tags: , , — Nate @ 2:04 am

Before I forget again..

Travel to HP Storage Tech Day/Nth Generation Symposium was paid for by HP; however, no monetary compensation is expected nor received for the content that is written in this blog.

So, HP hammered us with a good seven to eight hours of storage related stuff today, the bulk of the morning was devoted to 3PAR and the afternoon covered StoreVirtual, StoreOnce, StoreAll, converged management and some really technical bits from HP Labs.

This post is all about 3PAR. They covered other topics of course but this one took so long to write I had to call it a night, will touch on the other topics soon.

I won’t cover everything since I have covered a bunch of this in the past. I’ll try not to be too repetitive…

I tried to ask as many questions as I could, they answered most .. the rest I’ll likely get with another on site visit to 3PAR HQ after I sign another one of those Nate Disclosure Agreements (means I can’t tell you unless your name is Nate). I always feel guilty about asking questions directly to the big cheeses at 3PAR. I don’t want to take up any of their valuable time…

There wasn’t anything new announced today of course, so none of this information is new, though some of is new to this blog, anyway!

I suppose if there is one major take away for me for this SSD deep dive, is the continued insight into how complex storage really is, and how well 3PAR does at masking that complexity and extracting the most of everything out of the underlying hardware.

Back when I first started on 3PAR in late 2006, I really had no idea what real storage was. As far as I was concerned one dual controller system with 15K disks was the same as the next. Storage was never my focus in my early career (I did dabble in a tiny bit of EMC Clariion (CX6/700) operations work – though when I saw the spreadsheets and visios the main folks used to plan and manage I decided I didn’t want to get into storage), it was more servers, networking etc.

I learned a lot in the first few years of using 3PAR, and to a certain extent you could say I grew up on it. As far as I am concerned being able to wide stripe, or have mesh active controllers is all I’ve ever (really) known. Sure since then I have used a few other sorts of systems. When I see architectures and processes of doing things on other platforms I am often sort of dumbfounded why they do things that way. It’s sometimes not obvious to me that storage used to be really in the dark ages many years ago.

Case in point below, there’s a lot more to (efficient, reliable, scalable, predictable) SSDs than just tossing a bunch of SSDs into a system and throwing a workload at them..

I’ve never tried to proclaim I am a storage expert here(or anywhere) though I do feel I am pretty adept at 3PAR stuff at least, which wasn’t a half bad platform to land on early on in the grand scheme of things. I had no idea where it would take me over the years since. Anyway, enough about the past….

New to 3PAR

Still the focus of the majority of HP storage related action these days, they had a lot to talk about. All of this initial stuff isn’t there yet(up until the 7450 stuff below), just what they are planning for at some point in the future(no time frames on anything that I recall hearing).

Asynchronous Streaming Replication

Just a passive mention of this on a slide, nothing in depth to report about, but I believe the basic concept is instead of having asynchronous replication running on snapshots that kick off every few minutes (perhaps every five minutes) the replication process would run much more frequently (but not synchronous still), perhaps as frequent as every 30 seconds or something.

I’ve never used 3PAR replication myself. Never needed array based replication really. I have built my systems in ways that don’t require array based replication. In part because I believe it makes life easier(I don’t build them specifically to avoid array replication it’s merely a side effect), and of course the license costs associated with 3PAR replication are not trivial in many circumstances(especially if your only needing to replicate a small percentage of the data on the system). The main place where I could see leveraging array based replication is if I was replicating a large number of files, doing this at the block layer is often times far more efficient(and much faster) than trying to determine changed bits from a file system perspective.

I wrote/built a distributed file transfer architecture/system for another company a few years ago that involved many off the shelf components(highly customized) that was responsible for replicating several TB of data a day between WAN sites, it was an interesting project and proved to be far more reliable and scalable than I could of hoped for initially.

Increasing Maximum Limits

I think this is probably out of date, but it’s the most current info I could dig up on HP’s site. Though this dates back to 2010. These pending changes are all about massively increasing the various supported maximum limits of various things. They didn’t get into specifics. I think for most customers this won’t really matter since they don’t come close to the limits in any case(maybe someone from 3PAR will read this and send me more up to date info).

3PAR OS 2.3.1 supported limits

3PAR OS 2.3.1 supported limits(2010)

The PDF says updated May 2013, though the change log says last update is December. HP has put out a few revisions to the document(which is the Compatibility Matrix) which specifically address hardware/software compatibility, but the most recent Maximum Limits that I see are for what is now considered quite old – 2.3.1 release – this was before their migration to a 64-bit OS (3.1.1).

Compression / De-dupe

They didn’t talk about it, other than mention it on a slide, but this is the first time I’ve seen HP 3PAR publicly mention the terms. Specifically they mention in-line de-dupe for file and block, as well as compression support. Again, no details.

Personally I am far more interested in compression than I am de-dupe. De-dupe sounds great for very limited workloads like VDI(or backups, which StoreOnce has covered already). Compression sounds like a much more general benefit to improving utilization.

Myself I already get some level of “de duplication” by using snapshots. My main 3PAR array runs roughly 30 MySQL databases entirely from read-write snapshots, part of the reason for this is to reduce duplicate data, another part of the reason is to reduce the time it takes to produce that duplicate data for a database(fraction of a second as opposed to several hours to perform a full data copy).

File + Object services directly on 3PAR controllers

No details here other than just mentioning having native file/object services onto the existing block services. They did mention they believe this would fit well in the low end segment, they don’t believe it would work well at the high end since things can scale in different ways there. Obviously HP has file/object services in the IBRIX product (though HP did not get into specifics what technology would be used other than taking tech from several areas inside HP), and a 3PAR controller runs Linux after all, so it’s not too far fetched.

I recall several years ago back when Exanet went bust, I was trying to encourage 3PAR to buy their assets as I thought it would of been a good fit. Exanet folks mentioned to me that 3PAR engineering was very protective of their stuff and were very paranoid about running anything other than the core services on the controllers, it is sensitive real estate after all.  With more recent changes such as supporting the ability to run their reporting software(System Reporter) directly on the controller nodes I’m not sure if this is something engineering volunteered to do themselves or not. Both approaches have their strengths and weaknesses obviously.

Where are 3PAR’s SPC-2 results?

This is a question I asked them (again). 3PAR has never published SPC-2 results. They love to tout their SPC-1, but SPC-2 is not there……. I got a positive answer though: Stay tuned.  So I have to assume something is coming.. at some point. They aren’t outright disregarding the validity of the test.

In the past 3PAR systems have been somewhat bandwidth constrained due to their use of PCI-X. Though the latest generation of stuff (7xxx/10xxx) all leverage PCIe.

The 7450 tops out at 5.2 Gigabytes/second of throughput, a number which they say takes into account overhead of a distributed volume system (it otherwise might be advertised as 6.4 GB/sec as a 2-node system does 3.2GB/sec). Given they admit the overhead to a distributed system now, I wonder how, or if, that throws off their previous throughput metrics of their past arrays.

I have a slide here from a few years ago that shows a 8-controller T800 supporting up to 6.4GB/sec of throughput, and a T400 having 3.2GB/sec (both of these systems were released in Q3 of 2008). Obviously the newer 10400 and 10800 go higher(don’t recall off the top of my head how much higher).

This compares to published SPC-2 numbers from IBM XIV at more than 7GB/sec, as well as HP P9500/HDS VSP at just over 13GB/sec.

3PAR 7450

Announced more than a month ago now, the 7450 is of course the purpose built flash platform which is, at the moment all SSD.

Can it run with spinning rust?

One of the questions I had, was I noticed that the 7450 is currently only available in a SSD-only configuration. No spinning rust is supported. I asked why this was and the answer was pretty much what I expected. Basically they were getting a lot of flak for not having something that was purpose built. So at least in the short term, the decision not to support spinning rust is purely a marketing one. The hardware and software is the same(other than being more beefy in CPU & RAM) than the other 3PAR platforms. The software is identical as well. They just didn’t want to give people more excuses to label the 3PAR architecture as something that wasn’t fully flash ready.

It is unfortunate that the market has compelled HP to do this, as other workloads would still stand to gain a lot especially with the doubling up of data cache on the platform.

Still CPU constrained

One of the questions asked by someone was about whether or not the ASIC is the bottleneck in the 7450 I/O results. The answer was a resounding NO – the CPU is still the bottleneck even at max throughput. So I followed up with why did HP choose to go with 8 core CPUs instead of 10-core which Intel of course has had for some time. You know how I like more cores! The answer was two fold to this. The primary reason was cooling(the enclosure as is has two sockets, two ASICs, two PCIe slots, 24 SSDs, 64GB of cache and a pair of PSUs in 2U). The second answer was the system is technically Ivy-bridge capable but they didn’t want to wait around for those chips to launch before releasing the system.

They covered a bit about the competition being CPU limited as well especially with data services, and the amount of I/O per CPU cycle is much lower on competing systems vs 3PAR and the ASIC.  The argument is an interesting one though at the end of the day the easy way to address that problem is throw more CPUs at it, they are fairly cheap after all. The 7000-series is really dense so I can understand the lack of ability to support a pair of dual socket systems within a 2U enclosure along with everything else. The 10400/10800 are dual socket(though older generation of processors).

TANGENT TIME

I really have not much cared for Intel’s code names for their recent generation of chips. I don’t follow CPU stuff all that closely these days(haven’t for a while), but I have to say it’s mighty easy to confuse code name A from B, which is newer? I have to look it up. every. single. time.

I believe in the AMD world (AMD seems to have given up on the high end, sadly), while they have code names, they have numbers as well. I know 6200 is newer than 6100 ..6300 is newer than 6200..it’s pretty clear and obvious. I believe this goes back to Intel and them not being able to trademark the 486.

On the same note, I hate Intel continuing to re-use the code word i7 in laptops. I have an Core i7 laptop from 3 years ago, and guess what the top end today still seems to be? I think it’s i7 still. Confusing. again.
</ END TANGENT >

Effortless SSD management of each SSD with proactive alerts

I wanted to get this in before going deeper into the cache optimizations since that is a huge topic. But the basic gist of this is they have good monitoring of the wear of the SSDs in the platform(something I think that was available on Lefthand a year or two ago), in addition to that the service processor (dedicated on site appliance that monitors the array) will alert the customer when the SSD is 90% worn out. When the SSD gets to 95% then the system pro-actively fails the drive and migrates data off of it(I believe). They raised a statistic that was brought up at Discover that something along the lines of 95% of all deployed SSDs in 3PAR were still in the field – very few have worn out. I don’t recall anyone mentioning the # of SSDs that have been deployed on 3PAR but it’s not an insignificant number.

SSD Caching Improvements in 3PAR OS 3.1.2

There have been a number of non trivial caching optimizations in the 3PAR OS to maximize performance as well as life span of SSDs. Some of these optimizations also benefit spinning rust configurations as well – I have personally seen a noticeable drop in latency in back end disk response time since I upgraded to 3.1.2 back in May(it was originally released in December), along with I believe better response times under heavy load on the front end.

Bad version numbers

I really dislike 3PAR’s version numbering, they have their reasons for doing what they do, but I still think it is a really bad customer experience. For example going from 2.2.4 to 2.3.1 back in what was it 2009 or 2010. The version number implies minor update, but this was a MASSIVE upgrade.  Going from 2.3.x to 3.1.1 was a pretty major upgrade too (as the version implied).  3.1.1 to 3.1.2 was also a pretty major upgrade. On the same note the 3.1.2 MU2 (patch level!) upgrade that was released last month was also a major upgrade.

I’m hoping they can fix this in the future, I don’t think enough effort is made to communicate major vs minor releases. The version numbers too often imply minor upgrades when in fact they are major releases. For something as critical as a storage system I think this point is really important.

Adaptive Read Caching

Adaptive Read Caching

3PAR Adaptive Read Caching for SSD (the extra bits being read there from the back end are to support the T10 Data Integrity Feature- available standard on all Gen4 ASIC 3PAR systems, and a capability 3PAR believes is unique in the all flash space for them)

One of the things they covered with regards to caching with SSD is the read cache is really not as effective(vs with spinning rust), because the back end media is so fast, there is significantly less need for caching reads. So in general, significantly more cache is used with writes.

For spinning rust 3PAR reads a full 16kB of data from the back end disk regardless of the size of the read on the front end (e.g. 4kB). This is because the operation to go to disk is so expensive already and there is no added penalty to grab the other 12kB while your grabbing the 4kB you need. The next I/O request might request part of that 12kB and you can save yourself a second trip to the disk when doing this.

With flash things are different. Because the media is so fast, you are much more likely to become bandwidth constrained rather than IOPS constrained.  So if for example you have that 500,000 4k read IOPS on the front end, and your performing those same 16kB read IOPS on the back end, that is, well 4x more bandwidth that is required to perform those operations. Again because the flash is so fast, there is significantly less penalty to go back to SSD again and again to retrieve those smaller blocks. It also improves latency of the system.

So in short, read more from disks because you can and there is no penalty, read only what you need from SSDs because you should and there is (almost) no penalty.

Adaptive Write Caching

Adaptive Write Caching

Adaptive Write Caching

With writes the situation is similar to reads, to maximize SSD life span, and minimize latency you want to minimize the number of write operations to the SSD whenever possible.

With spinning rust again 3PAR works with 16kB pages, if a 4kB write comes in then the full 16kB is written to disk, again  because there is no additional penalty for writing the 16kB vs writing 4kB. Unlike SSDs your not likely bandwidth constrained when it comes to disks.

With SSDs, the optimizations they perform, again to maximize performance and reduce wear, is if a 4kB write comes in, a 16kB write occurs to the cache, but only the 4kB of changed data is committed to the back end.

If I recall right they mentioned this operation benefits RAID 1 (anything RAID 1 in 3PAR is RAID 10, same for RAID 5 – it’s RAID 50) significantly more than it benefits RAID 5/6, but it still benefits RAID 5/6.
 

Autonomic Cache offload

Autonomic Cache Offload

Autonomic Cache Offload

Here the system changes the frequency at which it flushes cache to back end media based on utilization. I think this plays a lot into the next optimization.

 

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Multi Tenant I/O Processing

3PAR has long been about multi tenancy of their systems. The architecture lends itself well to running in this mode though it wasn’t perfect, I believe for the most part the addition of Priority Optimization that was announced late last year and finally released last month fills the majority of the remainder of that hole. I have run “multi tenant” 3PAR systems since the beginning. Now to be totally honest the tenants were all me, just different competing workloads, whether it is disparate production workloads or a mixture of production and non production(and yes in all cases they ran on the same spindles). It wasn’t nearly as unpredictable as say a service provider with many clients running totally different things, that would sort of scare me on any platform. But there was still many times where rogue things (especially horrible SQL queries) overran the system (especially write cache). 3PAR handles it as well, if not better than anyone else but every system has it’s limits.

Front end operations

The caching flushing process to back end media is now multi threaded. This benefits both SSD as well as existing spinning rust configurations. Significantly less(no?) locking involved when flushing cache to disk.

Here is a graph from my main 3PAR array, you can see the obvious latency drop from the back end spindles once 3.1.2 was installed back in May (again the point of this change was not to impact back end disk latency as much as it was to improve front end latency, but there is a significant positive behavior change post upgrade):

Latency Change on back end spinning rust with 3.1.2
Latency Change on back end spinning rust with 3.1.2

There was a brief time when latency actually went UP on the back end disks. I was concerned at first but later determined this was the disk defragmentation processes running(again with improved algorithms), before the upgrade they took FAR too long, post upgrade they completed a big backlog in a few days and latency returned to low levels.

Multi Tenant Caching

Multi Tenant Caching

Back end operations

On the topic of multi tenant with SSDs an interesting point was raised which I had never heard of before. They even called it out as being a problem specific to SSDs, and does not exist with spinning rust. Basically the issue is if you have two workloads going to the same set of SSDs, one of them issuing large I/O requests(e.g. sequential workload), and the other issuing small I/O requests(e.g. 4kB random read), the smaller I/O requests will often get stuck behind the larger ones causing increases in latency to the app using smaller I/O requests.

To address this, the 128kB I/Os are divided up into four 32kB I/O requests and sent in parallel to the other workload. I suppose I can get clarification but I assume for a sequential read operation with 128kB I/O request there must not be any additional penalty for grabbing the 32kB, vs splitting it up even further into even more smaller I/Os.
 
 
 

Maintaining performance during media failures

3PAR has always done wide striping, and sub disk distributed RAID so the rebuild times are faster, the latency is lower and all around things run better(no idle hot spares) that way vs the legacy designs of the competition. The system again takes additional steps now to maximize SSD life span by optimizing the data reads and writes under a failure condition.

HP points out that SSDs are poor at large sequential writes, so as mentioned above they divide the 128kB writes that would be issued during a rebuild operation (since that is largely a sequential operation) into 32kB I/Os again to protect those smaller I/Os from getting stuck behind big I/Os.

They also mentioned that during one of the SPC-1 tests (not sure if it was 7400 or 7450) one of the SSDs failed and the system rebuilt itself. They said there was no significant performance hit(as one might expect given experience with the system) as the test ran. I’m sure there was SOME kind of hit especially if you drive the system to 100% of capacity and suffer a failure. But they were pleased with the results regardless. The competition would be lucky to have something similar.

What 3PAR is not doing

When it comes to SSDs and caching something 3PAR is not doing, is leveraging SSDs to optimize back end I/Os to other media as sequential operations. Some storage startups are doing this to gain further performance out of spinning rust while retaining high random performance using SSD. 3PAR doesn’t do this and I haven’t heard of any plans to go this route.

Conclusion

I continue to be quite excited about the future of 3PAR, even more so pre acquisition. HP has been able to execute wonderfully on the technology side of things. Sales from all accounts at least on the 7000 series are still quite brisk. Time will tell if things hold up after EVA is completely off the map, but I think they are doing many of the right things. I know even more of course but can’t talk about it here(yet)!!!

That’s it for tonight, at ~4,000 (that number keeps going up, I should goto bed) words this took three hours or more to write+proof read, it’s also past 2AM. There is more to cover, the 3PAR stuff was obviously what I was most interested in. I have a few notes from the other sessions but they will pale in comparison to this.

Today I had a pretty good idea on how HP could improve it’s messaging around whether to choose 3PAR or StoreVirtual for a particular workload. The messaging to-date to me has been very confusing and conflicting (HP tried to drive home a point about single platforms and reducing complexity, something this dual message seems to conflict with).  I have been communicating with HP off and on for the past few months, and today out of the blue I  came up with this idea which I think will help clear the air. I’ll touch on this soon when I cover the other areas that were talked about today.

Tomorrow seems to be a busy day, apparently we have front row seats, and the only folks with power feeds. I won’t be “live blogging”(as some folks tend to love to do), I’ll leave that to others. I work better at spending some time to gather thoughts and writing something significantly longer.

If you are new to this site you may want to check out a couple of these other articles I have written about 3PAR(among the dozens…)

Thanks for reading!

July 29, 2013

HP Storage Tech Day Live

Filed under: Storage — Tags: , , — Nate @ 12:38 am

In about seven and a half hours the HP tech day event will be starting, I thought it was going to be a private event but it looks like they will be broadcasting it via one of the folks here is adept at that sort of thing.

If your interested, the info is here. It starts at 8AM Pacific. Fortunately it’s literally downstairs from my hotel room.

Topics include

  • 3par deep dive (~3 hrs worth)
  • StoreVirtual  (Lefthand VSA), StoreOnce VSA, and Open stack integration
  • StoreAll (IBRIX), express query(Autonomy)

No Vertica stuff.. Vertica doesn’t get any storage people excited since it is so fast and reduces I/O by so much.. so you don’t need really fancy storage stuff to make it fly.

HP asked that I put this notice on these posts so the FCC doesn’t come after us..

Travel to HP Storage Tech Day/Nth Generation Symposium was paid for by HP; however, no monetary compensation is expected nor received for the content that is written in this blog.

(Many folks have accused me of being compensated by 3PAR and/or HP in the past based on what I have written here but I never have been – by no means do I love HP as a whole there are really only a few product lines that have me interested which is 3PAR, Proliant, and Vertica – ironically enough each of those came to HP via acquisition). I have some interest in StoreOnce though have yet to use it. (Rust in Peace WebOS — I think I will be on Android before the end of the year – more on that later..)

I’m pretty excited about tomorrow (well today given that it’s almost 1AM), though getting up so early is going to be a challenge!

Apparently I’m the only person in the group here that is not on twitter. I don’t see that changing anytime soon. Twitter and Facebook are like the latest generation of Star Trek movies, they basically don’t exist to me.

The one thing that I am sort of curious about is what, if any plans HP has for the P9500 range, they don’t talk about it much.. I’m sure they won’t come out and say they are retiring it any time soon, since it’s still fairly popular with select customers. I just want to try to get them to say something about it, I am curious.

(this is my first trip to a vendor-supported event that included travel)

Older Posts »

Powered by WordPress