TechOpsGuys.com Diggin' technology every day

2Aug/13Off

HP Storage Tech Day – bits and pieces

TechOps Guy: Nate

Travel to HP Storage Tech Day/Nth Generation Symposium was paid for by HP; however, no monetary compensation is expected nor received for the content that is written in this blog.

For my last post on HP Storage tech day, the remaining topics that were only briefly covered at the event.

HP Converged Storage Management

There wasn't much here other than a promise to build YASMT (Yet another storage management tool), this time it will be really good though. HP sniped at EMC on several occasions for the vapor-ness of ViPR. Though at least that is an announced product with a name. HP has a vision, no finalized name, no product(I'm sure they have something internally) and no dates.

I suppose if your in the Software defined storage camp which is for the separation of data and control plane, this may be HP's answer to that.

HP Converged Storage Management Strategy

HP Converged Storage Management Strategy

The vision sounds good as always, time will tell if they can pull it off. The track record for products like this is not good. More often than not the tools lower the bar on what is supported to some really basic set of things, and are not able to exploit the more advanced features of the platform under management.

One question I did ask is whether or not they were going to re-write their tools to leverage these new common APIs, and the answer was sort of what I expected - they aren't. At least short term the tools will use a combination of these new APIs as well as whatever methods they use today. So this implies that only a subset of functionality will be available via the APIs.

In contrast I recall reading something, perhaps a blog post, about how NetApp's tools use all of their common APIs(I believe end to end API stuff for them is fairly recent). HP may take a couple of years to get to something like that.

HP Openstack Integration

HP is all about the Openstack. They seem to be living and breathing it. This is pretty neat, I think Openstack is a good movement, though the platform still seems some significant work to mature.

I have concerns, short term concerns about HP's marketing around Openstack and how easy it is to integrate into customer environments. Specifically Openstack is a fast moving target, lacks maturity and at least as recently as earlier this year lacked a decent community of IT users (most of it was centered on developers - probably still is). HP's response is they are participating deeply within the community (which is good long term), and are being open about everything (also good).

I specifically asked if HP was working with Red Hat to make sure the latest HP contributions (such as 3PAR support, Fibre Channel support) were included in the RH Open Stack. They said no, they are working with the community, and not partners. This is of course good and bad. Good that they are being open, bad in that it may result in some users not getting things for 12-24 months because the distribution of Openstack they chose is too old to support it.

I just hope that Openstack matures enough that it gets a stable set of interfaces. Unlike say the Linux kernel driver interfaces which just annoy the hell out of me(have written about that before). Compatibility people!!!

Openstack Fibre Channel support based on 3PAR

HP wanted to point out that the Fibre Channel support in Openstack was based on 3PAR. It is a generic interface and there are plugins for a few different array types. 3PAR also has iSCSI support for Openstack as of a recent 3PAR software release as well.

StoreVirtual was first Openstack storage platform

Another interesting tidbit is that Storevirtual was the first(?) storage platform to support Openstack. Rackspace used it(maybe still does, not sure), and contributed some stuff to make it better. HP also uses it in their own public cloud(not sure if they mentioned this or not but I heard from a friend who used to work in that group).

HP Storage with Openstack

Today HP integrates with Openstack at the block level on both the StoreVirtual and 3PAR platforms. Work is in progress for StoreAll which will provide file and object storage. Fibre channel support is available on the 3PAR platform only as far as HP goes. StoreVirtual supports Fibre Channel but not with Openstack(yet anyway, I assume support is coming).

This contrasts with the competition, most of whom have no Openstack support and haven't announced anything to be released anytime soon. HP certainly has a decent lead here, which is nice.

HP Openstack iSCSI/FC driver functionality

All of HP's storage work with Openstack is based on the Grizzly release which came out around April 2013.

  • Create / Delete / Attach / Detach volumes
  • Create / Delete Snapshots
  • Create volume from snapshot
  • Create cloned volumes
  • Copy image to volume / Copy volume to image (3PAR iSCSI only)

New things coming in Havana release of Openstack from HP Storage

  • Better session management within the HP 3PAR StoreServ Block Storage Drivers
  • Re-use of existing HP 3PAR Host entries
  • Support multiple 3PAR Target ports in HP 3PAR StoreServ Block Storage iSCSI Driver
  • Support Copy Volume To Image & Copy Image To Volume with Fibre Channel Drivers (Brick)
  • Support Quality of Service (QoS) setting in the HP 3PAR StoreServ Block Storage Drivers
  • Support Volume Sets with predefined QoS settings
  • Update the hp3parclient that is part of the Python Standard Library

Fibre channel enhancements for Havana and beyond

Fibre Channel enhancements for Openstack Havana and beyond

Fibre Channel enhancements for Openstack Havana and beyond

Openstack portability

This was not at the Storage Tech Day - but I was at a break out session that talked about HP and Openstack at the conference and one of the key points they hit on was the portable nature of the platform, run it in house, run it in cloud system, run it at service providers and move your workloads between them with the same APIs.

Setting aside a moment the fact that the concept of cloud bursting is a fantasy for 99% of organizations out there(your applications have to be able to cope with it, your not likely going to be able to scale your web farm and burst into a public cloud when those web servers have to hit databases that reside over a WAN connection the latency hit will make for a terrible experience).

Anyway setting that concept aside - you still have a serious problem- short term of compatibility across different Openstack implementations because different vendors are choosing different points to base their systems off of. This is obviously due to the fast moving nature of the platform and when the vendor decides to launch their project.

This should stabilize over time, but I felt the marketing message on this was a nice vision, it just didn't represent any reality I am aware of today.

HP contrasted this to being locked in to say the vCloud API. I think there are more clouds public and private using vCloud than Openstack at this point. But in any case I believe use cases for the common IT organization to be able to transparently leverage these APIs to burst on any platform- VMware, Openstack, whatever - is still years away from reality.

If you use Openstack's API, you're locked into their API anyway. I don't leverage APIs myself(directly) I am not a developer - so I am not sure how easy it is to move between them. I think the APIs are likely much less of a barrier than the feature set of the underlying cloud in question. Who cares if the API can do X and Y, if the provider's underlying infrastructure doesn't yet support that capability.

One use case that could be done today, that HP cited, is running development in a public cloud then pulling that back in house via the APIs. Still that one is not useful either. The amount of work involved in rebuilding such an environment internally should be fairly trivial anyway(the bulk of the work should be in the system configuration area, if your using cloud you should also be using some sort of system management tool, whether it is something like CFEngine, Puppet, Chef, or something similar). That and - this is important in my experience - development environments tend to not be resource intensive. Which makes it great to consolidate them on internal resources (even ones that share with production - I have been doing this since for six years already).

My view on Openstack

At least one person at HP I spoke with believes most stuff will be there by the end of this year but I don't buy that for a second. I look at things like Red Hat's own Openstack distribution taking seemingly forever to come out(I believe it's a few months behind already and I have not seen recent updates on it), and Rackspace abandoning their promise to support 3rd party Open stack clouds earlier this year.

All of what I say is based on what I read -- I have no personal experience with Openstack (nor do I plan to get immediate experience, the lack of maturity of the product is keeping me away for now). Based on what I have read, conferences(was at a local Red Hat conference last December where they covered Openstack - that's when reality really hit me and I learned a good deal about it and honestly lost some enthusiasm in the project) and some folks I have chatted/emailed with Openstack is still a VERY much work in progress, evolving quickly. There's really no formal support community in place for a stable product, developers are wanting to stay at the leading edge and that's what they are willing to support. Red Hat is off in one corner trying to stabilize the Folsum release from last year to make a product out of it, HP is in another corner contributing code to the latest versions of Openstack that may or may not be backwards compatible with Red Hat or other implementations.

It's a mess.. it's a young project still so it's sort of to be expected. Though there are a lot of folks making noise about it. The sense I get is if you are serious about running an Open Stack cloud today, as in right now, you best have some decent developers in house to help manage and maintain it. When Red Hat comes out with their product, it may solve a bunch of those issues, but still it'll be a "1.0", and there's always some not insignificant risk to investing in that without a very solid support structure inside your organization (Red Hat will of course provide support but I believe that won't be enough for most).

That being said it sounds like Openstack has a decent future ahead of it - with such large numbers of industry players adopting support for it, it's really only a matter of time before it matures and becomes a solid platform for the common IT organization to be able to deploy.

How much time? I'm not sure. My best guesstimate is I hope it can reach that goal within five years. Red Hat, and others should be on versions 3 and perhaps 4 by then. I could see someone such as myself starting to seriously dabble in it in the next 12-16 months.

Understand that I'm setting the bar pretty high here.

Last thoughts on HP Storage Tech Day

I had a good time, and thought it was a great experience. They had very high caliber speakers, were well organized and the venue was awesome as well. I was able to drill them pretty good, the other bloggers seemed to really appreciate that I was able to drive some of the technical conversations. I'm sure some of my questions they would of rather not of answered since the answers weren't always "yes we've been doing that forever..!", but they were honest and up front about everything. When they could not be, they said so("can't talk about that here we need a Nate Disclosure Agreement").

I haven't dealt much at all with the other internal groups at HP, but I can say the folks I have dealt with on the storage side have all been AWESOME. Regardless of what I think about whatever storage product they are involved with they are all wonderful people both personally and professionally.

These past few posts have been entirely about what happened on Monday.  There are more bits that happened at the main conference on Tues-Thur, and once I get the slides for those slide decks I'll be writing more about that, there were some pretty cool speakers. I normally steer far clear of such events, this one was pretty amazing though. I'll save the details for the next posts.

I want to thank the team at HP, and Ivy Worldwide for organizing/sponsoring this event - it was a day of nothing but storage (and we literally ran out of time at the end, one or two topics had to be skipped). It was pretty cool. This is the first event I've ever traveled for, and the only event where there was some level of sponsorship (as mentioned HP covered travel, lodging and food costs).

31Jul/13Off

HP Storage Tech Day – StoreAll, StoreVirtual, StoreOnce

TechOps Guy: Nate

Travel to HP Storage Tech Day/Nth Generation Symposium was paid for by HP; however, no monetary compensation is expected nor received for the content that is written in this blog.

On Monday I attended a private little HP Storage Tech Day event here in Anaheim for a bunch of bloggers. They also streamed it live, and I believe the video is available for download as well.

I wrote a sizable piece on the 3PAR topics which were covered in the morning, here I will try to touch on the other HP Storage topics.

HP StoreAll + Express Query

StoreAll Platform

HP doesn't seem to talk about this very much, and as time goes on I have understood why. It's not a general purpose storage system, I suppose it never has been (though I expect Ibrix tried to make it one in their days of being independent). They aren't going after NetApp or EMC's enterprise NAS offerings. It's not a platform you want to run VMs on top of. Not a platform you want to run databases on top of. It may not even be a platform you want to run HPC stuff on top of. It's built for large scale bulk file and object storage.

They have tested scalability to 1,024 nodes and 16PB within a single name space. The system can scale higher, that's just the tested configuration. They say it can scale non disruptively and re-distribute existing data across new resources as those resources are added to the system. StoreAll can go ultra dense with near line SAS drives going up to roughly 500 drives in a rack (without any NAS controllers).

It's also available in a gateway version which can go in front of 3PAR, EVA and XP storage.

They say their main competition is EMC Isilon, which is another scale-out offering.

HP seems to have no interest in publishing performance metrics, including participating in SPECsfs (a benchmark that sorely lacks disclosures). The system has no SSD support at all.

The object store and file store, if I am remembering right, are not shared. So you have to access your data via a consistent means. You can't have an application upload data to an object store then give a user the ability to browse to that file using CIFS or NFS. To me this would be an important function to serve if your object and file stores are in fact on the same platform.

By contrast, I would expect a scale out object store to do away with the concept of disk-based RAID and go with object level replication instead. Many large scale object stores do this already. I believe I even read in El Reg that HP Labs is working on something similar(nothing around that was mentioned at the event). In StoreAll's case they are storing your objects in RAID, but denying you the flexibility to access them over file protocols which is unfortunate.

From a competitive standpoint, I am not sure what features HP may offer that are unique from an object storage perspective that would encourage a customer to adopt StoreAll for that purpose. If it were me I would probably take a good hard look at something like Red Hat Storage server(I would only consider RHSS for object storage, nothing else) or other object offerings if I was to build a large scale object store.

Express Query (below) cannot currently run on object stores at this time, it will with a future release though.

Express Query

This was announced a while back, which is what seems to be a SQL database of sorts that is running on the storage controllers, with some hooks into the file system itself. It provides indexes of common attributes as well as gives the user the ability to provide custom attributes to search by. As a result, obviously you don't have to search the entire file system to find files that match these criteria.

It is exposed as a Restful API which has it's ups and downs. As an application developer you can take this and tie it into your application. It returns results in JSON format (developer friendly, hostile to users such as myself - remember my motto "if it's not friends with grep or sed, it's not friends with me").

The concept is good, perhaps the implementation could use some more work, as-is it seems very developer oriented. There is a java GUI app which you can run that can help you build and submit a query to the system which is alright. I would like to see a couple more things:

  • A website on the storage system (with some level of authentication - you may want to leave some file results open to the "public" if those are public shares) that allows users to quickly build a query using a somewhat familiar UI approach.
  • A drop in equivalent to the Linux command find. It would only support a subset of functionality but you could get a large portion of that functionality I believe fairly simply with this. The main point being don't make the users have to make significant alterations to their processes to adopt this feature, it's not complicated, lower the bar for adoption.

To HP's credit they have written some sort of plugin to the Windows search application that gives windows users the ability to easily use Express Query(I did not see this in action). Nothing so transparent exists for Linux though.

My main questions though were things HP was unable to answer. I expected more from HP on this front in general. I mean specifically around competitive info. In many cases they seem to rely on the publicly available information on the competition's platforms - maybe limited to the data that is available on the vendor website. HP is a massive organization with very deep pockets - you may know where I'm going here.  GO BUY THE PRODUCTS! Play with them, learn how they work, test them, benchmark them. Get some real world results out of the systems. You have the resources, you have the $$. I can understand back when 3PAR was a startup company they may not be able to go around buying arrays from the competition to put them through their paces. Big 'ol HP should be doing that on a regular basis. Maybe they are -- if they are -- they don't seem to be admitting that their data is based on that(in fact I've seen a few times where they explicitly say the information is only from data sheets etc).

Another approach might be, if HP lacks the man power to run such tests and stuff, to have partners or customers do it for them. Offer to subsidize the cost of some purchase by some customer of some competitive equipment in exchange for complete open access to competitive information as a result of using such a system. Or fully cover the cost.. HP has the money to make it happen. It would be really nice to see..

So in regards to Express Query I had two main questions about performance related to the competition. HP says they view Isilon as the main competition for StoreAll.  A couple of years back Isilon started offering a product(maybe it is standard across the board now I am not sure) where they stored the metadata in SSD. This would dramatically accelerate these sorts of operations, without forcing the user to change their way of doing things. Lowers the bar of adoption. Now price wise it probably costs more, StoreAll does not have any special SSD support whatsoever. But I would be curious as to the performance in general comparing Isilon's accelerated metadata vs HP Express query.  Obviously Express Query is more flexible with it's custom meta data fields and tagging etc, so for specific use cases there is no comparison. BUT.. for many things I think both would work well..

Along the same notes - back when I was a BlueArc customer one of their big claims was their FPGA accelerated file system had lightning fast meta data queries. So how does Express Query performance compare to something like that? I don't know, and got no answer from HP.

Overall

Overall, I'd love it if HP had a more solid product in this area, it feels like whoever I talk to that they have just given up trying to be competitive with an in house enterprise file offering(they do have a file offering based on Windows storage server but I don't really consider that in the same ballpark since they are just re-packaging someone else's product). HP StoreAll has it's use cases and it probably does those fairly well, but it's not a general purpose file platform, and from the looks of things it's never going to be one.

Software Defined Storage

Just hearing the phrase Software Defined <anything> makes me shudder. Our friends over at El Reg have started using the term hype gasm when referring to Software Defined NetworkingI believe the SDS space is going to be even worse, at least for a while.

(On a side note there was an excellent presentation on SDN at the conference today which I will write about once I have the password to unlock the presentations so I can refresh my memory on what was on the slides - I may not get the password until Friday though)

As of today, the term is not defined at all. Everyone has their own take on it, and that pisses me off as a technical person. From a personal standpoint I am in the camp leaning more towards some separation of data and control planes ala SDN, but time will tell what it ends up being.

I think Software Defined Storage, if it is not some separation of control and data plan instead could just be a subsystem of sorts that provides storage services to anything that needs them. In my opinion it doesn't matter if it's from a hardware platform such as 3PAR, or a software platform such as a VSA. The point is you have a pool of storage which can be provisioned in a highly dynamic & highly available manor to whatever requests it. At some point you have to buy hardware - having infrastructure that is purpose built, and shared is obviously a commonly used strategy today. The level of automation and API type stuff varies widely of course.

The point here is I don't believe the Software side of things means it has to be totally agnostic as to where it runs - it just needs a standard interfaces in which anything can address it(APIs, storage protocols etc). It's certainly nice to have the ability to run such a storage system entirely as a VM, there are use cases for that for certain. But I don't want to limit things to just that. So perhaps more focus on the experience the end user gets rather than how you get there. Something like that.

StoreVirtual

HP's take on it is of course basically storage resources provisioned through VSAs. Key points being:

  • Software only (they also offer a hardware+software integrated package so...)
  • Hypervisor agnostic (right now that includes VMware and Hyper-V so not fully agnostic!)
  • Federated

I have been talking with HP off and on for months now about how confusing I find their messaging around this.

In one corner we have:

3PAR Eliminating boundaries.

3PAR Eliminating boundaries.

In the other corner we have

Software Defined Data Center - Storage

Software Defined Data Center - Storage (3PAR is implied to be Service Refined Storage - Storevirtual is Cost Optimized)

Store Virtual key features

Store Virtual key features

Mixed messages

(thinking from a less technical user's perspective - specifically thinking back to some of my past managers who thought they knew what they were doing when they really didn't - I'm looking out for other folks like me in the field who don't want their bosses to get the wrong message when they see something like this)

What's the difference between 3PAR's SLA Optimized storage when value matters, and StoreVirtual Cost Optimized?

Hardware agnostic and federated sounds pretty cool, why do I need 3PAR when I can just scale out with StoreVirtual? Who needs fancy 3PAR Peer Persistence (fancy name for transparent full array high availability) when I have built in disaster recovery on the StoreVirtual platform?

Expensive 3PAR software licensing? StoreVirtual is all inclusive! The software is the same right? I can thin provision, I can snapshot, I can replicate, I can peer motion between StoreVirtual systems. I have disaster recovery, I have iSCSI, I have Fibre channel. I have scale out and I have a fancy shared-nothing design. I have Openstack integration. I have flash support, I have tiering, I have all of this with StoreVirtual. Why did HP buy 3PAR when they already have everything they need for the next generation of storage?

(stepping back into technical role now)

Don't get me wrong - I do see some value in the StoreVirtual platform! It's really flexible, easy to deploy, and can do wonders to lower costs in certain circumstances - especially branch office type stuff. If you can deploy 2-3 VM servers at an edge office and leverage HA shared storage without a dedicated array I think that is awesome.

But the message for data center and cloud deployments - do I run StoreVirtual as my primary platform or run 3PAR ?  The messaging is highly confusing.

My idea to HP on StoreVirtual vs. 3PAR

I went back and forth with HP on this and finally, out of nowhere I had a good idea which I gave to them and it sounds like they are going to do something with it.

So my idea was this - give the customer a set of questions, and based on the answers of those questions HP would know which storage system to recommend for that use case. Pretty simple idea. I don't know why I didn't come up with it earlier (perhaps because it's not my job!). But it would go a long way in cleaning up that messaging around which platform to use. Perhaps HP could take the concept even further and update the marketing info to include such scenarios (I don't know how that might be depicted, assuming it can be done so in a legible way).

When I gave that idea, several people in the room liked it immediately, so that felt good :)

 HP StoreOnce

(This segment of the market I am not well versed in at all, so my error rate is likely to go up by a few notches)

HP StoreOnce is of course their disk-based scale-out dedupe system developed by HP Labs. One fairly exciting offering in this area that was recently announced at HP Discover is the StoreOnce VSA. Really nice to see the ability to run StoreOnce as a VM image for small sites.

They spent a bunch of time talking about how flexible the VSA is, how you can vMotion it and Storage vMotion it like it was something special. It's a VM, it's obvious you should be able to do those things without a second thought.

StoreOnce is claimed to have a fairly unique capability of being able to replicate between systems without ever re-hydrating the data. They also claim a unique ability to be the first platform to offer real high availability. In a keynote session by David Scott (which I will cover in more depth in a future post once I get copies of those presentations) he mentioned that Data Domain as an example, if a DD controller fails during a backup or a restore operation the operation fails and must be restarted after the controller is repaired.

This surprised me - what's the purpose of dual controllers if not to provide some level of high availability? Again forgive my ignorance on the topic as this is not an area of the storage industry I have spent much time at all in.

HP StoreOnce however can recover from this without significant downtime. I believe the backup or restore job still must be restarted from scratch, but you don't have to wait for the controller to be repaired to continue with work.

HP has claimed for at least the past year to 18 months now that their performance far surpasses everyone else by a large margin, they continued those claims this week. I believe I read at one point from their competition that the claims were not honest in that I believe the performance claims was from a clustered StoreOnce system which has multiple de-dupe domains(meaning no global dedupe on the system as a whole), and it was more like testing multiple systems in parallel against a single data domain system(with global dedupe). I think there were some other caveats as well but I don't recall specifics (this is from stuff I want to say I read 18-20 months ago).

In any case, the product offering seems pretty decent, is experiencing a good amount of growth and looks to be a solid offering in the space. Much more competitive in the space than StoreAll is, probably not quite as competitive as 3PAR, perhaps a fairly close 2nd as far as strength of product offering.

Next up, converged storage management and Openstack with HP. Both of these topics are very light relative to the rest of the material, I am going to go back to the show room floor to see if I can dig up more info.

 

30Jul/13Off

HP Storage Tech Day – 3PAR

TechOps Guy: Nate

Before I forget again..

Travel to HP Storage Tech Day/Nth Generation Symposium was paid for by HP; however, no monetary compensation is expected nor received for the content that is written in this blog.

So, HP hammered us with a good seven to eight hours of storage related stuff today, the bulk of the morning was devoted to 3PAR and the afternoon covered StoreVirtual, StoreOnce, StoreAll, converged management and some really technical bits from HP Labs.

This post is all about 3PAR. They covered other topics of course but this one took so long to write I had to call it a night, will touch on the other topics soon.

I won't cover everything since I have covered a bunch of this in the past. I'll try not to be too repetitive...

I tried to ask as many questions as I could, they answered most .. the rest I'll likely get with another on site visit to 3PAR HQ after I sign another one of those Nate Disclosure Agreements (means I can't tell you unless your name is Nate). I always feel guilty about asking questions directly to the big cheeses at 3PAR. I don't want to take up any of their valuable time...

There wasn't anything new announced today of course, so none of this information is new, though some of is new to this blog, anyway!

I suppose if there is one major take away for me for this SSD deep dive, is the continued insight into how complex storage really is, and how well 3PAR does at masking that complexity and extracting the most of everything out of the underlying hardware.

Back when I first started on 3PAR in late 2006, I really had no idea what real storage was. As far as I was concerned one dual controller system with 15K disks was the same as the next. Storage was never my focus in my early career (I did dabble in a tiny bit of EMC Clariion (CX6/700) operations work - though when I saw the spreadsheets and visios the main folks used to plan and manage I decided I didn't want to get into storage), it was more servers, networking etc.

I learned a lot in the first few years of using 3PAR, and to a certain extent you could say I grew up on it. As far as I am concerned being able to wide stripe, or have mesh active controllers is all I've ever (really) known. Sure since then I have used a few other sorts of systems. When I see architectures and processes of doing things on other platforms I am often sort of dumbfounded why they do things that way. It's sometimes not obvious to me that storage used to be really in the dark ages many years ago.

Case in point below, there's a lot more to (efficient, reliable, scalable, predictable) SSDs than just tossing a bunch of SSDs into a system and throwing a workload at them..

I've never tried to proclaim I am a storage expert here(or anywhere) though I do feel I am pretty adept at 3PAR stuff at least, which wasn't a half bad platform to land on early on in the grand scheme of things. I had no idea where it would take me over the years since. Anyway, enough about the past....

New to 3PAR

Still the focus of the majority of HP storage related action these days, they had a lot to talk about. All of this initial stuff isn't there yet(up until the 7450 stuff below), just what they are planning for at some point in the future(no time frames on anything that I recall hearing).

Asynchronous Streaming Replication

Just a passive mention of this on a slide, nothing in depth to report about, but I believe the basic concept is instead of having asynchronous replication running on snapshots that kick off every few minutes (perhaps every five minutes) the replication process would run much more frequently (but not synchronous still), perhaps as frequent as every 30 seconds or something.

I've never used 3PAR replication myself. Never needed array based replication really. I have built my systems in ways that don't require array based replication. In part because I believe it makes life easier(I don't build them specifically to avoid array replication it's merely a side effect), and of course the license costs associated with 3PAR replication are not trivial in many circumstances(especially if your only needing to replicate a small percentage of the data on the system). The main place where I could see leveraging array based replication is if I was replicating a large number of files, doing this at the block layer is often times far more efficient(and much faster) than trying to determine changed bits from a file system perspective.

I wrote/built a distributed file transfer architecture/system for another company a few years ago that involved many off the shelf components(highly customized) that was responsible for replicating several TB of data a day between WAN sites, it was an interesting project and proved to be far more reliable and scalable than I could of hoped for initially.

Increasing Maximum Limits

I think this is probably out of date, but it's the most current info I could dig up on HP's site. Though this dates back to 2010. These pending changes are all about massively increasing the various supported maximum limits of various things. They didn't get into specifics. I think for most customers this won't really matter since they don't come close to the limits in any case(maybe someone from 3PAR will read this and send me more up to date info).

3PAR OS 2.3.1 supported limits

3PAR OS 2.3.1 supported limits(2010)

The PDF says updated May 2013, though the change log says last update is December. HP has put out a few revisions to the document(which is the Compatibility Matrix) which specifically address hardware/software compatibility, but the most recent Maximum Limits that I see are for what is now considered quite old - 2.3.1 release - this was before their migration to a 64-bit OS (3.1.1).

Compression / De-dupe

They didn't talk about it, other than mention it on a slide, but this is the first time I've seen HP 3PAR publicly mention the terms. Specifically they mention in-line de-dupe for file and block, as well as compression support. Again, no details.

Personally I am far more interested in compression than I am de-dupe. De-dupe sounds great for very limited workloads like VDI(or backups, which StoreOnce has covered already). Compression sounds like a much more general benefit to improving utilization.

Myself I already get some level of "de duplication" by using snapshots. My main 3PAR array runs roughly 30 MySQL databases entirely from read-write snapshots, part of the reason for this is to reduce duplicate data, another part of the reason is to reduce the time it takes to produce that duplicate data for a database(fraction of a second as opposed to several hours to perform a full data copy).

File + Object services directly on 3PAR controllers

No details here other than just mentioning having native file/object services onto the existing block services. They did mention they believe this would fit well in the low end segment, they don't believe it would work well at the high end since things can scale in different ways there. Obviously HP has file/object services in the IBRIX product (though HP did not get into specifics what technology would be used other than taking tech from several areas inside HP), and a 3PAR controller runs Linux after all, so it's not too far fetched.

I recall several years ago back when Exanet went bust, I was trying to encourage 3PAR to buy their assets as I thought it would of been a good fit. Exanet folks mentioned to me that 3PAR engineering was very protective of their stuff and were very paranoid about running anything other than the core services on the controllers, it is sensitive real estate after all.  With more recent changes such as supporting the ability to run their reporting software(System Reporter) directly on the controller nodes I'm not sure if this is something engineering volunteered to do themselves or not. Both approaches have their strengths and weaknesses obviously.

Where are 3PAR's SPC-2 results?

This is a question I asked them (again). 3PAR has never published SPC-2 results. They love to tout their SPC-1, but SPC-2 is not there....... I got a positive answer though: Stay tuned.  So I have to assume something is coming.. at some point. They aren't outright disregarding the validity of the test.

In the past 3PAR systems have been somewhat bandwidth constrained due to their use of PCI-X. Though the latest generation of stuff (7xxx/10xxx) all leverage PCIe.

The 7450 tops out at 5.2 Gigabytes/second of throughput, a number which they say takes into account overhead of a distributed volume system (it otherwise might be advertised as 6.4 GB/sec as a 2-node system does 3.2GB/sec). Given they admit the overhead to a distributed system now, I wonder how, or if, that throws off their previous throughput metrics of their past arrays.

I have a slide here from a few years ago that shows a 8-controller T800 supporting up to 6.4GB/sec of throughput, and a T400 having 3.2GB/sec (both of these systems were released in Q3 of 2008). Obviously the newer 10400 and 10800 go higher(don't recall off the top of my head how much higher).

This compares to published SPC-2 numbers from IBM XIV at more than 7GB/sec, as well as HP P9500/HDS VSP at just over 13GB/sec.

3PAR 7450

Announced more than a month ago now, the 7450 is of course the purpose built flash platform which is, at the moment all SSD.

Can it run with spinning rust?

One of the questions I had, was I noticed that the 7450 is currently only available in a SSD-only configuration. No spinning rust is supported. I asked why this was and the answer was pretty much what I expected. Basically they were getting a lot of flak for not having something that was purpose built. So at least in the short term, the decision not to support spinning rust is purely a marketing one. The hardware and software is the same(other than being more beefy in CPU & RAM) than the other 3PAR platforms. The software is identical as well. They just didn't want to give people more excuses to label the 3PAR architecture as something that wasn't fully flash ready.

It is unfortunate that the market has compelled HP to do this, as other workloads would still stand to gain a lot especially with the doubling up of data cache on the platform.

Still CPU constrained

One of the questions asked by someone was about whether or not the ASIC is the bottleneck in the 7450 I/O results. The answer was a resounding NO - the CPU is still the bottleneck even at max throughput. So I followed up with why did HP choose to go with 8 core CPUs instead of 10-core which Intel of course has had for some time. You know how I like more cores! The answer was two fold to this. The primary reason was cooling(the enclosure as is has two sockets, two ASICs, two PCIe slots, 24 SSDs, 64GB of cache and a pair of PSUs in 2U). The second answer was the system is technically Ivy-bridge capable but they didn't want to wait around for those chips to launch before releasing the system.

They covered a bit about the competition being CPU limited as well especially with data services, and the amount of I/O per CPU cycle is much lower on competing systems vs 3PAR and the ASIC.  The argument is an interesting one though at the end of the day the easy way to address that problem is throw more CPUs at it, they are fairly cheap after all. The 7000-series is really dense so I can understand the lack of ability to support a pair of dual socket systems within a 2U enclosure along with everything else. The 10400/10800 are dual socket(though older generation of processors).

TANGENT TIME

I really have not much cared for Intel's code names for their recent generation of chips. I don't follow CPU stuff all that closely these days(haven't for a while), but I have to say it's mighty easy to confuse code name A from B, which is newer? I have to look it up. every. single. time.

I believe in the AMD world (AMD seems to have given up on the high end, sadly), while they have code names, they have numbers as well. I know 6200 is newer than 6100 ..6300 is newer than 6200..it's pretty clear and obvious. I believe this goes back to Intel and them not being able to trademark the 486.

On the same note, I hate Intel continuing to re-use the code word i7 in laptops. I have an Core i7 laptop from 3 years ago, and guess what the top end today still seems to be? I think it's i7 still. Confusing. again.
</ END TANGENT >

Effortless SSD management of each SSD with proactive alerts

I wanted to get this in before going deeper into the cache optimizations since that is a huge topic. But the basic gist of this is they have good monitoring of the wear of the SSDs in the platform(something I think that was available on Lefthand a year or two ago), in addition to that the service processor (dedicated on site appliance that monitors the array) will alert the customer when the SSD is 90% worn out. When the SSD gets to 95% then the system pro-actively fails the drive and migrates data off of it(I believe). They raised a statistic that was brought up at Discover that something along the lines of 95% of all deployed SSDs in 3PAR were still in the field - very few have worn out. I don't recall anyone mentioning the # of SSDs that have been deployed on 3PAR but it's not an insignificant number.

SSD Caching Improvements in 3PAR OS 3.1.2

There have been a number of non trivial caching optimizations in the 3PAR OS to maximize performance as well as life span of SSDs. Some of these optimizations also benefit spinning rust configurations as well - I have personally seen a noticeable drop in latency in back end disk response time since I upgraded to 3.1.2 back in May(it was originally released in December), along with I believe better response times under heavy load on the front end.

Bad version numbers

I really dislike 3PAR's version numbering, they have their reasons for doing what they do, but I still think it is a really bad customer experience. For example going from 2.2.4 to 2.3.1 back in what was it 2009 or 2010. The version number implies minor update, but this was a MASSIVE upgrade.  Going from 2.3.x to 3.1.1 was a pretty major upgrade too (as the version implied).  3.1.1 to 3.1.2 was also a pretty major upgrade. On the same note the 3.1.2 MU2 (patch level!) upgrade that was released last month was also a major upgrade.

I'm hoping they can fix this in the future, I don't think enough effort is made to communicate major vs minor releases. The version numbers too often imply minor upgrades when in fact they are major releases. For something as critical as a storage system I think this point is really important.

Adaptive Read Caching

Adaptive Read Caching

3PAR Adaptive Read Caching for SSD (the extra bits being read there from the back end are to support the T10 Data Integrity Feature- available standard on all Gen4 ASIC 3PAR systems, and a capability 3PAR believes is unique in the all flash space for them)

One of the things they covered with regards to caching with SSD is the read cache is really not as effective(vs with spinning rust), because the back end media is so fast, there is significantly less need for caching reads. So in general, significantly more cache is used with writes.

For spinning rust 3PAR reads a full 16kB of data from the back end disk regardless of the size of the read on the front end (e.g. 4kB). This is because the operation to go to disk is so expensive already and there is no added penalty to grab the other 12kB while your grabbing the 4kB you need. The next I/O request might request part of that 12kB and you can save yourself a second trip to the disk when doing this.

With flash things are different. Because the media is so fast, you are much more likely to become bandwidth constrained rather than IOPS constrained.  So if for example you have that 500,000 4k read IOPS on the front end, and your performing those same 16kB read IOPS on the back end, that is, well 4x more bandwidth that is required to perform those operations. Again because the flash is so fast, there is significantly less penalty to go back to SSD again and again to retrieve those smaller blocks. It also improves latency of the system.

So in short, read more from disks because you can and there is no penalty, read only what you need from SSDs because you should and there is (almost) no penalty.

Adaptive Write Caching

Adaptive Write Caching

Adaptive Write Caching

With writes the situation is similar to reads, to maximize SSD life span, and minimize latency you want to minimize the number of write operations to the SSD whenever possible.

With spinning rust again 3PAR works with 16kB pages, if a 4kB write comes in then the full 16kB is written to disk, again  because there is no additional penalty for writing the 16kB vs writing 4kB. Unlike SSDs your not likely bandwidth constrained when it comes to disks.

With SSDs, the optimizations they perform, again to maximize performance and reduce wear, is if a 4kB write comes in, a 16kB write occurs to the cache, but only the 4kB of changed data is committed to the back end.

If I recall right they mentioned this operation benefits RAID 1 (anything RAID 1 in 3PAR is RAID 10, same for RAID 5 - it's RAID 50) significantly more than it benefits RAID 5/6, but it still benefits RAID 5/6.
 

Autonomic Cache offload

Autonomic Cache Offload

Autonomic Cache Offload

Here the system changes the frequency at which it flushes cache to back end media based on utilization. I think this plays a lot into the next optimization.

 

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Multi Tenant I/O Processing

3PAR has long been about multi tenancy of their systems. The architecture lends itself well to running in this mode though it wasn't perfect, I believe for the most part the addition of Priority Optimization that was announced late last year and finally released last month fills the majority of the remainder of that hole. I have run "multi tenant" 3PAR systems since the beginning. Now to be totally honest the tenants were all me, just different competing workloads, whether it is disparate production workloads or a mixture of production and non production(and yes in all cases they ran on the same spindles). It wasn't nearly as unpredictable as say a service provider with many clients running totally different things, that would sort of scare me on any platform. But there was still many times where rogue things (especially horrible SQL queries) overran the system (especially write cache). 3PAR handles it as well, if not better than anyone else but every system has it's limits.

Front end operations

The caching flushing process to back end media is now multi threaded. This benefits both SSD as well as existing spinning rust configurations. Significantly less(no?) locking involved when flushing cache to disk.

Here is a graph from my main 3PAR array, you can see the obvious latency drop from the back end spindles once 3.1.2 was installed back in May (again the point of this change was not to impact back end disk latency as much as it was to improve front end latency, but there is a significant positive behavior change post upgrade):

Latency Change on back end spinning rust with 3.1.2
Latency Change on back end spinning rust with 3.1.2

There was a brief time when latency actually went UP on the back end disks. I was concerned at first but later determined this was the disk defragmentation processes running(again with improved algorithms), before the upgrade they took FAR too long, post upgrade they completed a big backlog in a few days and latency returned to low levels.

Multi Tenant Caching

Multi Tenant Caching

Back end operations

On the topic of multi tenant with SSDs an interesting point was raised which I had never heard of before. They even called it out as being a problem specific to SSDs, and does not exist with spinning rust. Basically the issue is if you have two workloads going to the same set of SSDs, one of them issuing large I/O requests(e.g. sequential workload), and the other issuing small I/O requests(e.g. 4kB random read), the smaller I/O requests will often get stuck behind the larger ones causing increases in latency to the app using smaller I/O requests.

To address this, the 128kB I/Os are divided up into four 32kB I/O requests and sent in parallel to the other workload. I suppose I can get clarification but I assume for a sequential read operation with 128kB I/O request there must not be any additional penalty for grabbing the 32kB, vs splitting it up even further into even more smaller I/Os.
 
 
 

Maintaining performance during media failures

3PAR has always done wide striping, and sub disk distributed RAID so the rebuild times are faster, the latency is lower and all around things run better(no idle hot spares) that way vs the legacy designs of the competition. The system again takes additional steps now to maximize SSD life span by optimizing the data reads and writes under a failure condition.

HP points out that SSDs are poor at large sequential writes, so as mentioned above they divide the 128kB writes that would be issued during a rebuild operation (since that is largely a sequential operation) into 32kB I/Os again to protect those smaller I/Os from getting stuck behind big I/Os.

They also mentioned that during one of the SPC-1 tests (not sure if it was 7400 or 7450) one of the SSDs failed and the system rebuilt itself. They said there was no significant performance hit(as one might expect given experience with the system) as the test ran. I'm sure there was SOME kind of hit especially if you drive the system to 100% of capacity and suffer a failure. But they were pleased with the results regardless. The competition would be lucky to have something similar.

What 3PAR is not doing

When it comes to SSDs and caching something 3PAR is not doing, is leveraging SSDs to optimize back end I/Os to other media as sequential operations. Some storage startups are doing this to gain further performance out of spinning rust while retaining high random performance using SSD. 3PAR doesn't do this and I haven't heard of any plans to go this route.

Conclusion

I continue to be quite excited about the future of 3PAR, even more so pre acquisition. HP has been able to execute wonderfully on the technology side of things. Sales from all accounts at least on the 7000 series are still quite brisk. Time will tell if things hold up after EVA is completely off the map, but I think they are doing many of the right things. I know even more of course but can't talk about it here(yet)!!!

That's it for tonight, at ~4,000 (that number keeps going up, I should goto bed) words this took three hours or more to write+proof read, it's also past 2AM. There is more to cover, the 3PAR stuff was obviously what I was most interested in. I have a few notes from the other sessions but they will pale in comparison to this.

Today I had a pretty good idea on how HP could improve it's messaging around whether to choose 3PAR or StoreVirtual for a particular workload. The messaging to-date to me has been very confusing and conflicting (HP tried to drive home a point about single platforms and reducing complexity, something this dual message seems to conflict with).  I have been communicating with HP off and on for the past few months, and today out of the blue I  came up with this idea which I think will help clear the air. I'll touch on this soon when I cover the other areas that were talked about today.

Tomorrow seems to be a busy day, apparently we have front row seats, and the only folks with power feeds. I won't be "live blogging"(as some folks tend to love to do), I'll leave that to others. I work better at spending some time to gather thoughts and writing something significantly longer.

If you are new to this site you may want to check out a couple of these other articles I have written about 3PAR(among the dozens...)

Thanks for reading!

29Jul/13Off

HP Storage Tech Day Live

TechOps Guy: Nate

In about seven and a half hours the HP tech day event will be starting, I thought it was going to be a private event but it looks like they will be broadcasting it via one of the folks here is adept at that sort of thing.

If your interested, the info is here. It starts at 8AM Pacific. Fortunately it's literally downstairs from my hotel room.

Topics include

  • 3par deep dive (~3 hrs worth)
  • StoreVirtual  (Lefthand VSA), StoreOnce VSA, and Open stack integration
  • StoreAll (IBRIX), express query(Autonomy)

No Vertica stuff.. Vertica doesn't get any storage people excited since it is so fast and reduces I/O by so much.. so you don't need really fancy storage stuff to make it fly.

HP asked that I put this notice on these posts so the FCC doesn't come after us..

Travel to HP Storage Tech Day/Nth Generation Symposium was paid for by HP; however, no monetary compensation is expected nor received for the content that is written in this blog.

(Many folks have accused me of being compensated by 3PAR and/or HP in the past based on what I have written here but I never have been - by no means do I love HP as a whole there are really only a few product lines that have me interested which is 3PAR, Proliant, and Vertica - ironically enough each of those came to HP via acquisition). I have some interest in StoreOnce though have yet to use it. (Rust in Peace WebOS -- I think I will be on Android before the end of the year - more on that later..)

I'm pretty excited about tomorrow (well today given that it's almost 1AM), though getting up so early is going to be a challenge!

Apparently I'm the only person in the group here that is not on twitter. I don't see that changing anytime soon. Twitter and Facebook are like the latest generation of Star Trek movies, they basically don't exist to me.

The one thing that I am sort of curious about is what, if any plans HP has for the P9500 range, they don't talk about it much.. I'm sure they won't come out and say they are retiring it any time soon, since it's still fairly popular with select customers. I just want to try to get them to say something about it, I am curious.

(this is my first trip to a vendor-supported event that included travel)