TechOpsGuys.com Diggin' technology every day

10Aug/13

The Myth of online backup and the future of my mobility

TechOps Guy: Nate

I came across this article on LinkedIn which I found very interesting. The scenario in the article: a professional photographer had 500GB of data to back up, and decided to try Carbonite to do it.

The problem was that Carbonite apparently imposes significant throttling on users uploading large amounts of data -

[..]At that rate, it takes nearly two months just to upload the first 200GB of data, and then another 300 days to finish uploading the remaining 300GB.
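Taking the article's numbers at face value, the implied throttled upload rates work out to a few tens of kilobytes per second. A quick back-of-the-envelope sketch (assuming decimal gigabytes and roughly 60 days for "nearly two months"):

```python
# Effective upload rates implied by the quote above.
GB = 1e9  # decimal gigabytes

first_rate = 200 * GB / (60 * 86400)    # first 200GB in ~two months
second_rate = 300 * GB / (300 * 86400)  # remaining 300GB in 300 days

print(f"first 200GB: ~{first_rate / 1000:.0f} kB/s")
print(f"last 300GB:  ~{second_rate / 1000:.0f} kB/s")
```

Roughly 39 kB/s at first, dropping to about 12 kB/s: slower than a dial-up-era leased line for a paid backup service.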

Which takes me back to a conversation I was having with my boss earlier in the week about why I decided to buy my own server and put it in a co-location facility, instead of using some sort of hosted thing.

I have been hosting my own websites, email etc since about 1996. At one point I was hosted on T1s at an office building, then I moved things to my business class DSL at home for a few years, then when that was no longer feasible I got a used server and put it up at a local colo in Seattle. Then I decided to retire that old server (built in 2004) and spent about a year in the Terremark vCloud, before buying a new server and putting it up at a colo in the Bay area where I live now.

My time in the Terremark cloud was OK, my needs were pretty minimal, but I didn't have a lot of flexibility(due to the costs). My bill was around $120/mo or something like that for a pair of VMs. Terremark operates in a Tier 4 facility and doesn't use the built to fail model I hate so much, so I had confidence things would get fixed if they ever broke, so I was willing to pay some premium for that.

Cloud or self hosting for my needs?

I thought hard about whether or not to invest in a server+colo again or stay on some sort of hosted service. The server I am on today was $2,900 when I bought it, which is a decent amount of money for me to toss around in one transaction.

Then I had the idea of storing data off site. I don't have much that is critical, mostly media files and stuff that would take a long time to re-build in case of a major failure. But I wanted something that could hold at least 2-3TB of storage.

So I started looking into what this would cost in the cloud. I was sort of shocked I guess you could say. The cost for regular, protected cloud storage was going to easily be more than $200/mo for 3TB of usable space.

Then there are backup providers like Carbonite, Mozy, Backblaze etc. I read a comment (on Slashdot I think it was) about Backblaze and was pretty surprised to then read their fine print -

Your external hard drives need to be connected to your computer and scanned by Backblaze at least once every 30 days in order to keep them backed up.

So the data must be scanned at least once every 30 days or it gets nuked.

They also don't support backing up network drives. Most of the providers of course don't support Linux either.

The terms do make sense to me, I mean it costs $$ to run, and they advertise unlimited, so I don't expect them to be storing TBs of data for only $4/mo. It just would be nice if they (and others) were clearer about their limitations up front. At least, unlike the person in the article above, I was able to make a more informed decision.

The only real choice: Host it myself

So the decision was really simple at that point: go invest and do it myself. It's sort of ironic if you think about it, with all this talk about cloud saving people money. Here I am, just one person with no purchasing power whatsoever, and I am saving more money doing it myself than some massive scale service provider can offer.

The point wasn't just the storage though. I wanted something to host:

  • This blog
  • My email
  • DNS
  • My other websites / data
  • A place to experiment/play would be nice as well

So I bought this server, which has a single socket quad core Intel chip, originally with 8GB (now 16GB) of memory, and 4x2TB SAS disks in RAID 1+0 (~3.6TB usable) with a 3Ware hardware RAID controller (I've been using 3Ware since 2001). It has dual power supplies (though both are connected to the same source; my colo config doesn't offer redundant power). It even has out of band management with full video KVM and virtual media options. Nothing like the quality of HP iLO, but far better than what systems at this price point offered just a few years ago.

On top of that I am currently running six VMs

  • VM #1 runs my personal email, DNS, websites, this blog etc
  • VM #2 runs email for a few friends, and former paying customers(not sure how many are left) from an ISP that we used to run many years ago, DNS, websites etc
  • VM #3 is an OpenBSD firewall running in layer 3 mode; it also provides a site to site VPN to my home, as well as an end-user VPN for my laptop when I'm on the road
  • VM #4 acts as a storage/backup server for my home data with a ~2TB file system
  • VM #5 is a windows VM in case I need one of those remotely. It doesn't get much use.
  • VM #6 is the former personal email/dns/website server that ran a 32-bit OS. Keeping it around on an internal IP for a while in case I come across more files that I forgot to transfer.

There is an internal and an external network on the server, the site to site VPN of course provides unrestricted access to the internal network from remote which is handy since I don't have to rely on external IPs to run additional things. The firewall also does NAT for devices that are not on external IPs.

As you might expect, the server sits at low CPU usage 99% of the time and uses around 9GB of memory, so I can toss on more VMs if needed. It's a very flexible configuration.

When I got the server originally I decided to host it with the company I bought it from, and they charged me $100/mo to do it. Unlimited bandwidth etc.. good deal (also free on site support)! First thing I did was take the server home and copy 2TB of data onto it. Then I gave it back to them and they hosted it for a year for me.

Then they gave me the news they were going to terminate their hosting and I had only two weeks to get out. I evaluated my options and decided to stay at the same facility, but started doing business with the facility itself (Hurricane Electric). The downside was that the cost doubled to $200/mo for the same service (100Mbit unlimited w/5 static IPs), since I was no longer sharing the service with anyone else. I did get a third of a rack though, not that I can use much of it due to power constraints (I think I only get something like 200W). But in the grand scheme of things it is a good deal; it's a bit more than double what I was paying in the Seattle area, but I am getting literally 100 times the bandwidth. That gives me a lot of opportunities to do things. I've yet to do much with it beyond my original needs, though that may change soon.

Now granted it's not high availability. I don't have 3PAR storage like Terremark did when I was a customer, and I have only one server, so if it's down everything is down. It's been reliable though, providing really good uptime over the past couple of years. I have had to replace at least two disks, and I also had to replace the USB stick that runs vSphere; the previous one seemed to have run out of flash blocks, as I could no longer write much to the file system. That was a sizable outage for me, as I took the time to install vSphere 5.1 (from 4.x) on the new USB stick, re-configure things, and upgrade the memory all in one day; it took probably 4-5 hours I think. I'm connected to a really fast backbone and the network has been very reliable (not perfect, but easily good enough).

So my server was $2,900, and I pay currently $2,400/year for service. It's certainly not cheap, but I think it's a good deal still relative to other options. I maintain a very high level of control, I can store a lot of data, I can repair the system if it breaks down, and the solution is very flexible, I can do a lot of things with the virtualization as well as the underlying storage and the high bandwidth I have available to me.
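To put the economics in perspective, here is a rough three-year comparison using only the numbers in this post (the ~$120/mo Terremark bill for a pair of VMs, plus the ~$200/mo estimate for 3TB of protected cloud storage from earlier). Self-hosting already comes out ahead, before even counting the 100Mbit of bandwidth:

```python
# Rough 3-year cost comparison using this post's own figures.
years = 3

# Server purchase plus $2,400/year of colo service.
self_hosted = 2900 + 2400 * years

# Cloud equivalent: ~$120/mo for a pair of VMs plus ~$200/mo
# for 3TB of protected storage; no hardware outlay.
cloud = (120 + 200) * 12 * years

print(f"self-hosted over {years} years: ${self_hosted:,}")
print(f"cloud over {years} years:       ${cloud:,}")
```

That works out to $10,100 versus $11,520, and the gap widens every year after the server is paid off.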

Which brings me to next steps. Something I've always wanted to do is make my data more mobile; that is one area where it was difficult (or impossible) to compete with cloud services, especially on things like phones and tablets, since they have the software R&D to make those "apps" and other things.

I have been using WebOS for several years now, which of course runs on top of Linux. Though the underlying Linux OS is really too minimal to be of any use to me. It's especially useless on the phone where I am just saddened that there has never been a decent terminal emulation app released for WebOS. Of all the things that could be done, that one seems really trivial. But it never happened(that I could see, there were a few attempts but nothing usable as far as I could tell). On the touchpad things were a little different, you could get an Xterm and it was kind of usable, significantly more so than the phone. But still the large overhead of X11 just to get a terminal seemed quite wasteful. I never really used it very much.

So I have this server, and all this data sitting on a fast connection, but I didn't have a good way to get to it remotely unless I was on my laptop (except for the obvious things like this blog, which are web accessible).

Time to switch to a new mobile platform

WebOS is obviously dead (RIP). In the early days after HP terminated the hardware unit I was holding out some hope for the software end of things, but that hope has more or less dropped to 0 now; nothing remains but disappointment over what could have been. I think LG acquiring the WebOS team was a mistake, and even though they've announced a WebOS-powered TV to come out early next year, honestly I'll be shocked if it hits the market. It just doesn't make any sense to me to run WebOS on a TV outside of having a strong ecosystem of other WebOS devices to integrate with.

So as reality continued to set in, I decided to think about alternatives: what was going to be my next mobile platform? I don't trust Google, and don't like Apple. Blackberry and Windows Phone are the other major brands in the market, and I really haven't spent any time on any of those devices, so I suppose I won't know for sure. But I did feel that Samsung had been releasing some pretty decent hardware + software (based only on stuff I have read), and they obviously have good market presence. Some folks complain, but if I were to go to a Samsung Android platform I probably wouldn't have an issue. Those complaining about their platform probably don't understand the depression that WebOS has been in since about 6 months after it was released - really, anything relative to that is a step up.

I mean I can't even read my personal email on my WebOS device without using the browser. Using webmail via the browser on WebOS for me at least is a last resort thing, I don't do it often(because it's really painful - I bought some skins for the webmail app I use that are mobile optimized only to find they are not compatible with WebOS so when on WebOS I use a basic html web mail app, it gets the job done but..). The reason I can't use the native email client is I suppose in part my fault, the way I have my personal email configured is I have probably 200 email addresses and many of them go directly to different inboxes. I use Cyrus IMAP and my main account subscribes to these inboxes on demand. If I don't care about that email address I unsubscribe and it continues to get email in the background. WebOS doesn't support accessing folders via IMAP outside of the INBOX structure of a single account. So I'm basically SOL for accessing the bulk of my email (which doesn't go to my main INBOX). I have no idea if Samsung or Android works any different.
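The per-address routing described above is the sort of thing Cyrus IMAP handles with Sieve filtering ("fileinto" delivers a message straight to a named folder). As a purely hypothetical illustration (the addresses and folder names here are made up, not my real setup), the rules for a handful of addresses might be generated like this:

```python
# Hypothetical sketch: route each dedicated email address to its own
# IMAP folder via generated Sieve rules (Cyrus IMAP supports Sieve).
# The address-to-folder mapping below is invented for illustration.
routing = {
    "shopping@example.com": "INBOX.shopping",
    "mailinglists@example.com": "INBOX.lists",
    "somevendor@example.com": "INBOX.vendors",
}

def sieve_rules(mapping):
    """Emit one Sieve 'fileinto' rule per address."""
    rules = ['require ["fileinto"];']
    for addr, folder in sorted(mapping.items()):
        rules.append(
            f'if address :is "to" "{addr}" {{ fileinto "{folder}"; }}'
        )
    return "\n".join(rules)

print(sieve_rules(routing))
```

With mail landing in folders like this, the client-side question becomes whether the mail app can subscribe to arbitrary IMAP folders, which is exactly where the WebOS client falls down.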

The browser on the touchpad is old and slow enough that I keep javascript disabled on it; it's just a sad, decrepit state for WebOS these days (and has been for almost two years now). My patience really started running out recently when loading a 2-page PDF on my HP Pre3, then having the PDF reader constantly freeze (unable to flip between pages, though the page it was on was still very usable) if I let it sit idle for more than a couple of minutes (I have to restart the app). This was nothing big, just a 2-page PDF, and the phone couldn't even handle that.

I suppose my personal favorite problem is not being able to use bluetooth and 2.4Ghz wifi at the same time on my phone. The radios conflict, resulting in really poor quality over bluetooth or wifi or both. So wifi stays disabled the bulk of the time on my phone since most hotspots seem only to do 2.4Ghz, and I use bluetooth almost exclusively when I make voice calls.

There are tons of other pain points for me on WebOS, and I know they will never get fixed, those are just a couple of examples. WebOS is nice in other ways of course, I love the Touchstone (inductive charging) technology for example, the cards multitasking interface is great too(though I don't do heavy multi tasking).

So I decided to move on. I was thinking Android, I don't trust Google but, ugh, it is Linux based and I am a Linux user(I do have some Windows too but my main systems desktops, laptops are all Linux) and I believe Windows Phone and BlackBerry would likely(no, certainly) not play as well with Linux as Android. (WebOS plays very well with Linux, just plug it in and it becomes a USB drive, no restrictions - rooting WebOS is as simple as typing a code into the device). There are a few other mobile Linux platforms out there, I think Meego(?) might be the biggest trying to make a come back, then there is FirefoxOS and Ubuntu phone.. all of which feel less viable(in today's market) than WebOS did back in 2009 to me.

So I started thinking more about leaving WebOS, and I think the platform I will go to will be the Samsung Galaxy Note 3, some point after it comes out(I have read ~9/4 for the announcement or something like that). It's much bigger than the Pre3, not too much heavier(Note 2 is ~30g heavier). Obviously no dedicated keyboard, I think the larger screen will do well for typing with my big hands. The Samsung multimedia / multi tasking stuff sounds interesting(ability to run two apps at once, at least Samsung apps).

I do trust Samsung more than Google, mainly because Samsung wants my $$ for their hardware. Google wants my information for whatever it is they do..

I'm more than willing to trade money in a vain attempt to maintain some sort of privacy. In fact I do it all the time; I suppose that could be why I don't get much spam to my home address (snail mail). I also very rarely get phone calls from marketers (low single digits per year I think), even though I have never signed up for any do not call lists (I don't trust those lists).

Then I came across this comment on Slashdot -

Well I can counter your anecdote with one of my own. I bought my Galaxy S3 because of the Samsung features. I love multi-window, local SyncML over USB or WiFi so my contacts and calendar don't go through the "cloud", Kies Air for accessing phone data through the browser, the Samsung image gallery application, the ability to easily upgrade/downgrade/crossgrade and even load "frankenfirmware" using Odin3, etc. I never sign in to any Google services from my phone - I've made a point of not entering a Google login or password once.

So, obviously, I was very excited to read that.

Next up, and this is where the story comes back around to online backup, cloud, my co-lo, etc.. I didn't expect the post to be this long but it sort of got away from me again..

I think it was on another Slashdot comment thread actually (I read slashdot every day but never have had an account and I think I've only commented maybe 3 times since the late 90s), where someone mentioned the software package Owncloud.

Just looking at the features once again got me excited. They also have Android and iOS apps. So this would, in theory, from a mobile perspective allow me to access files, sync contacts, music, video, perhaps even calendar (not that I use one outside of work, which is Exchange), and keep control over all of it myself. There are also desktop sync clients (a la Dropbox) for Linux, Mac, and Windows.

So I installed it on my server; it was pretty easy to set up. I pointed it to my 2TB of data and off I went. I installed the desktop sync client on several systems (Ubuntu 10.04 was the most painful to install to; I had to compile several packages from source, but it's nothing I haven't done a million times before on Linux). The sync works well. I had to remove the default sync, which was to sync everything; at first it was trying to sync the full 2TB of data and it kept failing, not that I wanted to sync that much. I configured new sync directives for specific folders instead.

So that's where I'm at now. Still on WebOS, waiting to see what comes of the new Note 3 phone, I believe I saw for the Note 2 there was even a custom back cover which allowed for inductive charging as well.

It's sad to think of the $$ I dumped on WebOS hardware in the period of panic following the termination of the hardware division; I try not to think about it..... The touchpads do make excellent digital picture frames, especially when combined with a touchstone charger. I still use one of my touchpads daily (I have 3), and my phone daily as well. Though my data usage is quite small on the phone since there really isn't a whole lot I can do on it, unless I'm traveling and using it as a mobile hot spot.

whew, that was a lot of writing.

31Jul/13

HP Storage Tech Day – StoreAll, StoreVirtual, StoreOnce

TechOps Guy: Nate

Travel to HP Storage Tech Day/Nth Generation Symposium was paid for by HP; however, no monetary compensation is expected nor received for the content that is written in this blog.

On Monday I attended a private little HP Storage Tech Day event here in Anaheim for a bunch of bloggers. They also streamed it live, and I believe the video is available for download as well.

I wrote a sizable piece on the 3PAR topics which were covered in the morning, here I will try to touch on the other HP Storage topics.

HP StoreAll + Express Query

StoreAll Platform

HP doesn't seem to talk about this very much, and as time has gone on I have come to understand why. It's not a general purpose storage system; I suppose it never has been (though I expect Ibrix tried to make it one in their days of being independent). They aren't going after NetApp or EMC's enterprise NAS offerings. It's not a platform you want to run VMs on top of. Not a platform you want to run databases on top of. It may not even be a platform you want to run HPC stuff on top of. It's built for large scale bulk file and object storage.

They have tested scalability to 1,024 nodes and 16PB within a single name space. The system can scale higher, that's just the tested configuration. They say it can scale non disruptively and re-distribute existing data across new resources as those resources are added to the system. StoreAll can go ultra dense with near line SAS drives going up to roughly 500 drives in a rack (without any NAS controllers).

It's also available in a gateway version which can go in front of 3PAR, EVA and XP storage.

They say their main competition is EMC Isilon, which is another scale-out offering.

HP seems to have no interest in publishing performance metrics, including participating in SPECsfs (a benchmark that sorely lacks disclosures). The system has no SSD support at all.

The object store and file store, if I am remembering right, are not shared, so you have to access your data via the same means you stored it with. You can't have an application upload data to an object store and then give a user the ability to browse to that file using CIFS or NFS. To me this would be an important function to offer if your object and file stores are in fact on the same platform.

For that matter, I would expect a scale-out object store to do away with the concept of disk-based RAID and go with object level replication instead; many large scale object stores do this already. I believe I even read in El Reg that HP Labs is working on something similar (nothing about that was mentioned at the event). In StoreAll's case they are storing your objects in RAID, while denying you the flexibility of accessing them over file protocols, which is unfortunate.

From a competitive standpoint, I am not sure what features HP may offer that are unique from an object storage perspective that would encourage a customer to adopt StoreAll for that purpose. If it were me I would probably take a good hard look at something like Red Hat Storage server(I would only consider RHSS for object storage, nothing else) or other object offerings if I was to build a large scale object store.

Express Query (below) cannot currently run on object stores; that will come with a future release though.

Express Query

This was announced a while back. It seems to be a SQL database of sorts running on the storage controllers, with some hooks into the file system itself. It provides indexes of common attributes, as well as giving the user the ability to define custom attributes to search by. As a result you don't have to scan the entire file system to find files that match these criteria.

It is exposed as a RESTful API, which has its ups and downs. As an application developer you can take this and tie it into your application. It returns results in JSON format (developer friendly, hostile to users such as myself - remember my motto: "if it's not friends with grep or sed, it's not friends with me").
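To be fair, JSON results only take a few lines to flatten into something grep- and sed-friendly. A small sketch, where the result shape is entirely invented for illustration (the real Express Query response schema may differ):

```python
import json

# Invented example of what a metadata query result might look like;
# the actual Express Query response format is likely different.
response = json.loads("""
[
  {"path": "/share/photos/a.jpg", "size": 512000, "modified": "2013-07-01"},
  {"path": "/share/docs/report.pdf", "size": 204800, "modified": "2013-07-15"}
]
""")

# Flatten each record into one tab-separated line, friendly to grep/sed/awk.
lines = ["\t".join(str(rec[k]) for k in ("path", "size", "modified"))
         for rec in response]
print("\n".join(lines))
```

Piping query output through a tiny filter like this restores the one-record-per-line world that classic Unix tools expect.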

The concept is good; perhaps the implementation could use some more work, as-is it seems very developer oriented. There is a Java GUI app which you can run that can help you build and submit a query to the system, which is alright. I would like to see a couple more things:

  • A website on the storage system (with some level of authentication - you may want to leave some file results open to the "public" if those are public shares) that allows users to quickly build a query using a somewhat familiar UI approach.
  • A drop-in equivalent to the Linux command find. It would only support a subset of find's functionality, but I believe a large portion of it could be implemented fairly simply. The main point being: don't make users significantly alter their processes to adopt this feature. It's not complicated, lower the bar for adoption.
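The second idea above could be sketched as a thin translation layer: map familiar find-style criteria onto query parameters for the metadata API. Everything here is hypothetical (the parameter names are invented; the real Express Query API is not documented in this post), it just shows how small such a shim could be:

```python
# Hypothetical sketch of a find(1)-style front end: translate a couple
# of familiar find options into query parameters for a metadata REST
# API. The parameter names are invented, not the real Express Query API.
def find_to_query(name=None, mtime_within_days=None, size_over_bytes=None):
    """Build a query-parameter dict from find-style criteria."""
    params = {}
    if name is not None:
        params["filename_glob"] = name          # like: find -name '*.jpg'
    if mtime_within_days is not None:
        params["modified_within_days"] = mtime_within_days  # find -mtime -N
    if size_over_bytes is not None:
        params["min_size"] = size_over_bytes    # like: find -size +N
    return params

# e.g. the equivalent of: find . -name '*.jpg' -mtime -7
print(find_to_query(name="*.jpg", mtime_within_days=7))
```

The point is that the translation itself is trivial; the work is all on the vendor's side in exposing a stable query interface for such a tool to target.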

To HP's credit they have written some sort of plugin to the Windows search application that gives windows users the ability to easily use Express Query(I did not see this in action). Nothing so transparent exists for Linux though.

My main questions, though, were things HP was unable to answer. I expected more from HP on this front in general, specifically around competitive info. In many cases they seem to rely on the publicly available information on the competition's platforms - maybe limited to the data that is available on the vendor website. HP is a massive organization with very deep pockets - you may know where I'm going here. GO BUY THE PRODUCTS! Play with them, learn how they work, test them, benchmark them. Get some real world results out of the systems. You have the resources, you have the $$. I can understand that back when 3PAR was a startup they may not have been able to go around buying arrays from the competition to put them through their paces. Big 'ol HP should be doing that on a regular basis. Maybe they are -- if they are -- they don't seem to be admitting that their data is based on that (in fact I've seen a few times where they explicitly say the information is only from data sheets etc).

Another approach, if HP lacks the manpower to run such tests, might be to have partners or customers do it for them: offer to subsidize some customer's purchase of competitive equipment, in exchange for complete open access to the competitive information gathered from using the system. Or fully cover the cost. HP has the money to make it happen. It would be really nice to see.

So in regards to Express Query I had two main questions about performance relative to the competition. HP says they view Isilon as the main competition for StoreAll. A couple of years back Isilon started offering a product (maybe it is standard across the board now, I am not sure) where they stored the metadata in SSD. This would dramatically accelerate these sorts of operations without forcing the user to change their way of doing things. It lowers the bar of adoption. Price wise it probably costs more, and StoreAll does not have any special SSD support whatsoever. But I would be curious about the performance in general, comparing Isilon's accelerated metadata vs HP Express Query. Obviously Express Query is more flexible with its custom metadata fields and tagging, so for specific use cases there is no comparison. BUT.. for many things I think both would work well.

Along the same lines - back when I was a BlueArc customer, one of their big claims was that their FPGA accelerated file system had lightning fast metadata queries. So how does Express Query performance compare to something like that? I don't know, and got no answer from HP.

Overall

Overall, I'd love it if HP had a more solid product in this area. It feels like, whoever I talk to, they have just given up trying to be competitive with an in house enterprise file offering (they do have a file offering based on Windows Storage Server, but I don't really consider that in the same ballpark since they are just re-packaging someone else's product). HP StoreAll has its use cases and it probably does those fairly well, but it's not a general purpose file platform, and from the looks of things it's never going to be one.

Software Defined Storage

Just hearing the phrase Software Defined <anything> makes me shudder. Our friends over at El Reg have started using the term hype-gasm when referring to Software Defined Networking. I believe the SDS space is going to be even worse, at least for a while.

(On a side note there was an excellent presentation on SDN at the conference today which I will write about once I have the password to unlock the presentations so I can refresh my memory on what was on the slides - I may not get the password until Friday though)

As of today, the term is not defined at all. Everyone has their own take on it, and that pisses me off as a technical person. From a personal standpoint I am in the camp leaning more towards some separation of data and control planes ala SDN, but time will tell what it ends up being.

I think Software Defined Storage, if it is not some separation of control and data planes, could instead just be a subsystem of sorts that provides storage services to anything that needs them. In my opinion it doesn't matter if it's from a hardware platform such as 3PAR, or a software platform such as a VSA. The point is you have a pool of storage which can be provisioned in a highly dynamic & highly available manner to whatever requests it. At some point you have to buy hardware; having infrastructure that is purpose built, and shared, is obviously a commonly used strategy today. The level of automation and API type stuff varies widely of course.

The point here is I don't believe the Software side of things means it has to be totally agnostic as to where it runs - it just needs standard interfaces through which anything can address it (APIs, storage protocols etc). It's certainly nice to have the ability to run such a storage system entirely as a VM; there are use cases for that, for certain. But I don't want to limit things to just that. So perhaps more focus on the experience the end user gets rather than how you get there. Something like that.

StoreVirtual

HP's take on it is of course basically storage resources provisioned through VSAs. Key points being:

  • Software only (they also offer a hardware+software integrated package so...)
  • Hypervisor agnostic (right now that includes VMware and Hyper-V so not fully agnostic!)
  • Federated

I have been talking with HP off and on for months now about how confusing I find their messaging around this.

In one corner we have:

3PAR Eliminating boundaries.

In the other corner we have

Software Defined Data Center - Storage (3PAR is implied to be Service Refined Storage - StoreVirtual is Cost Optimized)

StoreVirtual key features

Mixed messages

(thinking from a less technical user's perspective - specifically thinking back to some of my past managers who thought they knew what they were doing when they really didn't - I'm looking out for other folks like me in the field who don't want their bosses to get the wrong message when they see something like this)

What's the difference between 3PAR's SLA Optimized storage when value matters, and StoreVirtual Cost Optimized?

Hardware agnostic and federated sounds pretty cool, why do I need 3PAR when I can just scale out with StoreVirtual? Who needs fancy 3PAR Peer Persistence (fancy name for transparent full array high availability) when I have built in disaster recovery on the StoreVirtual platform?

Expensive 3PAR software licensing? StoreVirtual is all inclusive! The software is the same right? I can thin provision, I can snapshot, I can replicate, I can peer motion between StoreVirtual systems. I have disaster recovery, I have iSCSI, I have Fibre channel. I have scale out and I have a fancy shared-nothing design. I have Openstack integration. I have flash support, I have tiering, I have all of this with StoreVirtual. Why did HP buy 3PAR when they already have everything they need for the next generation of storage?

(stepping back into technical role now)

Don't get me wrong - I do see some value in the StoreVirtual platform! It's really flexible, easy to deploy, and can do wonders to lower costs in certain circumstances - especially branch office type stuff. If you can deploy 2-3 VM servers at an edge office and leverage HA shared storage without a dedicated array I think that is awesome.

But the message for data center and cloud deployments - do I run StoreVirtual as my primary platform or run 3PAR ?  The messaging is highly confusing.

My idea to HP on StoreVirtual vs. 3PAR

I went back and forth with HP on this and finally, out of nowhere I had a good idea which I gave to them and it sounds like they are going to do something with it.

So my idea was this - give the customer a set of questions, and based on the answers of those questions HP would know which storage system to recommend for that use case. Pretty simple idea. I don't know why I didn't come up with it earlier (perhaps because it's not my job!). But it would go a long way in cleaning up that messaging around which platform to use. Perhaps HP could take the concept even further and update the marketing info to include such scenarios (I don't know how that might be depicted, assuming it can be done so in a legible way).

When I gave that idea, several people in the room liked it immediately, so that felt good :)

HP StoreOnce

(This segment of the market I am not well versed in at all, so my error rate is likely to go up by a few notches)

HP StoreOnce is of course their disk-based scale-out dedupe system developed by HP Labs. One fairly exciting offering in this area that was recently announced at HP Discover is the StoreOnce VSA. Really nice to see the ability to run StoreOnce as a VM image for small sites.

They spent a bunch of time talking about how flexible the VSA is, how you can vMotion it and Storage vMotion it like it was something special. It's a VM, it's obvious you should be able to do those things without a second thought.

StoreOnce is claimed to have a fairly unique capability: replicating between systems without ever re-hydrating the data. They also claim to be the first platform in this space to offer real high availability. In a keynote session by David Scott (which I will cover in more depth in a future post once I get copies of those presentations) he cited Data Domain as an example: if a DD controller fails during a backup or restore operation, the operation fails and must be restarted after the controller is repaired.

This surprised me - what's the purpose of dual controllers if not to provide some level of high availability? Again forgive my ignorance on the topic as this is not an area of the storage industry I have spent much time at all in.

HP StoreOnce, however, can recover from this without significant downtime. I believe the backup or restore job still must be restarted from scratch, but you don't have to wait for the controller to be repaired to continue working.
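The "replicate without re-hydrating" claim is conceptually simple to sketch: ship chunk fingerprints first, and transfer only the chunks the target store doesn't already hold. This is my own illustration of the general dedupe-replication idea, not HP's implementation:

```python
import hashlib

def chunk_fingerprints(chunks):
    """Build a deduplicated store: fingerprint -> chunk."""
    return {hashlib.sha256(c).hexdigest(): c for c in chunks}

def replicate(source_chunks, target_store):
    """Replicate a dedupe store without re-hydrating: compare
    fingerprints, then send only the chunks the target is missing."""
    source = chunk_fingerprints(source_chunks)
    missing = [fp for fp in source if fp not in target_store]
    for fp in missing:
        target_store[fp] = source[fp]  # only unique chunks cross the wire
    return len(missing)

# Target already holds one of the two chunks from a prior backup
target = chunk_fingerprints([b"common block"])
sent = replicate([b"common block", b"new block"], target)
print(sent)  # -> 1 (only the new chunk was transferred)
```

The data never gets expanded back to its full size at either end, which is why dedupe-aware replication can run over comparatively thin WAN links.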

HP has claimed for at least the past 12 to 18 months that their performance far surpasses everyone else's by a large margin, and they continued those claims this week. I believe I read at one point that their competition disputed the claims: the numbers apparently came from a clustered StoreOnce system with multiple dedupe domains (meaning no global dedupe across the system as a whole), making it more like testing multiple systems in parallel against a single Data Domain system (with global dedupe). I think there were some other caveats as well, but I don't recall specifics (this is from material I want to say I read 18-20 months ago).

In any case, the product offering seems pretty decent, is experiencing a good amount of growth and looks to be a solid offering in the space. Much more competitive in the space than StoreAll is, probably not quite as competitive as 3PAR, perhaps a fairly close 2nd as far as strength of product offering.

Next up, converged storage management and Openstack with HP. Both of these topics are very light relative to the rest of the material, I am going to go back to the show room floor to see if I can dig up more info.

 

29Jul/13Off

HP Storage Tech Day Live

TechOps Guy: Nate

In about seven and a half hours the HP tech day event will be starting. I thought it was going to be a private event, but it looks like they will be broadcasting it - one of the folks here is adept at that sort of thing.

If you're interested, the info is here. It starts at 8AM Pacific. Fortunately it's literally downstairs from my hotel room.

Topics include

  • 3PAR deep dive (~3 hrs worth)
  • StoreVirtual (Lefthand VSA), StoreOnce VSA, and OpenStack integration
  • StoreAll (IBRIX), Express Query (Autonomy)

No Vertica stuff. Vertica doesn't get storage people excited - it is so fast and reduces I/O by so much that you don't need really fancy storage to make it fly.

HP asked that I put this notice on these posts so the FCC doesn't come after us..

Travel to HP Storage Tech Day/Nth Generation Symposium was paid for by HP; however, no monetary compensation is expected nor received for the content that is written in this blog.

(Many folks have accused me of being compensated by 3PAR and/or HP in the past based on what I have written here, but I never have been - by no means do I love HP as a whole; there are really only a few product lines that have me interested, namely 3PAR, ProLiant, and Vertica - ironically enough, each of those came to HP via acquisition.) I have some interest in StoreOnce too, though I have yet to use it. (Rust in Peace WebOS - I think I will be on Android before the end of the year - more on that later..)

I'm pretty excited about tomorrow (well today given that it's almost 1AM), though getting up so early is going to be a challenge!

Apparently I'm the only person in the group here that is not on twitter. I don't see that changing anytime soon. Twitter and Facebook are like the latest generation of Star Trek movies, they basically don't exist to me.

The one thing that I am sort of curious about is what plans, if any, HP has for the P9500 range - they don't talk about it much. I'm sure they won't come out and say they are retiring it any time soon, since it's still fairly popular with select customers. I just want to try to get them to say something about it; I am curious.

(this is my first trip to a vendor-supported event that included travel)

2Dec/11Off

New record holder for inefficient storage – VMware VSA

TechOps Guy: Nate

I came across this article last night and was honestly pretty shocked. It talks about the limitations of the new VMware Virtual Storage Appliance that was released alongside vSphere 5. I think it is the second VSA to receive full VMware certification, after the HP/Lefthand P4000.

The article states

[..]
Plus, this capacity will be limited by a 75% storage overhead requirement for RAID data protection. Thus, a VSA consisting of eight 2 TBs would have a raw capacity of 16 TB, but the 75% redundancy overhead would result in a maximum usable capacity of 4 TB.

VMware documentation cites high availability as the reason behind VSA’s capacity limitations: “The VSA cluster requires RAID10 virtual disks created from the physical disks, and the vSphere Storage Appliance uses RAID1 to maintain the VSA datastores’ replicas,” resulting in effective capacity of just 25% of the total physical hard disk capacity.
[..]

That's pretty pathetic! Some folks bang on NetApp for being inefficient with space, and I've ragged on a couple of other vendors for the same, but this VSA sets a new standard. Well, there is this NEC system at 6% utilization, though in NEC's case that was by choice. The current VSA architecture forces the low utilization on you whether you want it or not.
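For what it's worth, the arithmetic in the quote is easy to verify - RAID 10 inside each node halves raw capacity, then the RAID 1 replica across the VSA cluster halves it again:

```python
def vsa_usable_tb(disk_count, disk_tb):
    """Usable capacity of the VMware VSA per the quoted article."""
    raw = disk_count * disk_tb        # e.g. eight 2 TB drives = 16 TB raw
    after_raid10 = raw / 2            # RAID 10 virtual disks within each node
    after_mirror = after_raid10 / 2   # RAID 1 datastore replica across nodes
    return after_mirror               # 25% of raw capacity survives

print(vsa_usable_tb(8, 2))  # -> 4.0 TB usable out of 16 TB raw
```

Two layers of mirroring stacked on top of each other is where the 75% overhead comes from.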

I don't doubt that VMware released the VSA "because they could" - I'm sure they designed it primarily for their field reps to show off the shared storage abilities of vSphere from laptops and the like (that was their main use of the Lefthand VSA when it first came out, at least). Given how crippled the VSA is (and it doesn't stop at low utilization - see the article for more), I can't imagine anyone wanting to use it at any price.

The HP Lefthand VSA seems like a much better approach - it's more flexible, has more fault tolerance options, and appears to have an entry level price of about half that of the VMware VSA.

The only thing less efficient that I have come across is utilization in Amazon EC2 - where disk utilization rates in the low single digits are very common due to the broken cookie cutter design of the system.

9Mar/11Off

Next Gen COPAN

TechOps Guy: Nate

About a year or so ago SGI bought COPAN for what seemed like fractional pennies on the dollar. They recently came out with the next generation of COPAN, and I'm still amazed at how much storage they can fit in a rack.

ArcFiniti comes in 5 factory-configured models to suit any archive environment. Lower-capacity models can be upgraded to higher capacity, maxing out at just over 1.4PB of usable archive in a single rack.

Full specifications don't seem to be disclosed at the moment. The original COPAN systems topped out at a hefty 3,000 pounds per rack - the only storage system I had heard of that weighed in at more than a 3PAR (about 2,000 pounds max per rack).

The original systems kept roughly 75% of the drives spun down at any given point.

 

28Oct/10Off

Compellent beats expectations

TechOps Guy: Nate

Earlier in the year Compellent's stock price took a big hit following lowered sales expectations, and a bunch of legal activity followed. It seems yesterday they redeemed themselves, though, with their stock going up nearly 33% after they tripled their profits or something.

I've had my eye on Compellent for a couple of years now; I don't remember where I first heard about them. They have similar technology to 3PAR, except it's implemented entirely in software using Intel CPUs, as far as I know, vs 3PAR leveraging ASICs (3PAR has Intel CPUs too, but they aren't used for much).

I have heard field reports that because of this their performance is much more on the lower end of things. They have never published a SPC-1 result, and I don't know anyone that uses them, so I don't know how they really perform.

They seem to use the same Xyratex enclosures that most everyone else uses. Compellent's controllers do seem to be somewhat on the low end; I really have nothing to go on other than cache. With their high-end controller coming in at only 3.5GB of cache (I assume 7GB mirrored for a pair of controllers?), it is very light on cache. The high end has a dual-core 3.0GHz CPU.

The lower amount of cache combined with their software-only design and only two CPUs per controller and the complex automated data movement make me think the systems are built for the lower end and not as scalable, but I'm sure perfectly useful for the market they are in.

Would be nice to see how (and whether) their software could scale if they were to put, say, a pair of 8 or 12 core CPUs in their controllers. After all, since they are leveraging x86 technology, performing such an upgrade should be pretty painless! Their controller specs have remained the same for a while now (as far back as I can remember). Bigger CPUs will use more power, but from a storage perspective I'm happy to give a few hundred more watts if I can get 5x+ the performance - I don't have to think once, let alone twice.

They were, I believe, the first to have automagic storage tiering, and for that they deserve big props - though again, no performance numbers posted (that I am aware of) that can illustrate the benefits this technology can bring to the table. I mean, if anybody can prove this strategy works it should be them, right? On paper it certainly sounds really nice, but in practice I don't know; I haven't seen indications that it's as ready as the marketing makes it out to be.

My biggest issue with automagic storage tiering is how fast the array can respond to "hot" blocks and optimize itself, which is why I think from a conceptual perspective I really like the EMC Fast Cache approach more (they do have FAST LUN and sub LUN tiering too). Not that I have any interest in using EMC stuff but they do have cool bits here and there.

Maybe Compellent is the next to get bought out (as a block storage company - yeah, I know they have their zNAS). I believe from a technology standpoint they are in a stronger position than the likes of Pillar or Xiotech.

Anyway that's my random thought of the day

10Oct/10Off

Intel or ASIC

TechOps Guy: Nate

Just another one of my random thoughts I have been having recently.

Chuck wrote a blog not too long ago how he believes everyone is going to go to Intel (or x86 at least) processors in their systems and move away from ASICs.

He illustrated his point with some recent SPEC SFS results showing an Intel-based system outperforming everything else. The results were impressive; the only flaw is that costs are not disclosed for SPEC. An EMC VMAX with 96 EFDs isn't cheap. And the better your disk subsystem, the faster your front end can be.

Back when Exanet was still around they showed me some results from one of their customers testing SPEC SFS on the Exanet LSI (IBM OEM'd) back end storage vs 3PAR storage, and for the same number of disks the SPEC SFS results were twice as high on 3PAR.

But that's not my point or question here. A couple of years ago NetApp posted some pretty dismal SPC-1 results for the EMC CX-3 with snapshots enabled - EMC doesn't do SPC-1, so NetApp did it for them. Interesting.

After writing up that Pillar article where I illustrated the massive efficiency gains of the 3PAR architecture (which are in part driven by their own custom ASICs), I got to thinking again, because as far as I can tell Pillar uses x86 CPUs.

Pillar offers multiple series of storage controllers to best meet the needs of your business and applications. The Axiom 600 Series 1 has dual-core processors and supports up to 24GB cache. The Axiom 600 Slammer Series 2 has quad-core processors and double the cache providing an increase in IOPS and throughput over the Slammer Series 1.

Now I can only assume they are using x86 processors, for all I know I suppose they could be using Power, or SPARC, but I doubt they are using ARM :)

Anyways, back to the 3PAR architecture and their micro-RAID design. I have written in the past about how you can have tens to hundreds of thousands of mini RAID arrays on a 3PAR system, depending on the amount of space that you have. This is, of course, to maximize distribution of data over all resources, for performance and predictability. When running RAID 5 or RAID 6 there are parity calculations involved. I can't help but wonder what sort of chance in hell a bunch of x86 CPU cores have of calculating RAID in real time for 100,000+ RAID arrays. With 3 and 4TB drives not too far out, you can take that 100,000+ and make it 500,000.

Taking the 3PAR F400 SPC-1 results as an example, here is my estimate on the number of RAID arrays on the system, fortunately it's mirrored so math is easier:

  • Usable capacity = 27,053 GB (27,702,272 MB)
  • Chunklet size = 256MB
  • Total Number of RAID-1 arrays = ~ 108,212
  • Total Number of data chunklets = ~216,424
  • Number of data chunklets per disk = ~563
  • Total data size per disk = ~144,128 MB (140.75 GB)
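Those estimates are easy to reproduce. I'm assuming 384 drives for the F400 SPC-1 configuration - that's from memory, so treat the disk count as an assumption:

```python
usable_mb = 27_702_272   # 27,053 GB usable, from the SPC-1 result
chunklet_mb = 256        # 3PAR chunklet size
disks = 384              # assumed F400 SPC-1 drive count (my recollection)

raid1_arrays = usable_mb // chunklet_mb       # one mirrored pair per usable chunklet
data_chunklets = raid1_arrays * 2             # each RAID-1 array holds two copies
chunklets_per_disk = data_chunklets // disks
data_per_disk_mb = chunklets_per_disk * chunklet_mb

print(raid1_arrays)        # -> 108212
print(data_chunklets)      # -> 216424
print(chunklets_per_disk)  # -> 563
print(data_per_disk_mb)    # -> 144128 (about 140.75 GB per disk)
```

The numbers line up with the list above, which gives an idea of just how many tiny RAID arrays the system is juggling at once.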

For legacy RAID designs it's probably not a big deal, but as disk drives grow ever bigger I have no doubt that everyone will have to move to a distributed RAID architecture, to reduce disk rebuild times and lower the chances of a double/triple disk failure wiping out your data. It is unfortunate (for them) that Hitachi could not pull that off in their latest system.
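To see why distributed RAID shortens rebuilds so dramatically, compare draining one surviving mirror at a single disk's sequential speed against a many-to-many rebuild spread across all remaining spindles. The throughput figure is illustrative, not measured:

```python
def rebuild_hours(disk_tb, mb_per_sec_per_disk, participating_disks):
    """Hours to reconstruct one failed drive's worth of data."""
    total_mb = disk_tb * 1_000_000
    return total_mb / (mb_per_sec_per_disk * participating_disks) / 3600

# Legacy RAID: one hot spare rebuilt from one surviving mirror at ~100 MB/s
print(round(rebuild_hours(3, 100, 1), 1))    # -> 8.3 hours for a 3 TB drive

# Distributed RAID: the lost chunklets re-mirrored across ~100 disks in parallel
print(round(rebuild_hours(3, 100, 100), 2))  # -> 0.08 hours (minutes)
```

Real rebuilds contend with foreground I/O so neither number is achievable in practice, but the ratio is the point: the window of double-failure exposure shrinks by roughly the number of participating spindles.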

3PAR does use Intel CPUs in their systems as well, though they aren't used too heavily; on the systems I have had, even at peak spindle load I never really saw CPU usage above 10%.

I think ASICs are here to stay for some time, on the low end you will be able to get by with generic CPU stuff, but on the higher end it will be worth the investment to do it in silicon.

Another place to look for generic CPUs vs ASICs is in the networking space. Network devices are still heavily dominated by ASICs because generic CPUs just can't keep up. Now of course generic CPUs are used for what I guess could be called "control data", but the heavy lifting is done by silicon. ASICs often draw a fraction of the power that generic CPUs do.

Yet another place to look for generic CPUs vs ASICs is the HPC space, where the rise of GPU-assisted computing has allowed systems to scale to what were (to me anyway) unimaginable heights.

Generic CPUs are of course great to have and they have come a long way, but there is a lot of cruft in them, so it all depends on what you're trying to accomplish.

The fastest NAS in the world is still BlueArc, which is powered by FPGAs, though their early cost structures put them out of reach for most folks. Their new mid-range looks nice; my only long-time complaint about them has been their back-end storage - either LSI or HDS, take it or leave it. So I leave it.

The only SPEC SFS results posted by BlueArc are for the mid range, nothing for their high end (which they tested on the last version of SFS, nothing yet for the current version).

 

27Sep/10Off

Bye Bye 3PAR, Hello HP!

TechOps Guy: Nate

Wow, that was fast! HP completed its purchase of 3PAR this morning.

HP today announced that it has completed the acquisition of 3PAR Inc., a leading global provider of utility storage, for a price of $33 per share in cash, or an enterprise value of $2.35 billion.

3PAR technologies expand HP’s storage portfolio into enterprise-class public and private cloud computing environments, which are key growth markets for HP. Complementary with HP’s current storage portfolio, 3PAR brings market-differentiating technology to HP that will enable clients to maximize storage utilization, balance workloads and automate storage tiering. This allows clients to improve productivity and more efficiently operate their storage networks.

With a worldwide sales and channel network, coupled with extensive service operations, HP is uniquely positioned to rapidly expand 3PAR’s market opportunity. As part of the HP Converged Infrastructure portfolio, which integrates servers, storage, networking and management technologies, 3PAR solutions will further strengthen HP’s ability to simplify data center environments for clients.

Further details on product integration will be announced at a later date.

Certainly not messing around!

26Sep/10Off

Still waiting for Xiotech..

TechOps Guy: Nate

So I was browsing the SPC-1 pages again to see if there was anything new and lo and behold, Xiotech posted some new numbers.

But once again, they appear too timid to release numbers for their 7000 series, or the 9000 series that came out somewhat recently. Instead they prefer to extrapolate performance from their individual boxes and aggregate the results. That doesn't count, of course; performance can be radically different at higher scale.

Why do I mention this? Well, nearly a year ago their CEO blogged in response to one of my posts - that was one of the first times I made news in The Register (yay! I really was excited) - and in part the CEO said:

Responding to the Techopsguy blog view that 3PAR's T800 outperforms an Emprise 7000, the Xiotech writer claims that Xiotech has tested "a large Emprise 7000 configuration" on what seems to be the SPC-1 benchmark; "Those results are not published yet, but we can say with certainty that the results are superior to the array mentioned in the blog (3PAR T800) in several terms: $/IOP, IOPS/disk and IOPS/controller node, amongst others."

So here we are almost a year later, and more than one SPC-1 result later, and still no sign of Xiotech's SPC-1 numbers for their higher end units. I'm sorry but I can't help but feel they are hiding something.

If I were them I would put my customers more at ease by publishing said numbers, and be prepared to justify the results if they don't match up to Xiotech's extrapolated numbers from the 5000 series.

Maybe they are worried they might end up like Pillar, whose CEO was pretty happy with their SPC-1 results. Shortly afterwards the 3PAR F400 launched and absolutely destroyed the Pillar numbers from every angle. You can see more info on those results here.

At the end of the day I don't care of course, it just was a thought in my head and gave me something to write about :)

I just noticed that these past two posts give me the most posts I have done in a month since this TechOpsGuys thing started. I'm glad I have my friends Dave, Jake and Tycen generating tons of content too; after all, this site was their idea!

7Sep/10Off

vSphere VAAI only in the Enterprise

TechOps Guy: Nate

Beam me up!

Damn those folks at VMware..

Anyways, I was browsing around this afternoon looking at things and, while I suppose I shouldn't have been, I was surprised to see that the new storage VAAI APIs are only available to people running Enterprise or Enterprise Plus licensing.

I think at least the block level hardware based locking for VMFS should be available to all versions of vSphere, after all VMware is offloading the work to a 3rd party product!

VAAI certainly looks like it offers some really useful capabilities. From the documentation on the 3PAR VAAI plugin (which is free), here are the highlights:

  • Hardware Assisted Locking is a new VMware vSphere storage feature designed to significantly reduce impediments to VM reliability and performance by locking storage at the block level instead of the logical unit number (LUN) level, which dramatically reduces SCSI reservation contentions. This new capability enables greater VM scalability without compromising performance or reliability. In addition, with the 3PAR Gen3 ASIC, metadata comparisons are executed in silicon, further improving performance in the largest, most demanding VMware vSphere and desktop virtualization environments.
  • The 3PAR Plug-In for VAAI works with the new VMware vSphere Block Zero feature to offload large, block-level write operations of zeros from virtual servers to the InServ array, boosting efficiency during several common VMware vSphere operations— including provisioning VMs from Templates and allocating new file blocks for thin provisioned virtual disks. Adding further efficiency benefits, the 3PAR Gen3 ASIC with built-in zero-detection capability prevents the bulk zero writes from ever being written to disk, so no actual space is allocated. As a result, with the 3PAR Plug-In for VAAI and the 3PAR Gen3 ASIC, these repetitive write operations now have “zero cost” to valuable server, storage, and network resources—enabling organizations to increase both VM density and performance.
  • The 3PAR Plug-In for VAAI adds support for the new VMware vSphere Full Copy feature to dramatically improve the agility of enterprise and cloud datacenters by enabling rapid VM deployment, expedited cloning, and faster Storage vMotion operations. These administrative tasks are now performed in half the time. The 3PAR plug-in not only leverages the built-in performance and efficiency advantages of the InServ platform, but also frees up critical physical server and network resources. With the use of 3PAR Thin Persistence and the 3PAR Gen3 ASIC to remove duplicated zeroed data, data copies become more efficient as well.
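The zero-detection piece is easy to illustrate: on a thin-provisioned volume, a block that is all zeros needs no physical allocation at all. 3PAR does this comparison in the Gen3 ASIC; here is a software sketch of the concept only, not their implementation:

```python
BLOCK_SIZE = 4096
ZERO_BLOCK = bytes(BLOCK_SIZE)  # 4 KB of zeros

def write_block(store, lba, block):
    """Thin-provisioned write with zero-detection: all-zero
    blocks consume no backing space at all."""
    if block == ZERO_BLOCK:
        store.pop(lba, None)  # unmapped LBAs read back as zeros anyway
        return 0              # zero bytes physically allocated
    store[lba] = block
    return len(block)

store = {}
written = write_block(store, 0, ZERO_BLOCK)        # a Block Zero bulk write
written += write_block(store, 1, b"\x01" * BLOCK_SIZE)  # real data
print(written, len(store))  # -> 4096 1 (only the real block was stored)
```

That's why the plugin's bulk zero writes have "zero cost" - the array recognizes them and simply never allocates the space.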

Cool stuff. I'll tell you what. I really never had all that much interest in storage until I started using 3PAR about 3 and a half years ago. I mean I've spread my skills pretty broadly over the past decade, and I only have so much time to do stuff.

About five years ago some co-workers tried to get me excited about NetApp, though for some reason I never could get too excited about their stuff. Sure, it has tons of features, which is nice, but the core architectural limitations of the platform (from a spinning-rust perspective at least) are, I guess, what kept me away from them for the most part. If you really like NetApp, put a V-series in front of a 3PAR and watch it scream. I know of a few 3PAR/NetApp users who outright refuse to entertain the option of running NetApp storage behind the V-series - they like the NAS and keep the V-series, but a NetApp back end doesn't perform.

On the topic of VMFS locking - I keep seeing folks pimping the NFS route attacking VMFS locking as if there were no locking in NFS with vSphere. I'm sure that prior to block-level locking, the NFS file-level locking (assuming it is file level) was more efficient than LUN-level. Though to be honest, I've never encountered issues with SCSI reservations in the few years I've been using VMFS - probably because of how I use it. I don't do a lot of activities that trigger reservations, short of writing data.

Another graphic which I thought was kind of funny is the current Gartner "magic quadrant" - someone posted a link to it for VMware in a somewhat recent post. Myself, I don't rely on Gartner, but I did find the lopsidedness of the situation for VMware quite amusing -

I've been using VMware since before 1.0; I still have my VMware 1.0.2 CD for Linux. I deployed VMware GSX to production for an e-commerce site in 2004, so I've been using it for a while, though I didn't start using ESX until 3.0 came out (from what I've read about the capabilities of previous versions, I'm kinda glad I skipped them :) ). It's got to be the most solid piece of software I've ever used, besides Oracle I suppose. I mean I really, honestly cannot remember it ever crashing. I'm sure it has, but it's been so rare that I have no memory of it. It's not flawless by any means, but it's solid. And VMware has done a lot to build up my loyalty to them over the past - what is it now - eleven years? Like most everyone else at the time, I had no idea back then that we'd be doing the stuff with virtualization that we are today.

I've kept my eyes on other hypervisors as they come around, though even now none of the rest look very compelling. About two and a half years ago my new boss at the time wanted to cut costs and tried to pressure me into using the "free" Xen that came with CentOS at the time. He figured a hypervisor is a hypervisor. Well, it's not. I refused. Eventually I left the company, and my two esteemed colleagues were forced into trying it after I left (hey Dave and Tycen!); they worked on it for a month before giving up and going back to VMware. What a waste of time..

I remember Tycen at about the same time being pretty excited about Hyper-V. Well, at a position he recently held he got to see Hyper-V in all its glory, and he was happy to get out of that position and stop having to use Hyper-V.

I do think KVM has a chance, though it's too early to use it for anything too serious at this point - I'm sure that's not stopping tons of people from doing it anyway, just like it didn't stop me from running production on GSX way back when. But I suspect that by the time vSphere 5.0 comes out - which I'm just guessing will be in the 2012 time frame - KVM as a hypervisor will be solid enough to use in a serious capacity. VMware will of course have a massive edge in management tools and fancy add-ons, but not everyone needs all that stuff (me included). I'm perfectly happy with just vSphere and vCenter (I'd be even happier if there were a Linux version, of course).

I can't help but laugh at the grand claims Red Hat is making for KVM scalability, though. Sorry, I just don't buy that the Linux kernel itself can reach such heights and be solid & scalable, let alone a hypervisor running on top of Linux (and before anyone asks, NO, ESX does NOT run on Linux).

I love Linux, I use it every day on my servers and my desktops and laptops, have been for more than a decade. Despite all the defectors to the Mac platform I still use Linux :) (I actually honestly tried a MacBook Pro for a couple weeks recently and just couldn't get it to a usable state).

Just because the system boots with X number of CPUs and X amount of memory doesn't mean it's going to be able to scale effectively to use them. I'm sure Linux will get there some day, but I believe it is a ways off.