TechOpsGuys.com Diggin' technology every day

November 15, 2011

LSI quietly smothers Onstor

Filed under: Storage — Nate @ 8:35 pm

About a year ago I was in the market for a new NAS box to hook up to my 3PAR T400 – something to eventually replace the Exanet cluster that was attached to it, since Exanet as a company had gone bust.

There weren't many options left. Onstor had been bought by LSI; I really couldn't find anything on the Ibrix offering from HP at the time (at least for a SAN-attached Ibrix rather than a scale-out Ibrix); and then there was of course the NetApp V-Series. I could not find any trace of Polyserve, which HP had acquired as well, other than something related to SQL Server.

Old 3PAR graphic that I dug up from the Internet Archive

3PAR was suggesting I go with Onstor (this was, of course, before the HP/Dell bidding war), claiming they still had a good relationship with Onstor through LSI. I think it was less about the partnership and more about NetApp using the V-Series to get their foot in the door and then try to replace the back end disk with their own – a situation that 3PAR (or any other competitor) understandably doesn't like to be in.

My VAR, on the other hand, had another story to tell. After trying to reach out to LSI/Onstor they determined that Onstor was basically on its death bed, with only a single reseller in the country authorized to sell the boxes, and it seemed like there were maybe a half dozen employees left working on the product.

So, I went with NetApp, then promptly left the company and left things in the hands of the rest of the team (there's been greater than 100% turnover since I left, both on the team and in management).

One of my other friends who used to work for Exanet suggested to me that LSI bought Onstor with the possible intention of integrating the NAS technology into their existing storage systems, to be able to offer a converged storage option to customers, and that the standalone gateway would probably be going away.

Another product I had my eye on at the time – and one 3PAR was working hard to integrate with – was the Symantec Filestore product. I was looking forward to using it, and other companies were looking to Filestore to replace their Exanet clusters as well. Though I got word through unofficial channels that Symantec planned to kill the software-only version and go the appliance route. It took longer than I was expecting but they finally did it: I was on their site recently and noticed that the only way to get it now is with integrated storage from their Chinese partner.

I kept tabs on Onstor now and then, wondering if it would come back to life in some form. The state of the product, at least from a SAN connectivity perspective, seemed to be very poor – in a lot of cases you couldn't do things like live software upgrades on a cluster; the system had to take a full outage. But no obvious updates ever came.

Then LSI sold their high end storage division to NetApp. I suppose that was probably the end of the road for Onstor.

So tonight, I was hitting some random sites and decided to check in on Onstor again, only to find most traces of the product erased from LSI’s site.

The only things I ever really heard about Onstor were how the likes of BlueArc and Exanet were replacing Onstor clusters. I talked to one service provider who had an Onstor system (I think connected to 3PAR too) – I spoke with them briefly while I was thinking about what NAS gateway to move to about a year ago, and they seemed fairly content with it, no major complaints. Though it seemed like if they were to buy something new (at that time) they probably wouldn't buy Onstor due to the uncertainty around it.

It seemed to be an interesting design – using dual-processor, quad-core MIPS CPUs of all things.

RIP Onstor, another one bites the dust.

Hopefully LSI doesn't do the same to 3ware. I always wondered why 3ware technology was never integrated (as far as I know anyways) into server motherboards, even after LSI acquired them, given that a lot of integrated RAID controllers (Dell etc) are LSI. I think for the most part the 3ware technology is better (if not, why did they get acquired and continue to develop products?). I've been a 3ware user for what seems like 12 years now, and really have no complaints.

I really hope the HP X9000 NAS gateway works out, though the entry-level pricing for it as-is seems quite high to me.

November 14, 2011

NetApp challenge falls without winners

Filed under: Storage — Tags: — Nate @ 9:52 am

This is just too funny.

Nobody won a million pounds of kit from NetApp because data centre nerds thought the offer was unbelievable or couldn’t be bothered with the paperwork.

NetApp UK offered an award of up to £1m in NetApp hardware, software, and services to a lucky customer that managed to reduce its storage usage by 50 per cent through using NetApp gear. The competition’s rules are still available on NetApp’s website.

For years I have been bombarded with marketing from NetApp (indirectly, from the VARs and consultants they have bribed over the years) about the efficiency of the NetApp platform – especially de-dupe, and how it will save you tons of money on storage.

Indirectly because I absolutely refused to talk directly to the NetApp team in Seattle after how badly they treated me when I was interested in being a customer a few years ago. It seemed just as bad as EMC was (at the time). I want to be treated as a customer, not a number.

Ironically enough it was NetApp that drove me into the arms of 3PAR, long before I really understood what the 3PAR technology was all about. It was NetApp's refusal to lend me an evaluation system for any length of time that sealed the deal for my first 3PAR purchase. Naturally, at the time 3PAR was falling head over heels at the opportunity to give us an eval, and so we did evaluate their product, an E200 at the time. Which in the end directly led to 4 more array purchases (with a 5th coming soon I believe), and who knows how many more indirectly as a result of my advocacy either here or as a customer reference.

Fortunately my boss at the time was kind of like me. When the end of the road came for NetApp – and when they dropped their pants (prices) so low that most people could not have ignored it – my boss was at the point where NetApp could have given us the stuff for free and he would have still bought 3PAR. We asked for an eval for weeks and they refused us every time until the last minute.

I have no doubt that de-dupe is effective; how effective depends on a large number of factors. Suffice to say I don't buy it yet myself, at least not as a primary reason to purchase primary storage for online applications.

Anyways, I think this contest, more than anything else, is a perfect example. You would think that the NetApp folks out there would have jumped on this, but there are no winners. That is too bad, but very amusing.

I can’t stop giggling.

I guess I should thank NetApp for pointing me in the right direction in the beginning of my serious foray into the storage realm.

So, thanks NetApp! 🙂

Four posts in one morning! And it's not even 9AM yet! You'd think I had been up since 5AM writing, and you'd be right.

November 8, 2011

EMC and their quad core processors

Filed under: Storage — Tags: , , , — Nate @ 8:48 am

I first heard that Fujitsu had storage maybe one and a half years ago, when someone told me that Fujitsu was one company seriously interested in buying Exanet at the time, which caused me to go look at their storage – I had no idea they had storage systems. Even today I really never see anyone mention them anywhere; my 3PAR reps say they never encounter Fujitsu in the field (at least in these territories – they suspect that over in Europe they go head to head more often).

Anyways, EMC folks seem to be trying to attack the high end Fujitsu system, saying it's not "enterprise". In the end, the main leg EMC has to stand on for what in their eyes is "enterprise" is mainframe connectivity, and Fujitsu rightly tries to debunk that myth, since there are a lot of organizations that consider themselves "enterprise" that don't have any mainframes. It's just stupid, but EMC doesn't really have any other excuses.

What prompted me to write this, more than anything else, was this:

One can scale from one to eight engines (or even beyond in a short timeframe), from 16 to 128 four-core CPUs, from two to 16 backend- and front-end directors, all with up to 16 ports.

The four-core CPUs are what gets me. What a waste! I have no doubt that in EMC's "short timeframe" they will be migrating to quad-socket, 10-core CPUs, right? After all, unlike someone like 3PAR, who can benefit from a purpose-built ASIC to accelerate their storage, EMC has to rely entirely on software. After seeing SPC-1 results for HDS's VSP, I suspect the numbers for VMAX wouldn't be much more impressive.

My main point is – and this just drives me mad – these big manufacturers beating the Intel CPU drum and then not exploiting the platform to its fullest extent. Quad core CPUs came out in 2007. When EMC released the VMAX in 2009, apparently Intel's latest and greatest was still quad core. But here we are, practically 2012, and they're still not onto at LEAST hex core yet? This is Intel architecture, it's not that complicated. I'm not sure which quad core CPUs specifically are in the VMAX, but the upgrade from Xeon 5500 to Xeon 5600 for the most part was:

  1. Flash BIOS (if needed to support the new CPU)
  2. Turn box off
  3. Pull out old CPU(s)
  4. Put in new CPU(s)
  5. Turn box on
  6. Get back to work

That’s the point of using general purpose CPUs!! You don’t need to pour 3 years of R&D into something to upgrade the processor.

What I'd like to see – something I mentioned in a comment recently – is a quad socket design for these storage systems. Modern CPUs have had integrated memory controllers for a long time now (well, only since the Xeon 5500 on the Intel side), so as you add more processors you add more memory too. (Side note: the documentation for VMAX seems to imply a quad socket design for a VMAX engine, but I suspect it is two dual socket systems, since the Intel CPUs EMC is likely using are not quad-socket capable.) This page claims the VMAX uses the ancient Intel 5400 processors, which, if I remember right, were the first generation quad cores I had in my HP DL380 G5s many eons ago. If true, it's even more obsolete than I thought!

Why not 8 sockets, or more? Well, cost mainly. The R&D involved in an 8-socket design I believe is quite a bit higher, and the amount of physical space required is high as well. With quad socket blades commonplace, and even some vendors offering quad socket 1U systems, the price point and physical size of quad socket designs are well within reach of storage systems.

So the point is, on these high end storage systems you start out with a single socket populated on a quad socket board with associated memory. Want to go faster? Add another CPU and associated memory. Go faster still? Add two more CPUs and associated memory (though I think it's technically possible to run 3 CPUs – there have been 3-CPU systems in the past – it seems common/standard to add them in pairs). You're spending probably at LEAST a quarter million for this system initially, probably more than that; the incremental cost of R&D to go quad socket, given this is Intel after all, is minimal.

Currently VMAX goes to 8 engines, and they say they will expand that to more. 3PAR took the opposite approach, saying that while their system is not as clustered as a VMAX is (not their words), they feel such a tightly integrated system (theirs included) becomes more vulnerable to "something bad happening" that impacts the system as a whole – more controllers is more complexity. Which makes some sense. EMC's design is even more vulnerable, being that it's so tightly integrated with the shared memory and stuff.

3PAR V-Class Cluster Architecture with low cost high speed passive backplane with point to point connections totalling 96 Gigabytes/second of throughput

3PAR goes even further in their design to isolate things – like completely separating the control cache, which is used for the operating system that powers the controllers and for the control data on top of it, from the data cache, which as you can see in the diagram below is only connected to the ASICs, not to the Intel CPUs. On top of that they separate the control data flow from the regular data flow as well.

This is the very same reason I have never been a fan of "stacking" or "virtual chassis" on switches: I'd rather have independent components that are not tightly integrated, in the event "something bad" takes down the entire "stack". Now if you're running two independent stacks, so that one full stack can fail without an issue, that works around the problem, but most people don't seem to do that. The chances of such a failure happening are low, but they are higher than something causing all of the switches to fail if the switches were not stacked.

One exception might be some problems related to STP, which some people may feel they need when operating multiple switches. I'll answer that by saying I haven't used STP in more than 8 years – there have been ways to build a network with lots of devices without using STP for a very long time now. The networking industry has recently made it sound like this is something new.

Same with storage.

So back to 3PAR. 3PAR changed their approach with the V-series of arrays: for the first time in the company's history they decided to include TWO ASICs in each controller, effectively doubling the I/O processing abilities of the controller. Fewer, more powerful controllers. A 4-node V400 will likely outperform an 8-node T800. Given the system's age, I suspect a 2-node V400 would probably be on par with an 8-node S800 (released around 2003 if I remember right).

3PAR V-Series ASIC/CPU/PCI/Memory Architecture

EMC is not alone though, and not the worst abuser here. I can cut them maybe a LITTLE slack given the VMAX was released in 2009. I can't cut any slack to NetApp though. They recently released some new SPEC SFS results which, among other things, disclosed that their high end 6240 storage system is using quad core Intel E5540 processors. So basically a dual proc, quad core system. And their lower end system is — wait for it — dual proc, dual core.

Oh, I can't describe how frustrated that makes me – these companies touting the use of general purpose CPUs and then going out of their way to cripple their systems. It would cost NetApp all of maybe, what, $1,200 to upgrade their low end box to quad cores? Maybe $2,500 for both controllers? But no, they'd rather you spend an extra, what, $50,000-$100,000 to get that functionality?

I have to knock NetApp more to some extent since these storage systems are significantly newer than the VMAX, but I knock them less because they don't champion the Intel CPUs as much as EMC does, at least from what I have seen.

3PAR is not a golden child either; their latest V800 storage system uses — wait for it — quad core processors as well. Which is just as disgraceful. I can cut 3PAR more slack because their ASIC is what provides the horsepower on their boxes, not the Intel processors, but still, that is no excuse for not using at LEAST 6 core processors. While I cannot determine precisely which Intel CPUs 3PAR is using, I know they are not ultra low power parts, since the clock speeds are 2.8GHz.

Storage companies aren't alone here; load balancing companies like F5 Networks and Citrix do the same thing. Citrix is better than F5 in that they offer software "upgrades" on their platform that unlock additional throughput. Even without the upgrade you have full use of all of the CPU cores on the box, which allows you to run the more expensive software features that would normally otherwise impact CPU performance. To do this on F5 you have to buy the next bigger box.

Back to Fujitsu storage for a moment: their high end box certainly seems like a very respectable system, with regards to paper numbers anyways. I found it very interesting that a comment on the original article mentioned Fujitsu can run the system's maximum capacity behind a single pair of controllers if the customer wanted to. Of course the controllers couldn't drive all the I/O, but it is nice to see the capacity not so tightly tied to the controllers like it is on the VMAX or even on the 3PAR platform. Especially when it comes to SATA drives, which aren't known for high amounts of I/O – higher end storage systems such as the recently mentioned HDS, 3PAR and even VMAX tap out in "maximum capacity" long before they tap out in I/O if you're loading the system with tons of SATA disks. It looks like Fujitsu can get up to 4.2PB of space, leaving, again, HDS, 3PAR and EMC in the dust. (Capacity utilization is another story of course.)

With Fujitsu's ability to scale the DX8700 to 8 controllers, 128 fibre channel interfaces, 2,700 drives and 512GB of cache, that is quite a force to be reckoned with. No sub-disk distributed RAID, no ASIC acceleration, but I can certainly see how someone would be willing to put the DX8700 up against a VMAX.

EMC was way late to the 2+ controller hybrid modular/purpose built game and is still playing catch up. As I said to Dell last year, put your money where your mouth is and publish SPC-1 results for your VMAX, EMC.

With EMC so in love with Intel, I have to wonder how hard they had to fight off Intel from encouraging them to use the Itanium processor in their arrays instead of Xeons. Or has Intel given up completely on Itanium now (which, again, we have to thank AMD for – without AMD's x86-64 extensions the Xeon processor line would have been dead and buried many years ago).

For insight to what a 128-CPU core Intel-based storage system may perform in SPC-1, you can look to this system from China.

(I added a couple diagrams, I don’t have enough graphics on this site)

November 4, 2011

Hitachi VSP SPC-1 Results posted

Filed under: Storage — Tags: , , , , — Nate @ 7:41 pm

[(More) Minor updates since original post] I don't think I'll ever work for a company that is in a position to need to leverage something like a VSP, but that doesn't stop me from being interested in how it performs. After all, it has been almost four years since Hitachi released results for their USP-V, which had, at the time, very impressive numbers (record breaking even, only to be dethroned by the 3PAR T800 in 2008) from a performance perspective (not so much from a cost perspective, not surprisingly).

So naturally, ever since Hitachi released their VSP (which is OEM'd by HP as their P9500) about a year ago, I have been very curious as to how well it performs; it certainly sounded like a very impressive system on paper with regards to performance. After all, it can scale to 2,000 disks and a full terabyte of cache. I read an interesting white paper (I guess you could call it that) on the VSP recently out of curiosity.

One of the things that sort of stood out is the claim of being purpose built – they're still using a lot of custom components and a monolithic architecture in the system, vs most of the rest of the industry, which is trying to be more of a hybrid of modular commodity and purpose built (EMC VMAX is a good example of this trend). The white paper, which was released about a year ago, even notes:

There are many storage subsystems on the market but only few can be considered as real hi-end or tier-1. EMC announced its VMAX clustered subsystem based on clustered servers in April 2009, but it is often still offering its predecessor, the DMX, as the first choice. It is rare in the industry that a company does not start ramping the new product approximately half a year after launching. Why EMC is not pushing its newer product more than a year after its announcement remains a mystery. The VMAX scales up by adding disks (if there are empty slots in a module) or adding modules, the latter of which are significantly more expensive. EMC does not participate in SPC benchmarks.

The other interesting aspect of the VSP is its dual chassis design (each chassis in a separate rack), which the white paper says is not a cluster, unlike a VMAX or a 3PAR system (3PAR isn't even a cluster the way VMAX is a cluster – intentionally, with the belief that the more isolated design would lead to higher availability). I would assume this is in response to earlier criticisms of the USP-V design, in which the virtualization layer was not redundant (when I learned that I was honestly pretty shocked) – Hitachi later rectified the situation on the USP-V by adding some magic sauce that allowed you to link a pair of them together.

Anyways, with all this fancy stuff, obviously I was pretty interested when I noticed they had released SPC-1 numbers for their VSP recently. Surely, customers don't go out and buy a VSP because of the raw performance, but because they may need to leverage the virtualization layer to abstract other arrays, or perhaps the mainframe connectivity, or maybe they get a comfortable feeling knowing that Hitachi has a guarantee on the array where they will compensate you in the event there is data loss that is found to be the fault of the array (I believe the guarantee only covers data loss and not downtime, but am not completely sure). After all, it's the only storage system on the planet that has such a stamp of approval from the manufacturer (earlier high end HDS systems had the same guarantee).

Whatever the reason, performance is still a factor given the amount of cache and the sheer number of drives the system supports.

One thing that is probably never a factor is ease of use – the USP/VSP are complex beasts to manage, something you very likely need significant amounts of training for. One story I remember being told is that a local HDS rep in the Seattle area mentioned to a 3PAR customer, "after a few short weeks of training you'll feel as comfortable on our platform as you do on 3PAR", and the customer replied, "you just made my point for me". Something like that anyways.

Would the VSP dethrone the HP P10000? I found the timing of the release of the numbers an interesting coincidence – after all, the VSP is over a year old at this point, why release now?

So I opened the PDF, and hesitated… what would the results be? I do love 3PAR stuff and I still view them as somewhat of an underdog even though they are under the HP umbrella.

Wow, was I surprised at the results.

Not. Even. Close.

The VSP's performance was not much better than that of the USP-V, which was released more than four years ago. The performance itself is not bad, but it really puts the 3PAR V800 performance in some perspective:

  • VSP – ~270,000 IOPS @ $8.18/IOP
  • V800 – ~450,000 IOPS @ $6.59/IOP

But it doesn't stop there:

  • VSP – ~$45,600 per usable TB (39% unused storage ratio)
  • V800 – ~$13,200 per usable TB (14% unused storage ratio)

Hitachi managed to squeeze in just below the limit for unused storage – which is not allowed to be above 45% – sounds kind of familiar. The VSP had only 17TB more usable capacity than the original USP-V as tested. This really shows how revolutionary sub-disk distributed RAID is for getting higher capacity utilization out of your system, and why I was quite disappointed when the VSP came out without anything resembling it.
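
For what it's worth, here's a quick back-of-the-envelope sketch in Python showing how those numbers relate to each other – the absolute prices and capacities below are implied by the published ratios, not figures I pulled out of the full disclosure reports, so treat them as rough:

  # Implied totals from the SPC-1 ratios quoted above (rough estimates only)
  results = {
      "VSP":  {"iops": 270_000, "cost_per_iop": 8.18, "cost_per_usable_tb": 45_600},
      "V800": {"iops": 450_000, "cost_per_iop": 6.59, "cost_per_usable_tb": 13_200},
  }

  for name, r in results.items():
      total_price = r["iops"] * r["cost_per_iop"]          # implied tested price
      usable_tb = total_price / r["cost_per_usable_tb"]    # implied usable capacity
      print(f"{name}: ~${total_price / 1e6:.2f}M tested, ~{usable_tb:.0f}TB usable")

  # VSP:  ~$2.21M tested, ~48TB usable
  # V800: ~$2.97M tested, ~225TB usable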

Hitachi managed to improve their cost per IOP on the VSP vs the USP-V, but their cost per usable TB has skyrocketed, about triple the price of the original USP-V results. I wonder why this is? (I miscounted the digits in my original calculations – in fact the VSP is significantly cheaper than the USP-V was when it was tested!) One big change in the VSP is the move to 2.5″ disk drives. The decision to use 2.5″ drives did kind of hamper the results I believe, since this is not an SPC-1/E Energy Efficiency test, so power usage was never touted. But the largest 2.5″ 15k RPM drive available for this platform is 146GB (which is what was used).

One of the customers (I presume) at the HP Storage presentation I was at recently was kind of dogging 3PAR, saying the VSP needs less power per rack than the T or V class (which requires 4x 208V 30A per fully loaded rack).

I presume it needs less power because of the 2.5″ drives; also, overall the drive chassis on 3PAR boxes do draw quite a bit of power by themselves (200W empty on T/V, 150W empty on F – given the T/V hold 250% more drives per chassis it's a pretty nice upgrade).

Though to me, especially on a system like this, power usage is the last thing I would care about. The savings from disk space would pay for the next 10 years of power on the V800 vs the VSP.

My last big 3PAR array cut the number of spindles in half and the power usage in half vs the array it replaced, while giving us roughly 50% better performance and the same amount of usable storage. Knowing that, and the efficiency in general, I'm much less concerned as to how much power a rack requires.

So Hitachi can get 384 x 2.5″ 15k RPM disks in a single rack and draw less power than 3PAR does with 320 disks in a single rack.

You could also think of it this way: Hitachi can get ~56TB raw of 15k RPM space in a single rack, and 3PAR can get 192TB raw of 15k RPM space in a single rack – roughly 3.4 times the raw space for double the power, and nearly five times the usable space (4.72x precisely – V800 @ 80% utilization, VSP @ 58%) due to the architecture. For me that's a good trade off, a no brainer really.
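
If you want to check my math, here's the quick Python version – the disk counts, drive sizes and utilization percentages are just the ones quoted above, nothing more:

  # Usable 15k RPM capacity per rack, VSP vs V800 (numbers from this post)
  vsp_raw_tb,  vsp_util  = 384 * 0.146, 0.58   # 384 x 146GB 15k disks @ 58% utilization
  v800_raw_tb, v800_util = 320 * 0.600, 0.80   # 320 x 600GB 15k disks @ 80% utilization

  vsp_usable  = vsp_raw_tb * vsp_util          # ~32.5TB usable per rack
  v800_usable = v800_raw_tb * v800_util        # ~153.6TB usable per rack

  print(round(v800_raw_tb / vsp_raw_tb, 2))    # ~3.4x the raw space
  print(round(v800_usable / vsp_usable, 2))    # ~4.72x the usable space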

The things I am not clear on with regards to these results: is this the best the VSP can do? This does appear to be a dual chassis system. The VSP supports 2,000 spindles; the system tested only had 1,152, which is not much more than the 1,024 tested on the original USP-V. Also, the VSP supports up to 1TB of cache, however the system tested only had 512GB (the 3PAR V800 had 512GB of data cache too).

Maybe it is one of those situations where this is the optimal bang for the buck on the platform – perhaps the controllers just don't have enough horsepower to push 2,000 spindles at full tilt – I don't know.

Hitachi may have purpose built hardware, but it doesn't seem to do them a whole lot of good when it comes to raw performance. I don't know about you, but I'd honestly feel let down if I were an HDS customer. Where is the incentive to upgrade from USP-V to VSP? The cost for usable storage is far higher, and the performance is not much better. Cost per IOP is less, but I suspect the USP-V at this stage of the game, with more current pricing for disks and such, would be even more competitive than the VSP. (Correction from above, due to my misreading of the digits – $135k/TB instead of $13.5k/TB – the VSP is in fact cheaper, making it a good upgrade from the USP-V from that perspective! Sorry for the error 🙂)

Maybe it’s in the software – I’ve of course never used a USP/VSP system but perhaps the VSP has newer software that is somehow not backwards compatible with the USP, though I would not expect this to be the case.

Complexity is still high: the configuration of the array stretches from page 67 to page 100 of the full disclosure documentation. 3PAR's configuration, by contrast, is roughly a single page.

Why do you think someone would want to upgrade from a USP-V to a VSP?

I still look forward to the day when the likes of EMC extend their VMAX architecture across their entire product line, and HDS also unifies their architecture. I don't know if it'll ever happen, but it should be possible at least with VMAX given their sound thumping of the Intel CPU drum set.

The HDS AMS 2000 series of products was by no means compelling either, when you could get higher availability (by means of persistent cache with 4 controllers), three times the amount of usable storage, and about the same performance on a 3PAR F400 for about the same amount of money.

Come on EMC, show us some VMAX SPC-1 love – you are, after all, a member of SPC now (though it's kind of odd your logo doesn't show up, just the text of the company name).

One thing that has me wondering about this VSP configuration: with so little disk space available on the system, why would anyone bother with spinning rust on this platform for performance instead of just going SSD? Well, one reason may be that a 400GB SSD would run $35-40,000 (after a 50% discount, assuming that is list price), ouch. 220 of those (the system maximum is 256) would net you roughly 88TB raw (maybe 40TB usable), but cost $7.7M for the SSDs alone.

On the V800 side, a 4-pack of 200GB SSDs will cost you roughly $25,000 after discount (assuming that is list price). Fully loading a V800 with the maximum of 384 SSDs (77TB raw) would cost roughly $2.4M for the SSDs alone (and still consume 7 racks). I think I like the 3PAR SSD strategy more — not that I would ever do such a configuration!

Usable space would be higher, of course, if the SSD configurations used RAID 5 instead of RAID 1; I just used RAID 1 for simplicity.
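
Here's the rough Python arithmetic behind those SSD numbers – keep in mind the per-SSD prices are my after-discount guesses from above, not official pricing from either vendor, and RAID 1 is assumed for usable space:

  # SSD-only cost comparison using the guesstimated street prices above
  configs = {
      "VSP":  {"count": 220, "gb": 400, "price_each": 35_000},      # $35-40k per 400GB SSD (guess)
      "V800": {"count": 384, "gb": 200, "price_each": 25_000 / 4},  # ~$25k per 4-pack of 200GB SSDs (guess)
  }

  for name, c in configs.items():
      raw_tb = c["count"] * c["gb"] / 1000
      usable_tb = raw_tb / 2                    # RAID 1 mirroring
      ssd_cost = c["count"] * c["price_each"]
      print(f"{name}: ~{raw_tb:.0f}TB raw, ~{usable_tb:.0f}TB usable, ~${ssd_cost / 1e6:.1f}M in SSDs")

  # VSP:  ~88TB raw, ~44TB usable, ~$7.7M in SSDs
  # V800: ~77TB raw, ~38TB usable, ~$2.4M in SSDs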

Goes to show that these big boxes have a long way to go before they can truly leverage the raw performance of SSD. With 200-300 SSDs in either of these boxes the controllers would be maxed out and the SSDs would probably, for the most part, be idle. I've been saying for a while how stupid it seems for such a big storage system to be overwhelmed so completely by SSDs, and that the storage systems need to be an order of magnitude (or more) faster. 3PAR made a good start with their doubling of performance, but I'm thinking the boxes need to do at least 1M IOPS to disk, if not 2M, for systems such as these — how long will it take to get there?

Maybe HDS could show better results with 900GB 10k RPM disks instead of the 146GB 15k RPM disks (and put in something like 1,700 disks instead of 1,152); that should show much lower per-usable-TB costs, though their cost per IOP would probably shoot up. From what I can see though, 600GB is the largest 2.5″ 10k RPM disk supported. 900GB @ 58% utilization would yield about 522GB, and 600GB on 3PAR @ 80% utilization would yield about 480GB.
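
The per-disk yield math is trivial, but since the utilization ratios are what make or break these configurations, here it is spelled out in Python (the 58% and 80% figures are the tested utilization ratios from above):

  # Usable capacity per spindle at each vendor's tested utilization ratio
  def usable_gb(disk_gb, utilization):
      return disk_gb * utilization

  print(round(usable_gb(900, 0.58)))   # ~522GB from a 900GB 10k disk on the VSP
  print(round(usable_gb(600, 0.80)))   # ~480GB from a 600GB 10k disk on a 3PAR V800
  print(round(usable_gb(146, 0.58)))   # ~85GB from the 146GB 15k disks actually tested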

I suspect they could get their costs down significantly more if they went old school and supported larger 3.5″ 15k RPM disks and used those instead (the VSP supports 3.5″ disks, just nothing other than 2TB Nearline (7200 RPM) and 400GB Flash – a curious decision). If you were a customer, isn't that something you'd be interested in? Another compromise you make with 3.5″ disks on the VSP, though, is that you're then limited to 1,280 spindles, rather than 2,048 with 2.5″. This could be a side effect of a maximum addressable capacity of 2.5PB, which is reached with 1,280 2TB disks; they very well could probably support 2,048 3.5″ disks if they could fit within the 2.5PB limit.

Their documentation says the actual maximum usable with Nearline 2TB disks is 1.256PB (with RAID 10). With 58% capacity utilization that 1.25PB drops dramatically to 742TB. With RAID 5 (7+1) they say 2.19PB usable; take that 58% number again and you're at roughly 1.2PB.

The V800, by contrast, would top out at 640TB with RAID 10 @ 80% utilization (stupid raw capacity limits!! *shakes fist at sky*), or somewhere in the 880TB-1.04PB (@80%) range with RAID 5 (depending on the data/parity ratio).

Another strange decision on both HDS and 3PAR's part was to only certify the 2TB disk on their platforms, and not anything smaller, since both systems tap out at roughly half the number of supported spindles when using 2TB disks due to capacity limits.

An even more creative solution (to work around the limitation!!), which I doubt is possible, would be to somehow restrict the addressable capacity of each disk to half its size, so you could in effect get 1,600 2TB disks in each system with 1TB of capacity per disk; then, when they release software updates to scale the box to even higher levels of capacity (32GB of control cache can EASILY handle more capacity), just unlock the drives and get that extra capacity for free. That would be nice at least; it probably won't happen though. I bet it's technically possible on the 3PAR platform due to their chunklets. 3PAR in general doesn't seem as interested in creative solutions as I am 🙂 (I'm sure HDS is even more rigid.)

Bottom Line

So HDS has managed to get their cost for usable space down quite a bit, from $135,000/TB to around $45,000/TB, and improve performance by about 25%.

They still have a long way to go in the areas of efficiency, performance, scalability, simplicity and unified architecture. I'm sure there will be plenty of customers out there that don't care about those things (or are just not completely informed) and will continue to buy the VSP for other reasons – it's still a solid platform if you're willing to make those trade offs.

October 22, 2011

IBM posts XIV SPC-2 results

Filed under: Storage — Tags: , , — Nate @ 8:35 pm

[UPDATED – as usual I re-read my posts probably 30 times after I post them and refine them a bit if needed; this one got quite a few changes. I don't run a newspaper here, so I don't aim to have a completely thought out article when I hit post for the first time.]

IBM finally came out and posted some SPC-2 results for their XIV platform, which is better than nothing but unfortunately they did not post SPC-1 results.

SPC-2 is a sequential throughput test, more geared towards things like streaming media and data warehousing rather than random I/O, which represents a more typical workload.

The numbers are certainly very impressive though, coming in at 7.3 gigabytes/second, besting most other systems out there – 42 megabytes/second per disk. IBM's earlier high end storage array was only able to inch out 12 megabytes/second per disk (with 4 times the number of disks) using disks that were twice as fast. So at least 8 times the raw I/O capacity for only about 25% more performance vs XIV – that's a stark contrast!
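
A quick bit of Python to put the per-disk throughput in perspective – the 7.3 gigabytes/second and 180 disk figures are from the XIV result, and the 12 megabytes/second per disk figure is the one for IBM's older high end box mentioned above:

  # Per-disk sequential throughput, XIV vs IBM's older high end array
  xiv_total_mb_s, xiv_disks = 7300, 180
  xiv_per_disk = xiv_total_mb_s / xiv_disks
  print(round(xiv_per_disk))                      # ~41 MB/sec per disk (the ~42 quoted above)

  older_per_disk = 12                             # MB/sec per disk on the older box
  print(round(xiv_per_disk / older_per_disk, 1))  # ~3.4x the per-disk throughput,
                                                  # despite 7200 RPM drives vs 15k RPM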

SATA/Nearline/7200RPM SAS disks are typically viewed as good at sequential operations, though I would expect 15k RPM disks to do at least as well, since the faster RPM should result in more data traveling under the head at a faster rate – perhaps a sign of a good architecture in XIV with its distributed mirrored RAID.

While the results are quite good – again, they don't represent the most common type of workload out there, which is random I/O.

The $1.1M discounted price of the system seems quite high for something that only has 180 disks in it (discounts on the system seem to, for the most part, be 70%), though there is more than 300 gigabytes of cache. I bought a 2-node 3PAR T400 with 200 SATA disks shortly after the T was released in 2008 for significantly less – of course it only had 24GB of data cache!

I hope the $300 modem IBM is using (after the 70% discount) is a USR Courier! (Your Price: $264.99 still leaves a good profit for IBM.) Such fond memories of the Courier.

I can only assume that at this point in time IBM has refrained from posting SPC-1 results because, with a SATA-only system, the results would not be impressive. In a fantasy world, with nearline disks and a massive 300GB cache maybe they could achieve 200-250 IOPS/disk, which would put the $1.1M system with 180 disks at 36,000-45,000 SPC-1 IOPS, or $24-30/IOP.

A more realistic number is probably going to be 25,000 or less ($44/IOP), making it one of the most expensive systems out there for I/O (even if it could score 45,000 SPC-1). By contrast, 3PAR would do 14,000 IOPS (not SPC-1 IOPS mind you – the SPC-1 number would probably be lower) with 180 SATA disks and RAID 10, based on their I/O calculator with an 80% read/20% write workload, for about 50% less cost (after discounts) for a 4-node F400.
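
To be clear about where those cost-per-IOP figures come from, here's the arithmetic in Python – the IOPS-per-disk numbers are my guesses from the last two paragraphs, not anything IBM has published:

  # Hypothetical SPC-1 cost per IOP for the 180-disk XIV at $1.1M
  price, disks = 1_100_000, 180

  for per_disk_iops in (200, 250):            # fantasy-world estimates
      iops = disks * per_disk_iops
      print(iops, round(price / iops, 2))     # 36,000 -> ~$30.56/IOP ; 45,000 -> ~$24.44/IOP

  print(round(price / 25_000, 2))             # ~$44/IOP at the more realistic 25,000 IOPS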

One of the weak spots on 3PAR is the addressable capacity per controller pair. For I/O and disk connectivity purposes a 2-node F200 (much cheaper) could easily handle 180 2TB SATA disks, but from a software perspective that is not the case. I have been complaining about this for more than 3 years now; they've finally addressed it to some extent in the V-class, but I am still disappointed by the extent to which it has been addressed per the supported limits that exist today (1.6PB – it should be more than double that). At least with the V they have enough memory on the box to scale it up with software upgrades (time will tell if such upgrades come about, however).

If it were me, I would not even use an F400 for this, opting instead for a T800 (800TB) or a V class (800-1,600TB), because with 360TB raw on the system you are very close to the limit of the F400's addressable capacity (384TB), or the T400's (400TB). You could of course get a 4-node T800 (or a 2-node V400 or V800) to start, then add additional controllers to get beyond 400TB of capacity if/when the need arises. With the 4-controller design you also get the wonderful persistent cache feature built in (one of the rare software features that is not separately licensed).

But for this case, comparing a nearly maxed out F400 against a maxed out XIV is still fair – it is one of the main reasons I did not consider XIV during my last couple of storage purchases.

So these results do show a strong use case for XIV – throughput oriented workloads! The XIV would absolutely destroy the F400 in throughput; the F400 tops out at 2.6GB/sec (to disk).

With software such as Vertica out there, which slashes the need for disk I/O on data warehouses given its advanced design, and systems such as Isilon being so geared towards things like scale out media serving (using NFS for media serving seems like a more ideal protocol anyways), I can't help but wonder what XIV's place is in the market, at this price point at least. It does seem like a very nice platform from a software perspective, and with their recent switch to Infiniband from 1 Gigabit ethernet a good part of their hardware has been improved as well; it also has SSD read cache coming.

I will say though that this XIV system will handily beat even a high end 3PAR T800 for throughput. While 3PAR has never released SPC-2 numbers, the T800 tops out at 6.4 gigabytes/second (from disk), and it's quite likely its SPC-2 results would be lower than that.

With the 3PAR architecture being as optimized as it is for random I/O, I do believe it would suffer vs other platforms with sequential I/O. Not that the 3PAR would run slow, but it would quite likely run slower due to how data is distributed on the system. That is just speculation though, a result of not having real numbers to base it on. My own production random I/O workloads in the past have had 15k RPM disks running in the range of 3-4MB/second (numbers are extrapolated, as I have only had SATA and 10k RPM disks in my 3PAR arrays to date, though my new one that is coming is 15k RPM). As such, with a random I/O workload you can scale up pretty high before you run into any throughput limits on the system (in fact, if you max out a T800 with 1,280 drives you could do as high as 5MB/second/disk before you would hit the limit). Though XIV is distributed RAID too, so who knows…

Likewise, I suspect 3PAR/HP have not released SPC-2 numbers because they would not reflect their system in the most positive light, unlike SPC-1.

Sorry for the tangents on 3PAR  🙂

October 19, 2011

HP Storage strategy – some hits, some misses

Filed under: Random Thought,Storage — Tags: , — Nate @ 9:24 pm

[UPDATED – Some minor updates since my original post] I was at an Executive Briefing by HP, given by HP's head of storage and former 3PAR CEO, David Scott. I suppose this is one good reason to be in the Bay Area – events like this didn't really happen in Seattle.

I really didn't know what to expect. I was looking forward to seeing David speak as I had not heard him before; his accent, oddly enough, surprised me.

He covered a lot of topics. It was clear, of course, that he was more passionate about 3PAR than Lefthand, or Ibrix or XP or EVA – not surprising.

The meeting was not technical enough to get any of my previously mentioned questions answered; it seemed very geared towards the PHB crowd.

HP Storage hits

3PAR

One word: Duh. The crown jewel of the HP storage strategy.

He emphasized over and over the 3PAR architecture and how it's the platform powering 7 of the top 10 clouds out there, the design that lets them handle unpredictable workloads – you know this stuff by now, I don't need to repeat it.

David pointed out an interesting tidbit with regards to their latest SPC-1 announcement: he compared the I/O performance and cost per usable TB of the V800 to a Texas Memory Systems all-flash array that was tested earlier this year.

The V800 outperformed the TMS system by 50,000 IOPS, and came in at a cost per usable TB of only about $13,000 vs $50,000/TB for the TMS box.

Cost per I/O, which he did not mention, certainly favored the TMS system ($1.05), but the comparison was still a good one I thought – we can give you the performance of flash and still give you a metric ton of disk space at the same time. Well, if you want to get technical, I guesstimate the fully loaded V800 weighs in at 13,160 pounds, or about 6 metric tons.
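
Out of curiosity I reverse-engineered the rest of the comparison in Python from the figures quoted in the talk – the derived price and capacity for the TMS box are implied by those ratios, not numbers I looked up anywhere:

  # Implied figures for the TMS all-flash array from the comparison above
  v800_iops = 450_000
  tms_iops = v800_iops - 50_000               # "outperformed the TMS system by 50,000 IOPS"
  tms_cost_per_iop, tms_cost_per_tb = 1.05, 50_000

  tms_price = tms_iops * tms_cost_per_iop     # ~$420k implied tested price
  tms_usable_tb = tms_price / tms_cost_per_tb
  print(round(tms_price), round(tms_usable_tb, 1))   # ~420000, ~8.4TB usable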

Of course flash certainly has its use cases; if you don't have a lot of data it doesn't make sense to invest in 2,000 spinning rust buckets to get to 450,000 IOPS.

Peer Motion

Peer Motion – both a hit and a miss. A hit because it's a really neat technology: the ability to non-disruptively migrate between storage systems without 3rd party appliances. The miss – well, I'll talk about that below.

He compared peer motion to the likes of Hitachi‘s USP/VSP, IBM‘s SVC, and EMC‘s VPLEX, which are all expensive, complicated bolt-on solutions. Seemed reasonable.

Lefthand VSA

It's a good concept, and it's nice to see it as a supported platform. David mentioned that Vmware themselves originally tried to acquire Lefthand (or wanted to acquire them – I don't know if anything official was made) because they saw the value in the technology – and of course recently Vmware introduced something kinda-sorta-similar to the Lefthand VSA in vSphere 5. Though it seems not quite as flexible or as scalable.

I'm not sure I see much value in the P4000 appliances by contrast; I hear that doing RAID 5, or worse yet RAID 6, on the P4000 is just asking for a whole lotta pain.

StoreOnce De-duplication

It sounds like it has a lot of promise. I'll put it in the hit column for now, as it's still a young technology and it'll take time to see where it goes. But the basic concept is a single de-duplication technology for all of your data. He contrasted this with EMC's strategy, for example, where they have client side de-dupe with their software backup products, inline de-dupe with Data Domain, and primary storage de-dupe — none of which are compatible with each other. Who knows, by the time HP gets it right with StoreOnce maybe EMC and others will get it right too.

I'm still not sold myself on the advantages of de-dupe outside of things like backups and VDI. I've sat through what seems like a dozen NetApp presentations on the topic, so I have had the marketing shoved down my neck many times. I came to this realization a few years ago during an eval test of some Data Domain gear. I'll be honest and admit I did not fully comprehend the technicals behind de-duplication at the time, and I expected pretty good results from feeding it tens of gigabytes of uncompressed text data. It turns out I was wrong, and I realized why I was under an incorrect assumption to begin with.

Now, data compression on the other hand is another animal entirely – being able to support inline data compression without suffering much or any I/O hit really would be nice to see (I am aware there are one or more vendors out there that offer this technology now).

HP Storage Misses

Nobody is perfect; HP and 3PAR are no exception, no matter how much I may sing their praises here.

Peer Motion

When I first heard about this technology being available on both the P4000 and 3PAR platforms, I assumed the two were compatible with each other, meaning you could peer motion data to/from P4000 and 3PAR. One of my friends at 3PAR clarified to me a few weeks ago that this was not the case, and David Scott mentioned that again today.

He tried to justify it by comparing it to vSphere vMotion, where you can't do a vMotion between a vSphere system and a Hyper-V system. He could have gone deeper and said you can't do vMotion even between vSphere hosts if the CPUs are not compatible, which would have been a better example.

So he said that most federation technologies are usually homogeneous in nature, and you should not expect to be able to peer motion from an HP P4000 to an HP 3PAR system.

Where HP's argument kind of falls apart here is that the bolt-on solutions he previously referred to as inferior do have the ability to migrate data between systems that are not the same. It may be ugly, it may be kludgey, but it can work. Hitachi even lists the 3PAR F400, S400 and T800 as supported platforms behind the USP. IBM lists 3PAR and HP storage behind their SVC.

So, what I want from HP is the ability to do peer motion between at least all of their higher end storage platforms (I can understand if they never have peer motion on the P2000/MSA since it's just a low end box). I'm not willing to accept any excuses, other than "sorry, we can't do it because it's too complicated". Don't tell me I shouldn't expect to have it – I fully expect to have it.

Just another random thought, but when I think of storage federation and "homogeneous", I can't help but think of this scene from Star Trek VI:

GORKON: I offer a toast. …The undiscovered country, …the future.
ALL: The undiscovered country.
SPOCK: Hamlet, act three, scene one.
GORKON: You have not experienced Shakespeare until you have read him in the original Klingon.
CHANG: (in Klingonese) ‘To be or not to be.’
KERLA: Captain Kirk, I thought Romulan ale was illegal.
KIRK: One of the advantages of being a thousand light years from Federation headquarters.
McCOY: To you, Chancellor Gorkon, one of the architects of our future.
ALL: Chancellor!
SCOTT: Perhaps we are looking at something of that future here.
CHANG: Tell me, Captain Kirk, would you be willing to give up Starfleet?
SPOCK: I believe the Captain feels that Starfleet’s mission has always been one of peace.
CHANG: Ah.
KIRK: Far be it for me to dispute my first officer. Starfleet has always been…
CHANG: Come now, Captain, there’s no need to mince words. In space, all warriors are cold warriors.
UHURA: Er. General, are you fond of …Shakes ….peare?
CHEKOV: We do believe all planets have a sovereign claim to inalienable human rights.
AZETBUR: Inalien… If only you could hear yourselves? ‘Human rights.’ Why the very name is racist. The Federation is no more than a ‘homo sapiens’ only club.
CHANG: Present company excepted, of course.
KERLA: In any case, we know where this is leading. The annihilation of our culture.
McCOY: That’s not true!
KERLA: No!
McCOY: No!
CHANG: ‘To be, or not to be!’, that is the question which preoccupies our people, Captain Kirk. …We need breathing room.
KIRK: Earth, Hitler, nineteen thirty-eight.
CHANG: I beg your pardon?
GORKON: Well, …I see we have a long way to go.

For the most basic workloads it’s not such a big deal if you have vSphere and storage vMotion (or some other similar technology). You cannot fully compare storage vMotion with peer motion but for offering the basic ability to move data live between different storage platforms it does (mostly) work.

HP Scale-out NAS (X9000)

I want this to be successful, I really do, because I like to use 3PAR disks and, well, there just aren't many NAS options out there these days that are compatible. I'm not a big fan of NetApp – I very reluctantly bought a V3160 cluster to try to replace an Exanet cluster on my last 3PAR box because, well, Exanet kicked the bucket (not the product we had installed but the company itself). I left the company not long after that, and barely a year later the company is already going to abandon NetApp and go with the X9000 (of all things!). Meanwhile their unsupported Exanet cluster keeps chugging along.

Back to X9000. It sounds like a halfway decent product. They say the main thing they lacked was snapshot support, and that is there now (or will be soon) – kind of strange that Ibrix has been around as long as it has and they did not have file system snapshots till now? I really have not heard much about Ibrix from anyone other than HP, who obviously sings the praises of the product.

I am still somewhat bitter about 3PAR not buying Exanet when they had the chance; Exanet is a far better technology than Ibrix. Exanet was sold for, if I remember right, $12 million – a drop in the bucket. Exanet had deals on the table (at the time) that would have brought in more than $12 million in revenue in each deal alone. Multi-petabyte deals. Here is the Exanet architecture (and file system) as it stood in 2005 – in my opinion very similar to the 3PAR approach (completely distributed, clustered design – Exanet treats files like 3PAR treats chunklets), except Exanet did not have any special ASICs; everything was done in software. Exanet had more work to do on their product – it was far from perfect – but it had a pretty solid base to build upon.

So, given that, I do hope X9000 does well. I mean, my needs are not that great. What I'd really like to see is a low end VSA for the X9000 along the lines of their P4000 iSCSI VSA – just for simple file storage in an HA fashion. I don't need to push 30 gigabits/second; I just want something that is HA, has decent performance and is easy to manage.

Legacy storage systems (EVA especially)

Let it die already. HP has been adamant they will continue to support and develop the EVA platform for their legacy customers. That sort of boggles my mind. Why waste money on that dead end platform? Use the money to give discounted incentives to upgrade to 3PAR when the time comes. I can understand supporting existing installs and bug fix upgrades, but don't waste money on bringing out whole new revisions of hardware and software for this dead end product. David said so himself – supporting the install base of EVA and XP is supporting their 11% market share; for the other 89% of the market that they don't have, they are going to push 3PAR/Lefthand/Ibrix.

I would find a way to craft a message to that 11% install base – subliminal messaging (a la Max Headroom, not sure why that came to mind) – to make them want to upgrade to a 3PAR box, not the next EVA system.

XP/P9500 I can kinda sorta see keeping around – I mean, there are some things it is good at that even 3PAR can't do today. But the market for such things is really tiny, and shrinking all the time. Maybe HP doesn't put much effort into this platform because it is OEM'd from Hitachi, in which case it doesn't cost a lot to re-sell, and it doesn't make a big difference whether they keep selling it or stop. I don't know.

I can just see what goes through a typical 3PAR SE's mind (at least those that were present before HP acquired 3PAR) when they are faced with selling an EVA. If the deal closes perhaps they scream NOooooooooooooooooooo like Darth Vader in Return of the Jedi. Sure, they'd rather have the customer buy HP than go buy a Clariion or something. But throw these guys a bone. Kill EVA and use the money to discount 3PAR more in the marketplace.

P2000/MSA – gotta keep that stuff; there will probably always be some market for DAS.

Insights

I had the opportunity to ask some high level questions of David and got some interesting responses. He seems like a cool guy.

3PAR Competition

David harped a lot on how other storage architectures from the big manufacturers were designed 15-20 years ago. I asked him: why does he think – given that 3PAR technology is 10+ years old at this point – these other manufacturers haven't copied it to some degree? It has obvious technological advantages; it just baffles me why others haven't copied it.

His answer came down to a couple of things. The main point was 3PAR was basically just lucky. They were in the right place, at the right time, with the right product. They successfully navigated the tech recession when at least two other utility storage startups could not and folded (I forgot their names, I’m terrible with names). He said the big companies pulled back on R&D spending as a result of the recession and as such didn’t innovate as much in this area, which left a window of opportunity for 3PAR.

He also mentioned two other companies that were founded at about the same time to address the same market – utility computing. He mentioned Vmware as one of them; the other was the inventor of the first blade system, I forget the name. Vmware I think I have to dispute though. I seem to recall Vmware "stumbling" into the server market by accident rather than targeting it directly. I mean, I remember using Vmware before it was even Vmware Workstation or GSX. It was just a program used to run another OS on top of Linux (that was the only OS Vmware ran on at the time). I recall reading that the whole server virtualization movement came way later and caught Vmware off guard, as much as it caught everyone else off guard.

He also gave an example in EMC and their VMAX product line. He said that EMC misunderstood what the 3PAR architecture was about – they thought it was just a basic cluster design, so EMC re-worked their system to be a cluster, and the result is VMAX. But it still falls short in several design aspects; EMC wasn't paying attention.

I was kind of underwhelmed when the VMAX was announced. I mean, sure, it is big, and bad, and expensive, but they didn't seem to do anything really revolutionary in it. Same goes for the Hitachi VSP. I fully expected both to do at least some sort of sub-disk distributed RAID. But they didn't.

Utilizing flash as a real time cache

David harped a lot on 3PAR's ability to respond to unpredictable workloads. This is true – I've seen it time and time again, and it's one reason why I really don't want to use any other storage platform at this point in time, given the opportunity.

Something I thought really was innovative that came out of EMC in the past year or two is their Flash Cache product (I think that's the right name) – the ability to use high speed flash as both a read and a write cache, and to bulk the cache levels up into multiple terabytes for big cloud operations.

His response was – we already do that, with RAM cache. I clarified a bit more, saying I meant scaling out the cache even further with flash, well beyond what you can do with RAM. He kind of ducked the question, saying it was a bit too technical/architectural for the crowd in the room. 3PAR needs to have this technology. My key point to him was that the 3PAR tools like Adaptive Optimization and Dynamic Optimization are great tools – but they are not real time. I want something that is real time. It seemed he acknowledged that point – the lack of real time behavior in the existing technologies as a weak point – hopefully HP/3PAR addresses it soon in some form.

In my previous post, Gabriel commented on how the new next gen IBM XIV will be able to have up to 7.5TB of read cache via SSD. I know NetApp can have a couple TB worth of read cache in their higher end boxes. As far as I know only EMC has the technology to do both read and write. I can’t say how well it works, since I’ve never used it and know nobody that has this EMC gear, but it is a good technology to have, especially as flash matures more.

I just think how neat it would be to have, say, a 1.5-2PB system running SATA disks with an extra 100TB (2.5-5% of total storage) of flash cache on top of it.

Bringing storage intelligence to the application layer

Another question I asked him was his thoughts on a broader industry trend which seems to be trying to take the intelligence of storage out of the storage system and put it higher up in the stack – given the advanced functionality of a 3PAR system, are they threatened at all by this? The examples I gave were Exchange and Oracle ASM.

He focused on Oracle, mentioning that Oracle was one of the original investors in 3PAR and as a result there was a lot of close collaboration between the two companies, including the development of the ASM technology itself.

He mentioned one of the VPs of Oracle – I forget if he was a key ASM architect or developer or something, but someone high up in the storage strategy involving ASM — in the early days this guy was very gung ho, absolutely convinced that running the world on DAS with ASM was the way to go. Don't waste your money on enterprise storage; we can do it higher in the stack, you can use cheap storage and save yourself a lot of money.

David said that once this Oracle guy saw 3PAR storage powering an Oracle system he changed his mind – he no longer believed that DAS was the future.

The key point David was trying to make was that bringing storage intelligence higher up in the stack is OK if your underlying storage sucks. But if you have a good storage system, you can't really match that functionality/performance higher up in the stack, so it's not worth considering.

Whether he is right or not is another question; for me it depends on the situation, but any chance I get I will of course lean towards 3PAR for my back-end disk needs rather than use DAS.

In short – he does not feel threatened at all by this "trend". Though if HP is unwilling or unable to get Peer Motion working between their products while things like Storage vMotion and Oracle ASM can do this higher up in the stack, there certainly is a case for storage intelligence at the application layer.

Best of Breed

David also seemed to harp a lot on best of breed. He knocked his competitors for having a mishmash of technologies – specifically, for assembling market leading technologies instead of best of breed ones. Early in his presentation he touted HP's market leading position in servers, and their #2 position in networking (you could say that is market leading).

He also tried to justify that the HP integrated cloud stack is comprised of best of breed technologies; it just happens that two out of the three are considered market leading, no coincidence there.

Best of breed is really a perception issue when you get down to it. Where do you assign value in the technology? Do you just want the cheapest you can get? Do you want the most advanced software? Do you want the fastest system? The most reliable? Ease of use? Interoperability? Flexibility? Buzzword compliance? A big name brand?

Because of that, many believe these vertically integrated stacks won't go very far. There will be some customer wins of course, but those will more often than not be wins based not on technology but on other factors: political (most likely), financial (buy from us and we'll finance it all, no questions asked), or maybe just the warm and fuzzy feeling incompetent CIOs get when they buy from a big name that says it will stand behind the products.

I did ask David what HP's stance is on a more open design for this "cloud" thing, i.e. not building a cloud on a vertically integrated stack. His response was sort of what I expected: none of the other stack vendors are open, we aren't either, so we don't view it as an important point.

I was kind of sad that he never used the 3cV term; really, I think it was likely the first stack out there, and it wasn't even official – there was no big marketing or sales push behind it.

For me, my best of breed storage is 3PAR. It may have 1 or 2% market share (more now), so it surely is not market leading (it might make it there with HP behind it), but for my needs it's best of breed.

Switching, likewise: Extreme, maybe 1-1.5% market share, not market leading either, but for me, best of breed.

Fibre Channel – I like QLogic. Probably not best of breed, certainly not market leading at least for switches, but damn easy to use and it gets the job done for me. Ironically enough, while digging up links for this post I came across this, an article from back in 2009 suggesting QLogic should buy Extreme. I somewhat fear the most likely company to buy Extreme at this point is Oracle. I hope Oracle does not buy them, but Oracle is trying to play the whole stack game too and they don't really have any in-house networking, unlike the other players. Maybe Oracle will jump on someone like Arista instead, at a cheaper price, à la Pillar.

Servers – I do like HP best of course for the enterprise space, though they don't compete as well in scale-out.

VMware, on the other hand, happens to be in a somewhat unique position, being both the market leader and for the most part best of breed, though others are rapidly closing the gap on the breed side – VMware had many years with no competition.

Summary

All in all it was pretty good, a lot more formal than I was expecting. I saw 3 people I knew from 3PAR, and I sort of expected to see more (I fully was expecting to see Marc Farley there! Where were you Marc!).

David did harp quite a bit on using Intel processors, something that Chuck from EMC likes to harp on too. I did not ask this question of David, because I think I know the answer. The question would be: does he think 3PAR will migrate to doing everything on Intel CPUs and away from their purpose-built ASIC? I think the answer to that question is no, at least not in the next few years (barring some revolution in general purpose processors). With 3PAR's distributed design I'm just not sure how a general purpose CPU could handle calculating the parity for what could be as many as half a million RAID arrays on a storage system like the V800 without the assistance of an ASIC or FPGA. I really do not like HP pushing the Intel brand so hard just because of the partnership, at least with regards to 3PAR. Because really – the Intel CPU does not do much at all in the 3PAR boxes, it never has.

Just look at IBM XIV – they do distributed RAID, though they do mirroring only, and they top out at 180 disks, even with 60 2.4GHz Intel CPU cores (120 threads) and a combined 180MB of CPU cache. Mirroring is a fairly simple task compared to doing parity calculations.
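
To put a toy example on that (mine, nothing 3PAR-specific): a mirrored write is just the same block written twice, while a parity RAID small write has to read the old data and old parity, XOR them with the new data, and write two blocks back. Multiply that by hundreds of thousands of chunklet-level RAID sets and it adds up.

# toy RAID 5 small-write parity update (values made up): new_parity = old_parity XOR old_data XOR new_data
old_data=0x3c; new_data=0x5a; old_parity=0x77
printf 'new parity: 0x%02x\n' $(( old_parity ^ old_data ^ new_data ))   # prints 0x11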

Frankly, I'd rather see an AMD processor in there, especially one with 12-16 cores. The chipsets that power the higher end Intel processors are fairly costly, whereas AMD's chipset scales down to a single CPU socket without an issue. I look at a storage system that has dual or quad core CPUs in it and I think what a waste. Things may be different if the storage manufacturers included the latest & greatest Intel 8 and 10 core processors, but thus far I have not seen anything close to that.

David also mentioned a road VMware is traveling: moving away from file systems for VMs toward a 1:1 relationship between LUNs and VMs, making life a lot more granular. He postulated (there's a word I've never used before) that this technology (I forgot the name) will make protocols like NFS obsolete (at least when it comes to hosting VMs), which was a really interesting concept to me.

At the end of the day, for the types of storage systems I have managed myself, and the types of companies I have worked for, I don't get enough bang for the buck out of many of the more advanced software technologies on these storage systems. Whether it is simple replication stuff or space reclamation – I'm still not sold on Adaptive Optimization or other automatic storage tiering techniques, or even application aware snapshots. Sure these are all nice features to have, but they are low on my priority list; I have other things I want to buy first before I invest in these things, myself at least. If money is no object – sure, load up on everything you've got! I feel the same way about VMware and their software value add strategy. Give me the basics and let me go from there, the basics being a solid underlying system that is high performance, predictable and easy to manage.

There was a lot of talk about cloud and their integrated stacks and stuff like that, but that was about as interesting to me as sitting through a NetApp presentation. At least with most of the NetApp presentations I sat through I got some fancy steak to go with it; just some snacks at this HP event.

One more question I have for 3PAR – what the hell is your service processor running that requires 317W of power! Is it using Intel technology circa 2004?

This actually ended up being a lot longer than I had originally anticipated, nearly 4200 words!

Linear scalability

Filed under: Storage — Nate @ 9:56 am

So 3PAR released their SPC-1 results for their Mac daddy P10000, and the results aren’t as high as I originally guessed they might be.

HP claims it is a world record result for a single system. I haven’t had the time yet to try to verify but they are probably right.

I’m going to a big HP/3PAR event later today and will ask my main question – was the performance constrained by the controllers or by the disks? I’m thinking disks, given the IOPS/disk numbers below.

Here’s some of the results

System       Date Tested   SPC-1 IOPS   IOPS per Disk   SPC-1 Cost per IOP   SPC-1 Cost per usable TB
3PAR V800    10/17/2011    450,212      234             $6.59                $12,900
3PAR F400    4/27/2009     93,050       242             $5.89                $20,308
3PAR T800    9/2/2008      224,989      175             $9.30                $26,885

The cost per TB number was slashed because they are using disks that are much larger (300GB vs 147GB on earlier tests).

The cost was pretty reasonable as well, coming in at under $7/IOP, which is actually less than their previous result on the T800 from 2008 – and that was already cheap at $9.30/IOP.

It is interesting that they used Windows to run the test, which is a first for them I believe, having used real Unix in the past (AIX and Solaris for T800 and F400 respectively).

The one kind of strange thing, which is typical of 3PAR SPC-1 numbers, is the sheer number of volumes they used (almost 700). I'm not sure what the advantage would be to doing that; another question I will try to seek the answer to.

The system was, as expected, remarkably easy to configure; the entire storage configuration process consisted of this:

for n in {0..7}
do
    for s in 1 4 7
    do
        if (($s==1))
        then
            for p in 4
            do
                controlport offline -f $n:$s:$p
                controlport config host -ct point -f $n:$s:$p
                controlport rst -f $n:$s:$p
            done
        fi

        for p in 2
        do
            controlport offline -f $n:$s:$p
            controlport config host -ct point -f $n:$s:$p
            controlport rst -f $n:$s:$p
        done
    done
done

PORTS[0]=":7:2"
PORTS[1]=":1:2"
PORTS[2]=":1:4"
PORTS[3]=":4:2"

for nd in {0..7}
do
    createcpg -t r1 -rs 120 -sdgs 120g -p -nd $nd cpgfc$nd

    for hba in {0..3}
    do
        for i in {0..14} ; do
            id=$((1+60*nd+15*hba+i))
            createvv -i $id cpgfc${nd} asu1.${id} 240g;
            createvlun -f asu1.${id} $((15*nd+i+1)) ${nd}${PORTS[$hba]}
        done
        for i in {0..1} ; do
            id=$((681+8*nd+2*hba+i))
            j=$((id-680))
            createvv -i $id cpgfc${nd} asu3.${j} 360g;
            createvlun -f asu3.${j} $((2*nd+i+181)) ${nd}${PORTS[$hba]}
        done
        for i in {0..3} ; do
            id=$((481+16*nd+4*hba+i))
            j=$((id-480))
            createvv -i $id cpgfc${nd} asu2.${j} 840g;
            createvlun -f asu2.${j} $((4*nd+i+121)) ${nd}${PORTS[$hba]}
        done
    done
done

Think about that: a $3 million storage system (after discount) configured in less than 50 lines of script.
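
Incidentally, the "almost 700" volume count I wondered about above falls straight out of those loops:

# per node: 4 HBAs x (15 asu1 + 4 asu2 + 2 asu3) volumes, times 8 nodes
echo $(( 8 * 4 * (15 + 4 + 2) ))   # 672 volumes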

Not a typical way to configure a system (I had to look at it a couple of times), but it seems they are still pinning volumes to particular controller pairs, and LUNs to particular FC ports. This is what they have done in the past so it's nothing new, but I would like to see how the system runs without such pinning of resources, letting the inter-node routing do its magic, since that is how customers would run the system.

But that's what full disclosure is all about, right? Another reason I like the SPC-1 is the in-depth configuration information that you don't need an NDA to see (and in 3PAR's case you probably don't need to attend a 3-week training course to understand!).

I've tried, but I can't think of another storage architecture out there that scales as well as the 3PAR InSpire architecture does from the low end (F200) to the high end (V800).

The cost of the V800 was a lot more reasonable than I was fearing it might be; it's only roughly 45% more expensive than the T800 tested in September 2008, and for that extra 45% you get 50% more disks, double the I/O capacity, and almost three times the usable capacity. Oh, and five times more data cache and 8 times more control cache to boot!
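
Those ratios fall right out of the SPC-1 table above if you want to check my math (rounding is mine): IOPS divided by IOPS-per-disk gives the spindle count, and IOPS times cost-per-IOP approximates the total tested price.

echo $(( 450212 / 234 ))                             # V800: ~1,920 disks
echo $(( 224989 / 175 ))                             # T800: ~1,280 disks, so ~50% more spindles in the V800
echo "scale=3; (450212*6.59) / (224989*9.30)" | bc   # ~1.42, i.e. roughly 40-45% more total cost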

I'm suspecting the ASICs are not being pushed to their limits here in the V800, and that the system can go quite a bit faster provided there is not an I/O bottleneck on the disks behind the controllers.

On the backs of these numbers The Register is reporting that HP is beefing up the 3PAR sales team after experiencing massive growth over the past year, seemingly a roughly 300% increase in sales, so much that they are having a hard time keeping up with demand.

I haven't been happy with the hefty price increases HP has put into the 3PAR product line, though in a lot of cases those come back out in the form of discounts. I guess it's what the market will bear, right? As long as things are selling as fast as they can make them, HP doesn't have any need to reduce the price.

I saw an interview with the chairman of HP a few weeks ago, when they announced their new CEO. He mentioned how 3PAR had significantly exceeded their sales expectations as justification for paying that lofty price to acquire them about a year ago.

So congrats to 3PAR, I knew you could do it!

October 6, 2011

Fusion IO enhances MySQL performance further

Filed under: Storage — Nate @ 7:12 am

This seems pretty neat. Not long ago Fusion IO announced their first real product refresh in quite a while, which offers significantly enhanced performance.

Today I see another related article that goes into something more specific, from Data Center Knowledge

Fusion-io also announced a new extension to its VSL (Virtual Storage Layer) software subsystem for conducting Atomic Writes in the popular MySQL open source database. Atomic Writes are an operation in which a processor can simultaneously write multiple independent storage sectors as a single storage transaction. This accelerates mySQL and gives new features powered by the flexibility of sophisticated flash architectures. With the new Atomic Writes extension, Fusion-io testing has observed 35 percent more transactions per second and a 2.5x improvement in performance predictability compared to conducting the same MySQL tests without the Atomic Writes feature.

I know that Facebook is a massive user of Fusion IO for their MySQL database farms, I suspect this feature was made for them! Though it can benefit everyone.

My only question would be: can this atomic write capability be used by MySQL when running through the ESX storage layer, or does there need to be more native access from the OS?
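
For what it's worth, the main reason atomic writes help a database like MySQL is that InnoDB no longer needs its own torn-page protection (the doublewrite buffer). A hypothetical sketch of what the database side might look like; the innodb_use_atomic_writes name comes from later MariaDB/Percona builds rather than anything Fusion IO has published for this release, so treat the whole fragment as an assumption:

# hypothetical my.cnf fragment: innodb_doublewrite is stock MySQL, innodb_use_atomic_writes is assumed
cat >> /etc/mysql/my.cnf <<'EOF'
[mysqld]
innodb_use_atomic_writes = 1
innodb_doublewrite = 0
EOF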

About the new product lines, from The Register

The ioDrive 2 comes in SLC form with capacities of 400GB and 600GB. It can deliver 450,000 write IOPS working with 512-byte data blocks and 350,000 read IOPS. These are whopping great increases, 3.3 times faster for the write IOPS number, over the original ioDrive SLC model which did 135,000 write IOPS and 140,000 read IOPS. It delivered sequential data at 750-770MB/sec whereas the next-gen product does it at 1.5GB/sec, around two times faster.
[..]
All the products will ship in November. Prices start from $5,950

The cards aren't available yet, so I wonder how accurate those numbers will end up being. But in any case, even if they were overinflated by a large amount, that's still an amazing amount of I/O.

On a related note, I was just browsing the Fusion IO blog, which mentions this MySQL functionality as well, and saw that Fusion IO was/is showing off a beefy 8-way HP DL980 with 14 HP-branded IO Accelerators at Oracle OpenWorld:

We’re running Oracle Enterprise Edition database version 11g Release 2 on a single eight processor HP ProLiant DL980 G7 system integrated with 14 Fusion ioMemory-based HP IO Accelerators, achieving performance of more than 600,000 IOPS with over 6GB/s bandwidth using a real world, read/write mixed workload.

[..]

the HP Data Accelerator Solution for Oracle is configured with up to 12TB of high performance flash[..]

After reading that I could not help but think how HP's own Vertica, with its extremely optimized encoding and compression scheme, would run on such a beast. I mean, if you can get 10x compression out of the system (Vertica's best-case real world is 30:1 for reference), get a pair of these boxes (Vertica would mirror between the two) and you have upwards of 240TB of data to play with.

I say 240TB because of the way Vertica mirrors the data: it allows you to store the mirror in a different sort order, allowing for even faster access if you're querying the data in different ways. Who knows, with the compression you may be able to get much better than 10:1 depending on your data.
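
The arithmetic behind that 240TB figure, with the 10:1 compression ratio being the assumption:

echo $(( 12 * 10 )) TB per box            # 12TB of flash x 10:1 compression = 120TB per DL980
echo $(( 12 * 10 * 2 )) TB across the pair   # x2 boxes, each holding a differently-sorted copy = 240TB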

Vertica is so fast that you will probably end up CPU bound more than anything else – 80 cores per server is quite a lot though! The DL980 supports up to 16 PCI Express slots, so even with 14 cards that still leaves room for a couple of 10GbE ports and/or Fibre Channel or some other form of connectivity beyond what's on the motherboard (which seems to have an optional dual port 10GbE NIC).

With Vertica's licensing (last I checked) starting in the tens of thousands of dollars per raw TB (before compression), it falls into the category where it's worth blowing a ton of money on hardware to make it run the best it possibly can (same goes for Oracle, though Standard Edition to a lesser degree). Vertica is coming out with a Community Edition soon which I believe is free; I don't recall all of the restrictions, though I think one of them was being limited to a single server, and I don't recall hearing what the storage limits might be (I'd assume there would be some limit, maybe half a TB or something).

September 8, 2011

HDS absorbs BlueArc

Filed under: Storage — Nate @ 12:07 am

It seems HDS has finally decided to buy out BlueArc after what was either two or three failed attempts at an IPO.

BlueArc, along with my buddies over at 3PAR, is among the few storage companies that put real silicon to work in their systems for the best possible performance. Their architecture is quite impressive and the performance (that is, for their mid range system) shows.

I have only been exposed to their older stuff (5-6 year old technology) directly, not their newer technology. But even their older stuff was very fast and efficient, very reliable, and it had quite a few nifty features as well. I think they were among the first to do storage tiering (in their case at the file level).

[ warning – a wild tangent gets thrown in here somewhere ]

While their NAS technology was solid (IMO), their disk technology was not. They relied on LSI storage, and the quality of the storage was very poor overall. First off, whoever set up the system we had set it up with everything running RAID 5 12+1; then there were the long RAID rebuild times, the constantly moving hot spots because of the number of different tiers of storage we had, and the fact that the 3 Titan head units were not clustered, so we had to take hard downtime for software upgrades (not BlueArc's fault, other than perhaps making clustered heads too expensive when the company bought the stuff – long before I was there). Every time we engaged with BlueArc, 95% of our complaints were about the disk. For the longest time they tried to insist that "disk doesn't matter", that you could put any storage system behind the BlueArc and it would be the same.

After the 3rd or 4th engagement BlueArc significantly changed their tune (not sure what prompted it): they acknowledged the weakness of the low tier storage and started promoting the use of HDS AMS storage (USP was, well, waaaaaaaay out of our price range), since they were a partner of HDS back then as well. The HDS proposal fell far short of the design I had with 3PAR and Exanet, which at the time was 3PAR's NAS partner of choice.

If I could have chosen, I would have used BlueArc for NAS and 3PAR for disk. 3PAR was open to the prospect of course; BlueArc claimed they had contacted 3PAR to start working with them, but 3PAR said that never happened. Later BlueArc acknowledged they were not going to try to work with 3PAR (or any other storage company besides LSI or HDS – I think 3PAR was one digit too long for them to handle).

Given that the BlueArc system lacked the ability to provide us with any truly useful disk performance statistics, it was tough coming up with a configuration that I thought would work as a replacement. There were a large number of factors involved, and any one of them had a fairly wide margin of error. You could say I pulled a number out of my ass, but I did do more calculations than that (I have about a dozen pages of documentation I wrote at the time on the project); really though, at the end of the day the initial configuration was a stab in the dark.

BlueArc as a company, at the time, didn't really have their support stuff all figured out yet. The first sign was when a scheduled software upgrade that was intended to take 2-3 hours of downtime ended up taking 10-11 hours, because there was a problem and BlueArc lacked the proper escalation procedures to resolve it quickly enough. Their CEO sent us a letter later saying that they had fixed that process in the company. The second sign was when I went to them and asked them to confirm the drive type/size of all of our disks so I could do some math for the replacement system. They did a new audit (they had to be on site to do it for some reason), and it turns out we had about 80 more spindles than they thought we had (we bought everything through them). I don't know how you lose track of that many disks for support, but somehow it fell through the cracks. Another issue we had was that we paid BlueArc to relocate the system to another facility (again, before I was at the company), and whoever moved it didn't do a good job; they accidentally plugged both power supplies of a single shelf into the same PDU. Fortunately it was a non production system. A PSU blew at one point and took out the PDU, which then took out that shelf, which then took out the file system the shelf was on.

Even after all of that, my main problem with their solution was the disks. LSI was not up to snuff and the proposal from HDS wasn't going to cut it. I told my management there was no doubt HDS could come up with a solution that would work; it's just that what they had proposed would not (they didn't even have thin provisioning at the time; 3PAR was telling me HDS was pairing USP-Vs with AMSs in order to compete in the meantime, but they did not propose that to us). It was a combination of poor performing SATA (on RAID-6, no less) for bulk storage and higher performing 15k RPM disks for a higher tier with less storage. HDS/BlueArc felt it was equivalent to what I had specified through 3PAR and Exanet, not understanding the architectural advantages the 3PAR system had over the proposed HDS design (going into specifics would take too long, and you probably know them by now anyway if you're here). Not to mention what seemed like sheer incompetence among the HDS team that was supporting us; it seemed they couldn't answer anything I asked without engaging someone from Japan, and even then I rarely got a comprehensible answer.

So in the end we replaced a 4-rack BlueArc system with what could have been a single rack of 3PAR plus a few rack units for the Exanet, though we had to put the 3PAR in two racks due to weight constraints in the data center. We went from 500+ disks (a mix of SATA-I and 10k RPM FC) to 200 disks of SATA-II (RAID 5 5+1). With the change we got the ability to run Fibre Channel (which we ran to all of our VM boxes as well as the primary databases) and iSCSI (which we used here and there; 3PAR's iSCSI support has never been as good as I would have liked, though for anything serious I'd rather use FC anyway, and that's what 3PAR's customers did, which led to some neglect on the iSCSI front).

Half the floor space, half the power usage, roughly the same amount of usable storage, and about the same amount of raw storage. We put the 3PAR/Exanet system through its paces with our most I/O intensive workload at the time and it absolutely screamed. I mean it exceeded everyone's expectations (mine included). But that was only the beginning.

This is a story I like to tell on 3PAR reference calls when I do them, which is becoming more and more rare these days. In the early days of our 3PAR/Exanet deployment the Exanet engineer tried to assure me that they were thin provisioning friendly; he had personally used 3PAR+Exanet in the past and it worked fine. So with time constraints and stuff, I provisioned a file system on the Exanet box without thinking too much about the 3PAR end of things. It's thin provisioning friendly, right? RIGHT?

Well, not so much. Before you knew it the system was in production and we started dumping large amounts of data on it, and deleting large amounts of data on it, and within a few weeks I found out the Exanet box preferred to allocate new space rather than reclaim deleted space. I did some calculations and the result was not good. If we let the system continue at this rate we were going to exceed the capacity of the 3PAR box once the Exanet file system grew to its full potential. Not good. Compound that with the fact that we were at the maximum addressable capacity of a 2-node 3PAR box: if I had to add even 1 more disk to the system (not that adding 1 disk is possible in that system due to the way disks are added; the minimum is 4), I would have had to put in 2 more controllers, which as you might expect is not exactly cheap. So I was looking at either a very costly downtime to do data migration or a very costly upgrade to correct my mistake.

Dynamic Optimization to the rescue. This really saved my ass. I mean really, it did. When I built the system I used RAID 5 3+1 for performance (for 3PAR that is roughly 8% slower than RAID 10, and 3PAR fast RAID 5 is probably on par with many other vendors' RAID 10 due to the architecture).

So I ran some more calculations and determined that if I could get to RAID 5 5+1 I would have enough space to survive. So I began the process, converting roughly a half dozen LUNs at a time, 24 hours a day, 7 days a week. It took longer than I expected; the 3PAR was routinely getting hammered by daily activities from all sides. It took about 5 months in the end to convert all of the volumes. Throughout the process nobody noticed a thing. The array was converting volumes 24 hours a day for 5 months straight and nobody noticed (except me, who was babysitting it hoping I could beat the window). If I recall right I had maybe 3-4 weeks of buffer; if the conversions had taken an extra month I would have exceeded the capacity of the system. So I got lucky, I suppose, but I also bought the system knowing I could make such course corrections online, without impacting applications, for just that kind of event. I just didn't expect the event to come so soon or on such a large scale.
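
For reference, the capacity math behind that move (just the ratios):

echo "scale=3; 3/4" | bc           # RAID 5 3+1: .750 of raw space is usable
echo "scale=3; 5/6" | bc           # RAID 5 5+1: .833 of raw space is usable
echo "scale=3; (5/6)/(3/4)" | bc   # ~1.11, so roughly 11% more usable space from the same disks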

One of the questions I had for HDS at the time we were looking at them was whether they could do the same online RAID conversions. The answer? Absolutely they can. But the fine print was (and I assume still is) that you need blank disks to do the migration to. Since 3PAR's RAID is performed at the sub-disk level, no blank disks are required, only blank "chunklets" as they call them. Basically you just need enough empty space on the array to mirror the LUN/volume to the new RAID level, then you break the mirror and eliminate the source (this is all handled transparently with a single command and some patience, depending on system load and the volume of data in flight).

As time went on we loaded the system with ever more processes and connected ever more systems to it as we got off the old BlueArc(s). I kept seeing the IOPS (and disk response times) on the 3PAR going up and up. I really thought it was going to choke; I mean we were pushing the disks hard, with sustained disk response times in the 40-50ms range at times (with rare spikes to well over 100ms). I just kept hoping for the day when we would be done and the increase would level off, and for the most part it eventually did. I built my own custom monitoring system for the array for performance trending, since I didn't like the query based tool they provided as much as what I could generate myself (despite the massive amount of time it took to configure my tool).
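
The collection side of something like that doesn't have to be fancy. Here's a sketch of the sort of poller I mean, using the InForm CLI stat commands on a cron job (the array name, user and log paths here are made up):

#!/bin/sh
# pull physical disk and VLUN stats off the array every few minutes and append them for trending
ssh 3parsvc@my3par "statpd -iter 1"   >> /var/log/3par/statpd.$(date +%Y%m%d).log
ssh 3parsvc@my3par "statvlun -iter 1" >> /var/log/3par/statvlun.$(date +%Y%m%d).log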

I did not know a 7200RPM SATA disk could do 127 IOPS of random I/O.

We had this one process that would dump up to, say, 50GB of data from upwards of 40-50 systems simultaneously, as fast as they could go. Needless to say, when this happened it blew out all of the caches across the board and brought things to a grinding halt for some time (typically 30-60 seconds). I would see the behavior on the NAS system, log in to my monitoring tool and just watch it hang while it tried to query the database (which was on the storage). I would cringe and wait for the system to catch up. We tried to get them to re-design the application so it was more thoughtful of the storage, but they weren't able to. Well, they did re-design it one time (for the worse). I tried to convince them to put it on Fusion IO on local storage in the servers but they would have no part of it. Ironically, not long after I left the company they went out and bought some Fusion IO for another project. I guess as long as the idea was not mine it was a good one. The storage system was entirely a back office thing (no real time end user transactions ever touched it), which meant we could live with the higher latency of pushing the SATA drives 30-50% beyond engineering specifications.

At the end of the first full year of operation we finally got budget to add capacity to the system. We had shrunk the overall theoretical I/O capacity by probably two thirds vs the previous array, absorbed what seemed like almost 200% growth on top of that during the first year, and the system held up. I probably wouldn't have believed it if I hadn't seen it (and lived it) personally. I hammered 3PAR as often as I could to increase the addressable capacity of their systems, which was limited by the operating system architecture. It doesn't take a rocket scientist to see that their systems had 4GB of control cache (per controller), a common limit of 32-bit software. But the software enhancement never came while I was there, at least; it is there in some respect in the new V-class, though as mentioned the V-class seems to have had an arbitrary raw capacity limit placed on it that does not align with the amount of control cache it can have (up to 32GB per controller). With 64-bit software and more control cache I could have doubled or tripled the capacity of the system without adding controllers.

Adding the two extra controllers did give us one thing I wanted – Persistent Cache, which is just an awesome technology to have, and you simply can't do that kind of thing on a 2-controller system. It also gave us more ports than I knew what to do with.

What happened to the BlueArc? Well, after about 10 months of trying to find someone to sell it to – or give it to – we ended up paying someone to haul it away. When HDS/BlueArc were negotiating with us on their solution they tried to harp on how we could leverage our existing BlueArc disk in the new solution as another tier. I didn't have to say it, my boss did, which made me sort of giggle: he said the operational costs of running the old BlueArc disk (support was really high, plus power and co-lo space) were more than the disks were worth. BlueArc/HDS didn't have any real response to that, other than perhaps to nod their heads, acknowledging that we were smart enough to realize that fact.

I still would like to use BlueArc again, I think it’s a fine platform, I just want to use my own storage on it 🙂

This ended up being a lot longer than I expected! Hope you didn’t fall asleep. Just got right to 2600 words.. there.

August 29, 2011

Farewell Terremark – back to co-lo

Filed under: General, Random Thought, Storage, Virtualization — Nate @ 9:43 pm

I mentioned not long ago that I was going co-lo once again. I was co-lo for a while for my own personal services, but then my server started to act up (it would be 6 years old if it were still alive today) with disk "failure" after failure (or at least that's what the 3ware card was predicting; eventually it stopped complaining and the disk never died). So I thought: do I spend a few grand to buy a new box or go "cloud"? I knew up front cloud would cost more in the long run, but I ended up going cloud anyway as a stop gap. I picked Terremark because it had the highest quality design at the time (it still does).

During my time with Terremark I never had any availability issues. There was one day where there was some high latency on their 3PAR arrays, though they found & fixed whatever it was pretty quickly (it didn't impact me all that much).

I had one main complaint with regards to billing: they charge $0.01 per hour for each open TCP or UDP port on their system, and they have no way of doing 1:1 NAT. For a web server or something this is no big deal, but I needed a half dozen or more ports open per system (mail, DNS, VPN, SSH, etc.) even after cutting down on ports I might not need, so it starts to add up; indeed, about 65% of my monthly bill ended up being these open TCP and UDP ports.
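
To put that per-port charge in perspective (using 730 hours in an average month, and my own "dozen ports across two VMs" as the example):

echo "scale=2; 0.01 * 730" | bc        # ~$7.30/month for each open port
echo "scale=2; 0.01 * 730 * 12" | bc   # a dozen ports across two VMs: ~$87.60/month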

Once both of my systems were fully spun up (the 2nd system only recently got fully spun up as I was too lazy to move it off of co-lo) my bill was around $250/mo. My previous co-lo was around $100/mo and I think I had them throttle me to 1Mbit of traffic (this blog was never hosted at that co-lo).

The one limitation I ran into on their system was that they could not assign more than 1 IP address for outbound NAT per account. In order to run SMTP I needed each of my servers to have its own unique outbound IP, so I had to make a 2nd account to run the 2nd server. Not a big deal for me (it ended up being a pain for them, since their system wasn't set up to handle such a situation), since I only ran 2 servers (and the communications between them were minimal).

As I’ve mentioned before, the only part of the service that was truly “bill for what you use” was bandwidth usage, and for that I was charged between 10-30 cents/month for my main system and 10 cents/month for my 2nd system.

Oh – and they were more than willing to set up reverse DNS for me, which was nice (and required for running a mail server IMO). I had to agree to a lengthy little contract that said I wouldn't spam in order for them to open up port 25. Not a big deal. The IP addresses were "clean" as well, so no worries about black listing.

Another nice thing to have, if they had offered it, would be billing based on resource pools; as usual they charge for what you provision (per VM) instead of what you use. When I talked to them about their enterprise cloud offering they charged for the resource pool (unlimited VMs in a given amount of CPU/memory), but this is not available on their vCloud Express platform.

It was great to be able to VPN to their systems to use the remote console (after I spent an hour or two determining the VPN was not going to work in Linux, despite my best efforts to extract Linux versions of the VMware console plugin and try to use them), mount an ISO over the VPN and install the OS. That's how it should be. I didn't need the functionality, but I don't doubt I would have been able to run my own DHCP/PXE server there as well if I had wanted to install additional systems in a more traditional way. Each user gets their own VLAN, is protected by a Cisco firewall, and is load balanced by a Citrix load balancer.

A couple of months ago the thought came up again of off site backups. I don’t really have much “critical” data but I felt I wanted to just back it all up, because it would be a big pain if I had to reconstruct all of my media files for example. I have about 1.7TB of data at the moment.

So I looked at various cloud systems, including Terremark, but it was clear pretty quickly that no cloud company was going to be able to offer this service in a cost effective way, so I decided to go co-lo again. Rackspace was a good example; they have a handy little calculator on their site. This time around I went and bought a new, more capable server.

So I went to a company I used to buy a ton of equipment from in the bay area, and they hooked me up with not only a server with ESXi pre-installed on it, but co-location services (with "unlimited" bandwidth) and on-site support, all for a good price. The on-site support is mainly because I'm using their co-location services (which in itself is a co-lo inside Hurricane Electric) and their techs visit the site frequently as-is.

My server is a single socket quad core processor, 4x2TB SAS disks (~3.6TB usable, which also matches my usable disk space at home, which is nice – SAS because VMware doesn't support VMFS on SATA, though technically you can do it; the price premium for SAS wasn't nearly as high as I was expecting), a 3ware RAID controller with battery backed write-back cache, a little USB thing for ESXi (I'd rather have ESXi on the HDD but 3ware is not supported for booting ESXi), 8GB of registered ECC RAM and redundant power supplies. It also has decent remote management with a web UI, remote KVM access, remote media etc. For co-location I asked for (and received) 5 static IPs (3 IPs for VMs, 1 IP for ESX management, 1 IP for out of band management).

My bandwidth needs are really tiny, typically 1GB/month, though now with off site backups that may go up a bit (in bursts). The only real drawback to my system is that the SAS card does not have full integration with vSphere, so I have to use a CLI tool to check the RAID status; at some point I'll need to hook up nagios again and run a monitor to check on the RAID status. Normally I set up the 3ware tools to email me when bad things happen, pretty simple, but that's not possible when running vSphere.
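
The check itself should be simple enough once the 3ware CLI is on there. A rough sketch of the sort of nagios-style plugin I have in mind (the controller/unit numbers and the exact output format are assumptions on my part):

#!/bin/sh
# report the health of 3ware RAID unit u0 on controller c0 via tw_cli
STATUS=$(tw_cli /c0/u0 show status | awk '/status/ {print $NF}')
if [ "$STATUS" = "OK" ]; then
    echo "OK - /c0/u0 is healthy"; exit 0
else
    echo "CRITICAL - /c0/u0 status is ${STATUS:-unknown}"; exit 2
fi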

The amount of storage on this box I expect to last me a good 3-5 years. The 1.7TB includes every bit of data that I still have going back a decade or more – I'm sure there are a couple hundred gigs at least that I could outright delete because I may never need them again. But right now I'm not hurting for space, so I keep it all there, online and accessible.

My current setup

  • One ESX virtual switch on the internet, which has two systems on it: a bridging OpenBSD firewall, and a Xangati system sniffing packets (still playing with Xangati). No IP addresses are used here.
  • One ESX virtual switch for one internal network; the bridging firewall has another interface here, my main two internet facing servers have interfaces here, and my firewall has another interface here as well for management. Only public IPs are used here.
  • One ESX virtual switch for another internal network, for things that will never have public IP addresses associated with them; I run NAT on the firewall (on its 3rd/4th interfaces) for these systems to get internet access.

I have a site to site OpenVPN connection between my OpenBSD firewall at home and my OpenBSD firewall on the ESX system, which gives me the ability to directly access the back end, non routable network on the other end.

Normally I wouldn't deploy an independent firewall, but I did in this case because, well, I can. I do like OpenBSD's pf more than iptables (which I hate), it gives me a chance to play around more with pf, and it gives me more freedom on the Linux end to fire up services on ports that I don't want exposed without having to worry about individually firewalling them off, so it allows me to be more lazy in the long run.
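
For the NAT piece on that third network, the pf side is only a couple of lines. A minimal sketch, not a whole ruleset (the interface names and the RFC1918 range are placeholders):

# load a minimal NAT rule for the non-routable network (OpenBSD 4.7+ nat-to syntax)
cat > /tmp/pf-nat-example.conf <<'EOF'
ext_if = "em0"                 # interface on the public-IP network (assumption)
nat_net = "192.168.10.0/24"    # the internal-only network (assumption)
match out on $ext_if from $nat_net to any nat-to ($ext_if)
pass all
EOF
pfctl -nf /tmp/pf-nat-example.conf   # -n: just parse/validate the rules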

I bought the server before I moved, once I got to the bay area I went and picked it up and kept it over a weekend to copy my main data set to it then took it back and they hooked it up again and I switched my systems over to it.

The server was about $2,900 with 1 year of support, and co-location is about $100/mo. So on disk space alone, the first year (taking into account the cost of the server) my cost is about $0.09 per GB per month (for 3.6TB), with subsequent years being around $0.033 per GB per month (I took a swag at the support cost for the 2nd year, so that is included). That doesn't even take into account the virtual machines themselves and the cost savings there over any cloud. And I'm giving the cloud the benefit of the doubt by not even looking at the cost of bandwidth, just the cost of capacity. If I were using the cloud I probably wouldn't allocate all 3.6TB up front, but even if you use 1.8TB, which is about what I'm using now with my VMs and stuff, the cost still handily beats everyone out there.
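
The math behind those per-GB numbers if you want to check me (3.6TB is roughly 3,686GB):

echo "scale=3; (2900 + 100*12) / 12 / 3686" | bc   # first year: ~$0.09/GB/month
echo "scale=3; (100*12) / 12 / 3686" | bc          # co-lo alone: ~$0.027/GB/month; add the support swag and you land near $0.033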

What's most crazy is that I lack the purchasing power of any of these clouds out there; I'm just a lone consumer that bought one server. Granted, I'm confident the vendor I bought from gave me excellent pricing due to my past relationship, though probably still not on the scale of the likes of Rackspace or Amazon, and yet I can handily beat their costs without even working for it.

What surprised me most during my trips doing cost analysis of the "cloud" is how cheap enterprise storage is. I mean, Terremark charges $0.25/GB per month (on SATA powered 3PAR arrays) and Rackspace charges $0.15/GB per month (I believe Rackspace just uses DAS). I kind of would have expected the enterprise storage route to cost say 3-5x more, not less than 2x. When I was doing real enterprise cloud pricing, storage for the solution I was looking for typically came in at 10-20% of the total cost, with 80%+ of the cost being CPU+memory. For me it's a no brainer – I'd rather pay a bit more and have my storage on a 3PAR of course (when dealing with VM-based storage, not bulk archival storage). With the average cost of my storage for 3.6TB over 2 years coming in at $0.06/GB it makes more sense to just do it myself.

I just hope my new server holds up. My last one lasted a long time, so I sort of expect this one to last a while too; it got burned in before I started using it and the load on the box is minimal. I would not be too surprised if I can get 5 years out of it – how big will HDDs be in 5 years?

I will miss Terremark because of the reliability and availability features they offer; they have a great service, and now of course are owned by Verizon. I don't need to worry about upgrading vSphere any time soon, as there's no reason to go to vSphere 5. The one thing I have been contemplating is whether or not to put my vSphere management interface behind the OpenBSD firewall (which is a VM, of course, on the same box). Kind of makes me miss the days of ESX 3, when it had a built in firewall.

I'm probably going to have to upgrade my cable internet at home. Right now I only have 1Mbps upload, which is fine for most things, but if I'm doing off site backups too I need more performance. I can go as high as 5Mbps with a more costly plan: 50 meg down / 5 meg up for about $125, but I might as well go all in and get 100 meg down / 5 meg up for $150; both plans have a 500GB cap with a $0.25/GB charge for going over. Seems reasonable. I certainly don't need that much downstream bandwidth (not even 50Mbps; I'd be fine with 10Mbps), but I really do need as much upstream as I can get. Another option could be driving a USB stick to the co-lo, which is about 35 miles away. I suppose that is a possibility, but it's still kind of a PITA given the distance, though if I got one of those 128GB+ flash drives it could be worth it. I've never tried hooking up USB storage to an ESX VM before, assuming it works? hmmmm..
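
For a sense of scale on the upload side, a full push of the 1.7TB (roughly 13.6 million megabits) works out to:

echo $(( 13600000 / 1 / 86400 )) days at 1Mbps    # ~157 days
echo $(( 13600000 / 5 / 86400 )) days at 5Mbps    # ~31 days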

Another option I have is AT&T U-verse, which I've read good and bad things about, but looking at their site their service is slower than what I can get through my local cable company (which truly is local; they only serve the city I am in). Another reason I didn't go with U-verse for TV is that, due to the technology they are using, I suspected it is not compatible with my TiVo (with cable cards). AT&T doesn't mention their upstream speeds specifically, so I'll contact them and try to figure that out.

I kept the motherboard/CPUs/RAM from my old server; my current plan is to mount it to a piece of wood and hang it on the wall as some sort of art. It has lots of colors and little things to look at, and I think it looks cool at least. I'm no handyman so hopefully I can make it work. I was honestly shocked how heavy the copper (I assume) heatsinks were; wow, they felt like 1.5 pounds apiece, massive.

While my old server is horribly obsolete, one thing it has over my new server is the ability to support more RAM. The old server could go up to 24GB (I had a max of 6GB in it at the time); the new server tops out at 8GB (and has 8GB in it). Not a big deal as I don't need 24GB for my personal stuff, but I thought it was kind of an interesting comparison.

This blog has been running on the new server for a couple of weeks now. One of these days I need to hook up some log analysis stuff to see how many dozen hits I get a month.

If Terremark could fix three areas of their vCloud Express service – resource pool-based billing, relaxing the costs of opening multiple ports in the firewall (or just offering 1:1 NAT as an option), and thin provisioning friendly billing for storage – it would really be a much more awesome service than it already is.
