TechOpsGuys.com Diggin' technology every day

September 9, 2010

Availability vs Reliability with ZFS

Filed under: Storage — Tags: — Nate @ 7:37 pm

ZFS doesn’t come up in my world all that often, but with the recent news of the settlement of the lawsuits I wanted to talk a bit about it.

It all started about two years ago: some folks at the company I was at were proposing using cheap servers and ZFS to address our ‘next generation’ storage needs. At the time we had a bunch of tier 2 storage behind some really high end NAS head units (not configured in any sort of fault tolerant manner).

Anyways, in doing some research I came across a fascinating email thread. The most interesting post was this one, and I’ll put it here because really I couldn’t have said it better myself –

I think there’s a misunderstanding concerning underlying concepts. I’ll try to explain my thoughts, please excuse me in case this becomes a bit lengthy. Oh, and I am not a Sun employee or ZFS fan, I’m just a customer who loves and hates ZFS at the same time

You know, ZFS is designed for high *reliability*. This means that ZFS tries to keep your data as safe as possible. This includes faulty hardware, missing hardware (like in your testing scenario) and, to a certain degree, even human mistakes.

But there are limits. For instance, ZFS does not make a backup unnecessary. If there’s a fire and your drives melt, then ZFS can’t do anything. Or if the hardware is lying about the drive geometry. ZFS is part of the operating environment and, as a consequence, relies on the hardware.

So ZFS can’t make unreliable hardware reliable. All it can do is try to protect the data you saved on it. But it cannot guarantee this to you if the hardware becomes its enemy.

A real world example: I have a 32 core Opteron server here, with 4 FibreChannel Controllers and 4 JBODs with a total of [64] FC drives connected to it, running a RAID 10 using ZFS mirrors. Sounds a lot like high end hardware compared to your NFS server, right? But … I have exactly the same symptom. If one drive fails, an entire JBOD with all 16 included drives hangs, and all zpool access freezes. The reason for this is the miserable JBOD hardware. There’s only one FC loop inside of it, the drives are connected serially to each other, and if one drive dies, the drives behind it go downhill, too. ZFS immediately starts caring about the data, the zpool command hangs (but I still have traffic on the other half of the ZFS mirror!), and it does the right thing by doing so: whatever happens, my data must not be damaged.

A “bad” filesystem like Linux ext2 or ext3 with LVM would just continue, whether or not the Volume Manager noticed the missing drive. That’s what you experienced. But you run the real danger of having to use fsck at some point. Or, in my case, fsck’ing 5 TB of data on 64 drives. That’s not much fun and results in a lot more downtime than replacing the faulty drive.

What can you expect from ZFS in your case? You can expect it to detect that a drive is missing and to make sure that your _data integrity_ isn’t compromised. By any means necessary. This may even require making a system completely unresponsive until a timeout has passed.

But what you described is not a case of reliability. You want something completely different. You expect it to deliver *availability*.

And availability is something ZFS doesn’t promise. It simply can’t deliver this. You have the impression that NTFS and various other Filesystems do so, but that’s an illusion. The next reboot followed by a fsck run will show you why. Availability requires full reliability of every included component of your server as a minimum, and you can’t expect ZFS or any other filesystem to deliver this with cheap IDE hardware.

Usually people want to save money when buying hardware, and ZFS is a good choice to deliver the *reliability* then. But the conceptual stalemate between reliability and availability of such cheap hardware still exists – the hardware is cheap, the file system and services may be reliable, but as soon as you want *availability*, it’s getting expensive again, because you have to buy every hardware component at least twice.

So, you have the choice:

a) If you want *availability*, stay with your old solution. But you have no guarantee that your data is always intact. You’ll always be able to stream your video, but you have no guarantee that the client will receive a stream without drop outs forever.

b) If you want *data integrity*, ZFS is your best friend. But you may have slight availability issues when it comes to hardware defects. You may reduce the percentage of pain during a disaster by spending more money, e.g. by making the SATA controllers redundant and creating a mirror (then controller 1 may hang, but controller 2 will continue working), but you must not forget that your PCI bridges, fans, power supplies, etc. remain single points of failure which can take the entire service down, just like your pulling of the non-hotpluggable drive did.

c) If you want both, you should buy a second server and create a NFS cluster.

Hope I could help you a bit,

Ralf
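Ralf’s point about having to buy every component twice can be made concrete with some back-of-the-envelope availability math: components in series (single points of failure) multiply their availabilities together, while a mirrored pair only fails when both copies fail. The component numbers below are made-up illustrations, not measurements of any real hardware –

```python
# Back-of-the-envelope availability math for Ralf's tradeoff.
# All component availabilities here are illustrative guesses.

def series(*avail):
    """Chain of single points of failure: everything must be up."""
    p = 1.0
    for a in avail:
        p *= a
    return p

def mirrored(a):
    """Two redundant copies of a component: down only if both fail."""
    return 1 - (1 - a) ** 2

# One cheap server: controller, PSU, fan, drive all in series
single = series(0.99, 0.999, 0.999, 0.99)

# The same chain with every component bought twice
redundant = series(mirrored(0.99), mirrored(0.999),
                   mirrored(0.999), mirrored(0.99))

print(f"single:    {single:.4f}")     # roughly 0.978
print(f"redundant: {redundant:.4f}")  # roughly 0.9998
```

The moral matches Ralf’s: the filesystem can be as careful as it likes, but the availability number is set by the weakest non-redundant component in the chain.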

The only thing somewhat lacking from the post is that it makes creating an NFS cluster sound like it’s not a very complex thing to do. Tightly coupling anything is pretty complicated, especially when it needs to be stateful (for lack of a better word); in this case the data must be in sync (then there’s the IP-level tolerance, optional MAC takeover, handling fail over of NFS clients to the backup system, failing back, performing online upgrades, etc). Just look at the guide Red Hat wrote for building a HA NFS cluster with GFS. Just look at the diagram on page 21 if you don’t want to read the whole thing! Hell, I’ll put the diagram here because we need more color. Note that Red Hat forgot network and fiber switch fault tolerance –

That. Is. A. Lot. Of. Moving. Parts. I was actually considering deploying this at a previous company (not the one that brought up the ZFS discussion), but budgets were slashed and I left shortly before the company (and economy) really nose dived.

Also take note in the above example, that only covers the NFS portion of the cluster, they do not talk about how the back end storage is protected. GFS is a shared file system, so the assumption is you are operating on a SAN of some sort. In my case I was planning to use our 3PAR E200 at the time.

Contrast this with providing fault tolerance for a network device (setting aside stateful firewalls in this example), where the TCP stack in general is a very forgiving system. Storage, on the other hand, makes so many assumptions about stuff “just working” that, as you know as well as I do, when storage breaks, usually everything above it breaks hard too, and in crazy complicated ways (I just love to see that “D” in the Linux process list after a storage event). Stateful firewall replication is fairly simple by contrast.
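Speaking of that “D” state: on Linux it’s easy enough to script the hunt for processes stuck in uninterruptible sleep after a storage event. A minimal sketch, parsing /proc/<pid>/stat; the one tricky part, a comm field that can itself contain spaces and parens, is handled by splitting on the last closing paren:

```python
# Sketch: find processes stuck in uninterruptible sleep ("D") by
# parsing /proc/<pid>/stat on Linux. proc_state() works on the raw
# stat line, so it can be exercised without a live system.
import os

def proc_state(stat_line: str) -> str:
    """Return the state field from a /proc/<pid>/stat line.

    The comm field (2nd field) is parenthesized and may contain
    spaces or parens, so split on the *last* ')' before reading
    the whitespace-separated fields that follow it.
    """
    after_comm = stat_line.rsplit(")", 1)[1]
    return after_comm.split()[0]

def stuck_pids():
    """Yield the pid of every D-state process (Linux only)."""
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/stat") as f:
                if proc_state(f.read()) == "D":
                    yield int(entry)
        except OSError:
            continue  # process exited while we were scanning

# Synthetic example line, not from a real system:
print(proc_state("1234 (nfsd) D 2 0 0"))  # D
```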

Also I suspect that all of the fancy data integrity protection bits are all for naught when running ZFS with things like RAID controllers or higher end storage arrays because of the added abstraction layer(s) that ZFS has no control over, which is probably why so many folks prefer to run RAID in ZFS itself and use “raw” disks.
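To illustrate what those integrity bits buy you when ZFS does own the disks: the checksum for each block lives with the block pointer rather than next to the data, so a read that fails verification can be transparently served, and repaired, from the other half of a mirror. A toy sketch of the concept (nothing here is real ZFS code, just the idea):

```python
# Toy sketch of ZFS-style self-healing reads on a two-way mirror:
# each block's checksum is stored out-of-band (as in a block pointer),
# so silent corruption on one copy is detected and repaired from the
# other. A RAID controller underneath ZFS hides which copy is which,
# which is one reason people prefer handing ZFS raw disks.
import hashlib

class Mirror:
    def __init__(self):
        self.copies = [{}, {}]   # two "disks": block_id -> bytes
        self.checksums = {}      # stored separately from the data

    def write(self, block_id, data: bytes):
        self.checksums[block_id] = hashlib.sha256(data).hexdigest()
        for disk in self.copies:
            disk[block_id] = data

    def read(self, block_id) -> bytes:
        expected = self.checksums[block_id]
        for disk in self.copies:
            data = disk[block_id]
            if hashlib.sha256(data).hexdigest() == expected:
                # heal any corrupted copies from the good one
                for other in self.copies:
                    other[block_id] = data
                return data
        raise OSError(f"block {block_id}: all copies failed checksum")

m = Mirror()
m.write("blk0", b"important data")
m.copies[0]["blk0"] = b"bit-rotted junk"    # silent corruption on disk 0
assert m.read("blk0") == b"important data"  # detected, served from disk 1
assert m.copies[0]["blk0"] == b"important data"  # and repaired in place
```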

I think ZFS has some great concepts in it. I’ve never used it because its usability on Linux has been very limited (and I haven’t had a need for ZFS that was big enough to justify deploying a Solaris system), but I certainly give mad props to the evil geniuses who created it.

ZFS Free and clear.. or is it?

Filed under: News,Random Thought,Storage — Tags: , — Nate @ 7:03 pm

So, Sun and Oracle kissed and made up recently over the lawsuits they had against each other, from our best friends at The Register

Whatever the reasons for the mutual agreement to dismiss the lawsuits, ZFS technology product users and end-users can feel relieved that a distracting lawsuit has been cleared away.

Since the terms of the settlement or whatever you want to call it have not been disclosed, and there has been no apparent further comment from either side, I certainly wouldn’t jump to the conclusion that other ZFS users are in the clear. I view it as: if you’re running ZFS on Solaris you’re fine, and if you’re using OpenSolaris you’re probably fine too. But if you’re using it on BSD, or even Linux (or whatever other platforms folks have tried to port ZFS to over the years), anything that isn’t directly controlled by Oracle, I wouldn’t be wiping the sweat from my brow just yet.

As is typical with such cases the settlement (at least from what I can see) is specifically between the two companies, there have been no statements or promises from either side from a broader technology standpoint.

I don’t know what OS folks like Coraid and Compellent use on their ZFS devices, but I recall when investigating NAS options for home use I was checking out Thecus, a model like the N770+, and among the features was a ZFS option. The default file system was ext3, and it supported XFS as well. While I am not certain, I was pretty convinced the system was running Linux in order to be supporting XFS and ext3, and not running OpenSolaris. I ended up not going with Thecus because as far as I could tell they were using software RAID. Instead I bought a new workstation (my previous computer was many years old), and put in a 3Ware 9650SE RAID controller (with a battery backup unit and 256MB of write back cache) along with four 2TB disks (RAID 1+0).

Now as an end user I can see not really being concerned; it is unlikely NetApp or Oracle will go after end users using ZFS on Linux or BSD or whatever. But if you’re building a product based on it (with the intention of selling/licensing it), and you aren’t using an ‘official’ version, I would stay on my toes. If your product doesn’t compete against any of NetApp’s product lines then you may skirt by without attracting attention. And as long as you’re not too successful, Oracle probably won’t come kicking down your door.

Unless of course further details are released and the air is cleared more about ZFS as a technology in general.

Interestingly enough, I was reading a discussion, on Slashdot I think, around the time Oracle bought Sun, when folks became worried about the future of ZFS in the open source world. Some were suggesting, as far as Linux was concerned, btrfs, which is the Linux community’s response to ZFS. Something I didn’t know at the time was that apparently btrfs is also heavily supported by Oracle (or at least it was, I don’t track progress on that project).

Yes, I know btrfs is GPL, but as I’m sure you know, a file system is a complicated beast to get right. And if Oracle’s involvement in the project is significant and they choose, for whatever reason, to drop support and move resources to ZFS, well, that could leave a pretty big gap that will be hard to fill. Just because the code is there doesn’t mean it’s going to magically code itself. I’m sure others contribute; I don’t know what the ratio of support is from Oracle vs outsiders. I recall reading at one point that for OpenOffice something like 75-85% of the development was done directly by Sun engineers. Just something to keep in mind.

I miss reiserfs. I really did like reiserfs v3 way back when. And v4 certainly looked promising (never tried it).

Reminds me of the classic argument that so many make for using open source stuff (not that I don’t like open source, I use it all the time): that if there is a bug in the program you can go in and fix it yourself. My own experience at many companies is the opposite; they encounter a bug and they go through the usual community channels to try to get a fix. I would say it’s a safe assumption that in excess of 98% of users of open source code have no ability to comprehend or fix the source they are working with. And that comes from my own experience of working for really nothing but software companies over the past 10 years. And before anyone asks, I believe it’s equally improbable that a company would hire a contractor to fix a bug in an open source product. I’m sure it does happen, but it’s pretty rare given the number of users out there.

September 7, 2010

vSphere VAAI only in the Enterprise

Filed under: Storage,Virtualization — Tags: , , , , — Nate @ 7:04 pm

Beam me up!

Damn those folks at VMware..

Anyways, I was browsing around this afternoon looking at things, and while I suppose I shouldn’t have been, I was surprised to see that the new VAAI storage APIs are only available to people running Enterprise or Enterprise Plus licensing.

I think at least the block level hardware based locking for VMFS should be available to all versions of vSphere, after all VMware is offloading the work to a 3rd party product!

VAAI certainly looks like it offers some really useful capabilities; from the documentation on the 3PAR VAAI plugin (which is free), here are the highlights:

  • Hardware Assisted Locking is a new VMware vSphere storage feature designed to significantly reduce impediments to VM reliability and performance by locking storage at the block level instead of the logical unit number (LUN) level, which dramatically reduces SCSI reservation contentions. This new capability enables greater VM scalability without compromising performance or reliability. In addition, with the 3PAR Gen3 ASIC, metadata comparisons are executed in silicon, further improving performance in the largest, most demanding VMware vSphere and desktop virtualization environments.
  • The 3PAR Plug-In for VAAI works with the new VMware vSphere Block Zero feature to offload large, block-level write operations of zeros from virtual servers to the InServ array, boosting efficiency during several common VMware vSphere operations— including provisioning VMs from Templates and allocating new file blocks for thin provisioned virtual disks. Adding further efficiency benefits, the 3PAR Gen3 ASIC with built-in zero-detection capability prevents the bulk zero writes from ever being written to disk, so no actual space is allocated. As a result, with the 3PAR Plug-In for VAAI and the 3PAR Gen3 ASIC, these repetitive write operations now have “zero cost” to valuable server, storage, and network resources—enabling organizations to increase both VM density and performance.
  • The 3PAR Plug-In for VAAI adds support for the new VMware vSphere Full Copy feature to dramatically improve the agility of enterprise and cloud datacenters by enabling rapid VM deployment, expedited cloning, and faster Storage vMotion operations. These administrative tasks are now performed in half the time. The 3PAR plug-in not only leverages the built-in performance and efficiency advantages of the InServ platform, but also frees up critical physical server and network resources. With the use of 3PAR Thin Persistence and the 3PAR Gen3 ASIC to remove duplicated zeroed data, data copies become more efficient as well.
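The zero-detection trick in the second bullet is conceptually simple: a thin provisioned volume just declines to allocate backing space for blocks that are all zeros. 3PAR does this in hardware in the Gen3 ASIC; the sketch below only shows the logic, and resembles nothing of the real implementation:

```python
# Conceptual sketch of the zero-detect idea behind Block Zero offload:
# a thin-provisioned volume only allocates backing space for blocks
# that contain something other than zeros, so the bulk zero writes a
# hypervisor issues (e.g. when eagerly zeroing a virtual disk) consume
# no capacity. Sizes and structures here are purely illustrative.

BLOCK = 4096
ZERO_BLOCK = bytes(BLOCK)

class ThinVolume:
    def __init__(self):
        self.allocated = {}  # offset -> data; unallocated reads as zeros

    def write(self, offset: int, data: bytes):
        if data == ZERO_BLOCK:
            # zero write detected: free any existing allocation,
            # store nothing at all
            self.allocated.pop(offset, None)
        else:
            self.allocated[offset] = data

    def read(self, offset: int) -> bytes:
        return self.allocated.get(offset, ZERO_BLOCK)

vol = ThinVolume()
vol.write(0, b"x" * BLOCK)
vol.write(BLOCK, ZERO_BLOCK)       # the "zero cost" write
assert len(vol.allocated) == 1     # no space allocated for the zeros
assert vol.read(BLOCK) == ZERO_BLOCK
```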

Cool stuff. I’ll tell you what. I really never had all that much interest in storage until I started using 3PAR about 3 and a half years ago. I mean I’ve spread my skills pretty broadly over the past decade, and I only have so much time to do stuff.

About five years ago some co-workers tried to get me excited about NetApp, though for some reason I never could get too excited about their stuff. Sure, it has tons of features, which is nice, but the core architectural limitations of the platform (from a spinning rust perspective at least) are I guess what kept me away from them for the most part. If you really like NetApp, put a V-series in front of a 3PAR and watch it scream. I know of a few 3PAR/NetApp users who outright refuse to entertain the option of running NetApp storage on the back end: they like the NAS, so they keep the V-series, but the NetApp back end doesn’t perform.

On the topic of VMFS locking – I keep seeing folks pimping the NFS route attack VMFS locking as if there were no locking in NFS with vSphere. I’m sure prior to block level locking, the NFS file level locking (assuming it is file level) was more efficient than LUN level. Though to be honest I’ve never encountered issues with SCSI reservations in the past few years I’ve been using VMFS. Probably because of how I use it: I don’t do a lot of activities that trigger reservations short of writing data.
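The granularity difference is easy to demonstrate: with a LUN-level reservation there is effectively one lock for the whole datastore, so any host’s metadata update can stall every other host, while with block-level (ATS-style) locking two hosts only collide if they touch the same block. A crude simulation with made-up numbers, nothing like real VMFS internals:

```python
# Crude illustration of lock granularity: count how often a host
# finds its lock held by a different host, under one big LUN-level
# lock versus per-block locks. Hosts, blocks, and op counts are
# arbitrary made-up parameters.
import random

HOSTS, BLOCKS, OPS = 16, 10_000, 1000

def conflicts(n_locks):
    """Count ops that land on a lock last held by a different host."""
    random.seed(1)  # deterministic for repeatability
    held = {}
    contended = 0
    for _ in range(OPS):
        host = random.randrange(HOSTS)
        block = random.randrange(BLOCKS)
        lock = block // (BLOCKS // n_locks)  # map block to its lock
        if held.get(lock, host) != host:
            contended += 1
        held[lock] = host
    return contended

print("LUN-level, 1 lock:       ", conflicts(1))
print("block-level, 10k locks:  ", conflicts(BLOCKS))
```

With one lock nearly every operation collides with some other host; with per-block locks collisions only happen when two hosts happen to hit the same block.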

Another graphic which I thought was kind of funny is the current Gartner group “magic quadrant”; someone posted a link to it for VMware in a somewhat recent post. Myself, I don’t rely on Gartner, but I did find the lopsidedness of the situation for VMware quite amusing –

I’ve been using VMware since before 1.0; I still have my VMware 1.0.2 CD for Linux. I deployed VMware GSX to production for an e-commerce site in 2004, so I’ve been using it for a while, though I didn’t start using ESX until 3.0 came out (from what I’ve read about the capabilities of previous versions I’m kinda glad I skipped them 🙂 ). It’s got to be the most solid piece of software I’ve ever used, besides Oracle I suppose. I mean I really, honestly can not remember it ever crashing. I’m sure it has, but it’s been so rare that I have no memory of it. It’s not flawless by any means, but it’s solid. And VMware has done a lot to build up my loyalty to them over the past, what is it now, eleven years? Like most everyone else at the time, back then I had no idea that we’d be doing the stuff with virtualization that we are today.

I’ve kept my eyes on other hypervisors as they come around, though even now none of the rest look very compelling. About two and a half years ago my new boss at the time was wanting to cut costs, and was trying to pressure me into trying the “free” Xen that came with CentOS at the time. He figured a hypervisor is a hypervisor. Well, it’s not. I refused. Eventually I left the company, and my two esteemed colleagues were forced into trying it after I left (hey Dave and Tycen!). They worked on it for a month before giving up and going back to VMware. What a waste of time..

I remember Tycen at about the same time being pretty excited about Hyper-V. Well, at a position he recently held he got to see Hyper-V in all its glory, and he was happy to get out of that position and not have to use Hyper-V anymore.

Though I do think KVM has a chance. I think it’s too early to use it for anything too serious at this point, though I’m sure that’s not stopping tons of people from doing it anyways, just like it didn’t stop me from running production on GSX way back when. But I suspect by the time vSphere 5.0 comes out, which I’m just guessing here will be in the 2012 time frame, KVM as a hypervisor will be solid enough to use in a serious capacity. VMware will of course have a massive edge on management tools and fancy add ons, but not everyone needs all that stuff (me included). I’m perfectly happy with just vSphere and vCenter (I’d be even happier if there was a Linux version of course).

I can’t help but laugh at the grand claims Red Hat is making for KVM scalability though. Sorry, I just don’t buy that the Linux kernel itself can reach such heights and be solid & scalable, let alone a hypervisor running on top of Linux (and before anyone asks, NO, ESX does NOT run on Linux).

I love Linux, I use it every day on my servers and my desktops and laptops, have been for more than a decade. Despite all the defectors to the Mac platform I still use Linux 🙂 (I actually honestly tried a MacBook Pro for a couple weeks recently and just couldn’t get it to a usable state).

Just because the system boots with X number of CPUs and X amount of memory doesn’t mean it’s going to be able to scale effectively to use them. I’m sure Linux will get there some day, but I believe it is a ways off.

September 2, 2010

Dell concedes to HP

Filed under: News,Storage — Tags: — Nate @ 8:04 am

It’s over. Dell has said it will not raise its offer any more.

Dell Inc. says it will not match Palo Alto-based Hewlett-Packard’s offer to pay $33 per share for 3Par Inc., or about $2.07 billion.

Probably will write more later 🙂 Been a busy morning.

Dell’s last stand

Filed under: News,Storage — Tags: — Nate @ 6:29 am

So apparently the news is official, 3PAR has determined the new $33/share bid is superior. Dell seems to be conceding defeat at this point. Apparently as part of Dell’s recent $32/share increased bid they also negotiated a long term reseller agreement that would somehow continue even if HP ends up buying 3PAR.

From 3PAR

HP’s revised proposal of $33 per share values 3PAR at approximately $2.4 billion

Although 3PAR previously notified Dell of its intention to terminate its merger agreement with Dell, the merger agreement was not terminated and remains in full force and effect. Following 3PAR’s notice of intent to terminate the merger agreement, and prior to receiving HP’s revised acquisition proposal, 3PAR received a revised acquisition proposal from Dell in which Dell increased its offer price from $27 per share to $32 per share. Dell’s revised acquisition proposal also included an increased termination fee of $92 million payable by 3PAR to Dell as a condition to accepting a “superior proposal,” and a multi-year reseller agreement with Dell, which would by its terms be assumed by an acquirer of, or successor in interest to, 3PAR in the event of a change in control of 3PAR (including the acquisition of 3PAR by HP or another third party), and which contained fixed pricing and other terms that the 3PAR board of directors determined to be unacceptable.

So it sounds like, given the length of time it took Dell to get this new deal done and how decisive HP has been, Dell likely won’t come back again, and will instead rely on the reseller agreement to get 3PAR technology on the side. Interesting strategy.

I wonder if HP will try to terminate that, even if it means going to court just to block Dell from capitalizing on their pending investment. I would put money down that they will.

If they don’t, I wonder how it will make Dell’s customers feel buying HP product from Dell, with all of the sparkling HP logos plastered all over it.

I also believe Dell is putting the final nails in the coffin of their partnership with EMC with this move. EMC has a lot to lose if both HP and Dell are pitching 3PAR technology to their respective customers.

Just goes to show the value that 3PAR brings to the table.

(edited to strike out references to the reseller agreement since I obviously read too quickly before posting, just shows how excited I am I guess!! (not uncommon!) )

You will respect my authoritah!

Filed under: Storage — Tags: — Nate @ 6:18 am

The Register has an interesting angle on the bidding war for 3PAR from the HP side –

These technology advances should make enhanced sales of 3PAR systems more justifiable, enabling HP to recoup its $2bn investment by increasing InServ sales against EMC, HDS and IBM competition. Donatelli will be able to dangle his 3PAR prize in front of HP’s board and assert his credentials to be the next HP CEO, having demonstrated, he might say, authority, decisiveness, strategic thinking, determination and effectiveness, without over-paying for the 3PAR asset.

HP now offering $33/share for 3PAR

Filed under: News,Storage — Tags: — Nate @ 6:04 am

Not many details yet, just a notice that HP has upped its bid to $33/share for 3PAR a few minutes ago. The front page of the Wall Street Journal has about all I’ve heard from CNBC

Hewlett-Packard has raised its bid for 3PAR to $33 a share; Dell also offered a higher price and negotiated a higher breakup fee

What it seems like is at the last minute Dell finally came through with something around $30/share, sounds like they really struggled to get that one through. HP of course being decisive came back immediately with $33/share.

Here is another article that says the reason why the bidding is so intensive is 3PAR is the only game in town, there is no room for second best –

Looking at the landscape, 3Par is the only real alternative to EMC and Hitachi in terms of high end storage.  EMC has its own ambitions for data center dominance, while HDS is part of a much larger conglomerate.  If you believe you need to own storage and server, both to fulfill the vision above and to avoid partnering with a competitor, then 3Par is the only place to get this type of deep high end storage technology.  Given HP and Dell have a much larger sales channel than 3Par, these guys can immediately double, triple or quadruple sales from 3Par products overnight once it is part of their catalogue.  Both reasons afford the premium we are seeing.

August 30, 2010

Dell getting cold feet

Filed under: News,Storage — Tags: — Nate @ 7:59 am

3PAR announced today:

3PAR® (NYSE: PAR), the leading global provider of utility storage, today announced its board of directors has determined that the unsolicited proposal by Hewlett-Packard Company to acquire all of 3PAR’s outstanding common stock at $30 per share constitutes a “superior proposal” (as that term is defined in 3PAR’s previously announced merger agreement with Dell). The 3PAR board of directors notified Dell of its intention to terminate the merger agreement with Dell, immediately following the expiration of the three business day period contemplated by, and the satisfaction of the other conditions set forth in, the merger agreement with Dell, in order to enter into the merger agreement with HP on the terms set forth in HP’s acquisition proposal.

CNBC looked at a couple of past storage deals to compare the valuations of them vs the current deal:

  • HP’s latest bid is 8.5 times 3PAR’s current projected revenue, 10 times last year’s revenue
  • Dell paid 10 times revenue for Equallogic back in 2007; valuation now looks smart
  • EMC paid 8 times revenue for Data Domain last year  (too early to tell how it’s working out according to CNBC)
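Working the multiples backwards gives a rough idea of the revenue figures involved: deal value divided by the multiple is the implied revenue. The ~$2.4 billion deal value is from the coverage; the implied revenues below are just my arithmetic, not reported numbers:

```python
# Quick arithmetic on the multiples CNBC quoted: deal value divided
# by a revenue multiple gives the implied revenue figure. Deal value
# is from the coverage; implied revenues are derived, not reported.

deal_value = 2.4e9  # HP's $33/share bid, roughly $2.4 billion

for label, multiple in [("projected revenue (8.5x)", 8.5),
                        ("last year's revenue (10x)", 10.0)]:
    implied = deal_value / multiple
    print(f"implied {label}: ${implied / 1e6:.0f}M")
# implied projected revenue (8.5x): $282M
# implied last year's revenue (10x): $240M
```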

Tick, tock, Dell. Throw in the towel and go after Compellent or Pillar or maybe even Xiotech.

Looks like 3PAR announced a pretty big deal which has 3cV in it, expecting a lot more in the future!

With this new partnership, Nissho adds a disaster recovery (DR) solution to its enhanced service offerings, which currently include public cloud development and a private cloud environment service based on 3cV. “3cV” is a proven blueprint for the virtual datacenter featuring the combination of 3PAR Utility Storage, HP® BladeSystem c-Class Server Blades, and VMware vSphere™. This solution is designed to enable improved server efficiency and to enhance service levels in private cloud datacenters. All the cloud service-focused products that Nissho offers, including those based on 3cV, are available at the company’s CloudNagivate Center, Nissho’s private technology verification center where customers can verify the operation and performance of a cloud-based infrastructure built on 3PAR technology.

Dell simply doesn’t have an answer to HP’s c Class blades.

August 27, 2010

CNBC Videos on 3PAR

Filed under: News,Storage — Tags: — Nate @ 12:32 pm

I’ve watched CNBC for a long time, I find it pretty entertaining, even though I don’t invest.

So often these mergers involve industries and companies I have no interest in, and I can’t really gauge whether the analysts know what they are talking about.

This one is different of course as a user of 3PAR products for the past 3 years or so I know their stuff inside and out. And I’m constantly looking out for other interesting technologies.

Here’s several videos

HP Now offering $2 billion

Filed under: News,Storage — Tags: — Nate @ 8:46 am

Dell apparently is being a little bitch again and matched HP’s $27 offer for 3PAR, so HP came right back and offered $30 a share, or $2 billion, up from the $1.1 billion original offer from Dell ($18/share).

PALO ALTO, Calif., Aug 27, 2010 (BUSINESS WIRE) — HP (HPQ) today announced that it has increased its proposal to acquire all of the outstanding shares of 3PAR Inc. (PAR) to $30 per share in cash, or an enterprise value of $2.0 billion. The proposal represents an 11 percent premium above the most recent price offered by Dell Inc. of $27 per share. HP’s proposal is not subject to any financing contingency and has been approved by HP’s board of directors. Once approved by 3PAR’s board, HP expects the transaction to close by the end of the calendar year.

Cut your losses and run Dell. Go buy Compellent.

