TechOpsGuys.com Diggin' technology every day

7 Sep 2010

vSphere VAAI only in the Enterprise

TechOps Guy: Nate

Damn those folks at VMware..

Anyway, I was browsing around this afternoon and, while I suppose I shouldn't have been, I was surprised to see that the new VAAI storage APIs are only available to people running Enterprise or Enterprise Plus licensing.

I think at least the block-level, hardware-based locking for VMFS should be available in all editions of vSphere. After all, VMware is offloading the work to a third-party product!

VAAI certainly looks like it offers some really useful capabilities. From the documentation on the 3PAR VAAI plugin (which is free), here are the highlights:

  • Hardware Assisted Locking is a new VMware vSphere storage feature designed to significantly reduce impediments to VM reliability and performance by locking storage at the block level instead of the logical unit number (LUN) level, which dramatically reduces SCSI reservation contentions. This new capability enables greater VM scalability without compromising performance or reliability. In addition, with the 3PAR Gen3 ASIC, metadata comparisons are executed in silicon, further improving performance in the largest, most demanding VMware vSphere and desktop virtualization environments.
  • The 3PAR Plug-In for VAAI works with the new VMware vSphere Block Zero feature to offload large, block-level write operations of zeros from virtual servers to the InServ array, boosting efficiency during several common VMware vSphere operations— including provisioning VMs from Templates and allocating new file blocks for thin provisioned virtual disks. Adding further efficiency benefits, the 3PAR Gen3 ASIC with built-in zero-detection capability prevents the bulk zero writes from ever being written to disk, so no actual space is allocated. As a result, with the 3PAR Plug-In for VAAI and the 3PAR Gen3 ASIC, these repetitive write operations now have “zero cost” to valuable server, storage, and network resources—enabling organizations to increase both VM density and performance.
  • The 3PAR Plug-In for VAAI adds support for the new VMware vSphere Full Copy feature to dramatically improve the agility of enterprise and cloud datacenters by enabling rapid VM deployment, expedited cloning, and faster Storage vMotion operations. These administrative tasks are now performed in half the time. The 3PAR plug-in not only leverages the built-in performance and efficiency advantages of the InServ platform, but also frees up critical physical server and network resources. With the use of 3PAR Thin Persistence and the 3PAR Gen3 ASIC to remove duplicated zeroed data, data copies become more efficient as well.
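To make the zero-detection idea from the Block Zero bullet a bit more concrete, here is a toy sketch of a thin volume that never allocates space for all-zero writes. It is only an illustration of the concept, not how the 3PAR ASIC or the VAAI plug-in is implemented; the `ThinVolume` class and the 16KB block size are assumptions made up for the example.

```python
BLOCK_SIZE = 16 * 1024  # assumed allocation granularity, for illustration only


class ThinVolume:
    """Toy thin-provisioned volume that stores nothing for all-zero blocks."""

    def __init__(self):
        # block number -> data actually stored; missing blocks read as zeros
        self.blocks = {}

    def write(self, block_no, data):
        if len(data) != BLOCK_SIZE:
            raise ValueError("writes must be exactly one block")
        if not any(data):
            # All zeros: drop any existing allocation instead of storing it.
            self.blocks.pop(block_no, None)
        else:
            self.blocks[block_no] = bytes(data)

    def read(self, block_no):
        # Unallocated blocks read back as zeros, so callers see no difference.
        return self.blocks.get(block_no, bytes(BLOCK_SIZE))

    def allocated_bytes(self):
        return len(self.blocks) * BLOCK_SIZE


if __name__ == "__main__":
    vol = ThinVolume()
    for n in range(1000):               # "zero out" a new virtual disk
        vol.write(n, bytes(BLOCK_SIZE))
    vol.write(5, b"\x01" * BLOCK_SIZE)  # one block of real data
    print(vol.allocated_bytes())        # only the non-zero block takes space
```

The host still believes it zeroed 1000 blocks, but only the one block holding real data consumes space in the toy volume.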

Cool stuff. I'll tell you what. I really never had all that much interest in storage until I started using 3PAR about 3 and a half years ago. I mean I've spread my skills pretty broadly over the past decade, and I only have so much time to do stuff.

About five years ago some co-workers tried to get me excited about NetApp, but for some reason I never could get too excited about their stuff. Sure, it has tons of features, which is nice, but the core architectural limitations of the platform (from a spinning rust perspective at least) are, I guess, what kept me away from them for the most part. If you really like NetApp, put a V-series in front of a 3PAR and watch it scream. I know of a few 3PAR/NetApp users that are outright refusing to entertain the option of running NetApp storage; they like the NAS and keep the V-series, but the back end doesn't perform.

On the topic of VMFS locking: I keep seeing folks pimping the NFS route attack VMFS locking as if there were no locking in NFS with vSphere. I'm sure that prior to block-level locking, the NFS file-level locking (assuming it is file level) was more efficient than LUN-level locking. Though to be honest, I've never encountered issues with SCSI reservations in the past few years I've been using VMFS, probably because of how I use it. I don't do a lot of activities that trigger reservations, short of writing data.
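The difference in lock granularity can be sketched with a toy model. This is only a rough illustration of why a finer-grained lock reduces contention; it does not model real SCSI reservations or the actual VAAI locking primitive, and the function and host names are hypothetical.

```python
def updates_that_proceed(requests, granularity):
    """Return the metadata updates that succeed on the first attempt.

    requests: list of (host, block) updates arriving at the same moment.
    granularity: "lun" locks the whole volume, "block" locks one block.
    """
    locked = set()
    winners = []
    for host, block in requests:
        # With LUN-level locking every request competes for the same key.
        key = "lun" if granularity == "lun" else block
        if key not in locked:
            locked.add(key)
            winners.append((host, block))
    return winners


if __name__ == "__main__":
    requests = [("esx1", 10), ("esx2", 20), ("esx3", 10)]
    print(updates_that_proceed(requests, "lun"))    # one winner, the rest retry
    print(updates_that_proceed(requests, "block"))  # only same-block requests collide
```

In this model three hosts touching two different blocks means two retries under LUN-level locking but only one under block-level locking, and none at all if the hosts never touch the same block.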

Another graphic I thought was kind of funny is the current Gartner "magic quadrant"; someone posted a link to the one for VMware in a somewhat recent post. I don't rely on Gartner myself, but I did find the lopsidedness of the situation for VMware quite amusing.

I've been using VMware since before 1.0, and I still have my VMware 1.0.2 CD for Linux. I deployed VMware GSX to production for an e-commerce site in 2004, so I've been using it for a while. I didn't start using ESX until 3.0 came out (from what I've read about the capabilities of previous versions I'm kinda glad I skipped them :) ). It's got to be the most solid piece of software I've ever used, besides Oracle I suppose. I mean I really, honestly cannot remember it ever crashing. I'm sure it has, but it's been so rare that I have no memory of it. It's not flawless by any means, but it's solid. And VMware has done a lot to build up my loyalty to them over the past, what is it now, eleven years? Like most everyone else back then, I had no idea that we'd be doing the stuff with virtualization that we are today.

I've kept my eyes on other hypervisors as they come around, though even now none of the rest look very compelling. About two and a half years ago my new boss at the time wanted to cut costs, and was trying to pressure me into trying the "free" Xen that came with CentOS. He figured a hypervisor is a hypervisor. Well, it's not. I refused. Eventually I left the company and my two esteemed colleagues were forced into trying it after I left (hey Dave and Tycen!). They worked on it for a month before giving up and going back to VMware. What a waste of time..

I remember Tycen at about the same time being pretty excited about Hyper-V. Well, at a position he recently held he got to see Hyper-V in all its glory, and, well, he was happy to get out of that position and not have to use Hyper-V anymore.

I do think KVM has a chance, though. I think it's too early to use it for anything too serious at this point, though I'm sure that's not stopping tons of people from doing it anyway, just like it didn't stop me from running production on GSX way back when. But I suspect by the time vSphere 5.0 comes out, which I'm just guessing here will be in the 2012 time frame, KVM as a hypervisor will be solid enough to use in a serious capacity. VMware will of course have a massive edge on management tools and fancy add-ons, but not everyone needs all that stuff (me included). I'm perfectly happy with just vSphere and vCenter (I'd be even happier if there was a Linux version of course).

I can't help but laugh at the grand claims Red Hat is making for KVM scalability though. Sorry, I just don't buy that the Linux kernel itself can reach such heights and be solid & scalable, let alone a hypervisor running on top of Linux (and before anyone asks, NO, ESX does NOT run on Linux).

I love Linux, I use it every day on my servers and my desktops and laptops, have been for more than a decade. Despite all the defectors to the Mac platform I still use Linux :) (I actually honestly tried a MacBook Pro for a couple weeks recently and just couldn't get it to a usable state).

Just because the system boots with X number of CPUs and X amount of memory doesn't mean it's going to be able to effectively scale to use it right. I'm sure Linux will get there some day, but I believe it is a ways off.

6 Nov 2009

Thin Provisioning strategy with VMware

TechOps Guy: Nate

Since the announcement of thin provisioning built into vSphere I have seen quite a few blog posts on how to take advantage of it but haven't seen anything that matches my strategy which has served me well utilizing array-based thin provisioning technology. I think it's pretty foolproof..

The main caveat is that I assume you have a decent amount of storage available on your system, that is, your VMFS volumes aren't the only thing residing on your storage. On my current storage array, written VMFS data accounts for maybe 2-3% of my storage. On the storage array I had at my last company it was probably 10-15%. I don't believe in dedicated storage arrays myself. I prefer nice shared storage systems that can sustain random and sequential I/O from any number of hosts and distribute that I/O across all of the resources for maximum efficiency. So my current array has most of its space set aside for an NFS cluster, and then there are a couple dozen terabytes set aside for SQL servers and VMware. The main key is being able to share the same spindles across dozens or even hundreds of LUNs.

There has been a lot of debate over the recent years about how best to size your VMFS volumes. The most recent data I have seen suggests somewhere between 250GB and 500GB. There seems to be unanimous opinion out there not to do something crazy and use 2TB volumes. The exact size depends on your setup. How many VMs, how many hosts, how often you use snapshots, how often you do vMotion, as well as the amount of I/O that goes on. The less of all of those the larger the volume can potentially be.

My method is simple. I chose 1TB as my volume size, thin provisioned of course. I utilize the default lazy zero VMFS mode and do not explicitly turn on thin provisioning on any VMDK files; there's no real point if you already have it in the array. So I create 1TB volumes, and I begin creating VMs on them. I try to stop when I get to around 500GB of allocated (but not written) space. That is, VMware thinks it is using 500GB, but it may only be using 30GB. This way I know the system will never use more than 500GB. Pretty simple. Of course I have enough space in reserve that if something crazy were to happen the volume could grow to 500GB and not cause any problems. Even with my current storage array operating in the neighborhood of 89% of total capacity, that still leaves me with several terabytes of space I can use in an emergency.
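The placement rule above boils down to two checks, sketched here in Python. The 1TB and 500GB figures come from the post; the function names and the sample numbers in the usage section are made up for illustration, and a real check would pull allocated and written sizes from vCenter and the array.

```python
VOLUME_SIZE_GB = 1024     # 1TB thin provisioned VMFS volume (from the post)
ALLOCATION_CAP_GB = 500   # stop adding VMs around this much allocated space


def can_place(existing_vmdk_gb, new_vmdk_gb, cap_gb=ALLOCATION_CAP_GB):
    """True if adding the new VMDK keeps allocated space at or under the cap."""
    return sum(existing_vmdk_gb) + new_vmdk_gb <= cap_gb


def worst_case_fits(allocated_gb, written_gb, array_free_gb):
    """True if the array could absorb every volume growing to its allocation.

    allocated_gb / written_gb: per-volume figures, in the same order.
    """
    growth = sum(a - w for a, w in zip(allocated_gb, written_gb))
    return growth <= array_free_gb


if __name__ == "__main__":
    vmdks = [8] * 50                       # fifty hypothetical 8GB OS disks
    print(can_place(vmdks, 40))            # 440GB allocated, under the cap
    print(can_place(vmdks + [80], 40))     # 520GB allocated, over the cap
    print(worst_case_fits([500], [30], array_free_gb=3000))
```

The second function is the "enough space in reserve" test: the gap between allocated and written space across all volumes has to fit in what the array has free.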

If I so desire I can go beyond the 500GB at any time without an issue. If I choose not to, then I haven't wasted any space because nothing is written to those blocks. My thin provisioning system is licensed based on written data, so if I have 10TB of thin provisioning on my system I can, if I want, create 100TB of thin provisioned volumes, provided I don't write more than 10TB to them. So you see there really is no loss in making a larger volume when the data is thin provisioned on the array. Why not make it 2TB or even bigger? Well, really I can't see a time when I would EVER want a 2TB VMFS volume, which is why I picked 1TB.
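The licensing arithmetic can be written out as a small helper. This is a sketch using the 10TB and 100TB figures from the paragraph above; the function name and the 4TB written figure in the example are hypothetical.

```python
def thin_summary(provisioned_tb, written_tb, licensed_tb):
    """Summarize thin provisioning usage against a written-data license."""
    return {
        # How much larger the exported volumes are than the license.
        "oversubscription": provisioned_tb / licensed_tb,
        # Only written data counts against the license.
        "license_used_pct": 100 * written_tb / licensed_tb,
        "within_license": written_tb <= licensed_tb,
    }


if __name__ == "__main__":
    # 100TB of thin volumes on a 10TB license, with a hypothetical 4TB written.
    print(thin_summary(provisioned_tb=100, written_tb=4, licensed_tb=10))
```

The provisioned size only affects the oversubscription ratio; the license check depends on written data alone, which is why a bigger thin volume costs nothing by itself.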

I took the time in my early days working with thin provisioning to learn the growth trends of various applications and how best to utilize them to get maximum gain out of thin provisioning. With VMs that means having a small dedicated disk for OS and swap, and any data resides on other VMDKs or preferably on a NAS, or for databases on raw devices (for snapshot purposes). Given that core OSs don't grow much, there isn't much space needed for the OS (I default to 8GB), and I give the OS a 1GB swap partition. For additional VMDKs or raw devices I always use LVM. I use it to assist me in automatically detecting what devices a particular volume is on, I use it for naming purposes, and I use it to forcefully contain growth. Some applications are not thin provisioning friendly, but I'd like to be able to expand the volume on demand without an outage. Online LVM resize and file system resize allow this without touching the array. It really doesn't take much work.
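As a rough sketch of the online grow step, the helper below builds the two commands involved: `lvextend` to grow the logical volume and `resize2fs` to grow the file system on it. It assumes an ext3/ext4 file system and free extents in the volume group; the volume group and logical volume names are hypothetical. The helper only builds the commands so they can be reviewed before anything is run.

```python
def grow_commands(vg, lv, extra_gb):
    """Build the commands to grow a logical volume and its ext3/ext4 file system.

    vg, lv: volume group and logical volume names (hypothetical in the example).
    extra_gb: how many gigabytes to add.
    """
    device = f"/dev/{vg}/{lv}"
    return [
        ["lvextend", "-L", f"+{extra_gb}G", device],  # grow the logical volume
        ["resize2fs", device],                        # grow the file system to match
    ]


if __name__ == "__main__":
    for cmd in grow_commands("vg_data", "lv_app", 10):
        print(" ".join(cmd))
        # To actually run it (as root), something like:
        # subprocess.run(cmd, check=True)
```

Because the logical volume is deliberately created smaller than the underlying device, growth stays contained until someone decides to run these two steps.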

On my systems I don't really do vMotion (not licensed), I very rarely use VMFS snapshots (a few times a year), and the I/O on my VMFS volumes is tiny despite having 300+ VMs running on them. So in theory I probably could get away with 1TB or even 2TB VMFS volume sizes, but why lock myself into that if I don't have to? So I don't.

I also use dedicated swap VMFS volumes so I can monitor the amount of I/O going on with swap from an array perspective. Currently I have 21 VMware hosts connected to our array, totalling 168 CPU cores and 795GB of memory. I'm working to retire our main production VMware hosts, many of which are several years old (re-purposed from other applications). Now that I've proven how well it can work on existing hardware and the low-cost version, the company is ready to gear up a bit more and commit more resources to a more formalized deployment utilizing the latest hardware and software technology. You won't catch me using the Enterprise Plus or even the Enterprise version of VMware though; the cost/benefit isn't there.