TechOpsGuys.com – Diggin' technology every day

August 28, 2009

vSphere Storage vMotion for free

Filed under: Storage,Virtualization — Nate @ 2:47 pm

OK, this is slightly obvious, but it may not be to everyone. Among my favorite new abilities in vSphere is the evaluation mode. In ESX 3.x, the evaluation mode was fairly locked down; in order to do anything you had to have a paid license, or a VAR that could give you a temporary license. Not so with vSphere.

My company went through a storage array migration earlier this year. I had initially started deploying VMs on top of NFS last year on the previous array; when we got the new array, all new VMs went to it directly and the old VMs hung around. My plan was basically to re-install them from scratch onto the new array so I could take advantage of its thin provisioning. I could save upwards of 90% of my space with thin provisioning, so I didn’t want to just copy the data files over to the VMFS volume (our thin provisioning dedicates space on write). With our new array came a pair of NAS heads from another company, so in order to evacuate the old array I moved those data files over to the NFS side of the storage system as a holding area until I could find the time to re-install the VMs onto VMFS volumes.

Then vSphere came out and the clouds parted. The evaluation mode was fully unlocked: every feature (that I know of) was available for use, free for 60 days. After a few fairly quick tests I started migrating my production hosts to vSphere as quickly as I could, even before I had my replacement license keys, since I had 60 days to get them. I set up an evaluation copy of vCenter and hooked everything up; it was my first real exposure to vCenter. And I took the opportunity to use the free Storage vMotion to migrate those VMs from the NFS data store to the VMFS data store in a “thin” way.
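
For the curious, the same move can be driven through the vSphere API instead of clicking through vCenter. Here’s a minimal sketch using the pyVmomi Python bindings; the vCenter address, credentials, VM name and datastore name are all placeholders you’d swap for your own:

    from pyVim.connect import SmartConnect, Disconnect
    from pyVim.task import WaitForTask
    from pyVmomi import vim

    si = SmartConnect(host="vcenter.example.com", user="admin", pwd="secret")
    content = si.RetrieveContent()

    def find_by_name(vimtype, name):
        # Walk the whole inventory for the first object with a matching name.
        view = content.viewManager.CreateContainerView(
            content.rootFolder, [vimtype], True)
        try:
            return next(obj for obj in view.view if obj.name == name)
        finally:
            view.Destroy()

    vm = find_by_name(vim.VirtualMachine, "my-vm")              # placeholder
    vmfs_ds = find_by_name(vim.Datastore, "vmfs-datastore-01")  # placeholder

    # transform=sparse asks Storage vMotion to rewrite the disks as thin
    # on the destination datastore instead of copying them flat.
    spec = vim.vm.RelocateSpec(
        datastore=vmfs_ds,
        transform=vim.vm.RelocateSpec.Transformation.sparse,
    )
    WaitForTask(vm.RelocateVM_Task(spec))
    Disconnect(si)

The transform=sparse bit of the RelocateSpec is what gets you the “thin” copy on the destination.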

I don’t anticipate needing Storage vMotion often, but it’s nice to know that if I do need it, I can just fire up a new ESX system under an evaluation license, do my Storage vMotions to my heart’s content, and then shut the new box down again. Since all of my systems boot from SAN, I could even do the same in-place: evacuate one system, unmap the original LUN, create a new LUN, install ESX on it, do the basics to get it configured, do the Storage vMotions that I need, then reboot the host, remove the new LUN, reinstate the old LUN and off we go again. Quite a few more steps, but certainly worth it for me if I only need it once or twice per year.

We bought vCenter Standard edition not too long ago, and I still have a couple of vSphere hosts running in evaluation mode even now, in case I want to play with any of the more advanced features; I’ve done a few vMotions and such to shift VMs back onto freshly installed vSphere hosts. I only have one production ESX 3.5 system left, and it will stay on 3.5 for a while at least, because of the snapshot issues in vSphere.

Personally, my four favorite things in vSphere are Round Robin MPIO, ESXi boot from SAN, the Essentials license pricing, and the expanded evaluation functionality.

I really didn’t have much interest in most of the other things that VMware has been touting; I have simple requirements. And perhaps as a side effect, I’ve had fewer problems than most serious ESX users I’ve talked with. I’ve heard lots of horror stories about snapshots and VCB, Dave here has even had some issues with zombie VMs and vMotion, and others have had issues with SCSI reservations and the like. I haven’t hit any of that, I guess because I almost never use that functionality. The core stuff is pretty solid; I have yet to see a system crash, even. Good design? Luck? Both? Something else?

August 25, 2009

Cheap vSphere installation manageable by vCenter

Filed under: Virtualization — Nate @ 4:53 pm

UPDATED – I don’t mean to turn this into a VMware blog or a storage blog, even though those have been almost all of my posts so far. But as someone who works for a company that hasn’t yet invested much in VMware (it was hard enough to convince them to buy any VM solution; management wanted the free stuff), I wanted to point out that you can get the “basics” of vSphere in the vSphere Essentials pack: what used to cost about $3,000 is now about $999, and support is optional. Not only that, but at least in my testing, a system running an “Essentials” license is fully able to connect to, and be managed by, a vCenter “Standard” edition system.

I wanted to point this out because when I proposed this scenario to my VAR a month or so ago, they wanted to call VMware to check for gotchas. The initial VMware rep we talked to couldn’t find anything that said you specifically could or could not do this, but didn’t believe there was anything in the product that would block you from managing an “Essentials” vSphere host with a “Standard” vCenter server. He spent what seemed like a week trying to track down a real answer but never got back to us. We called in again and got another person who said something similar: he couldn’t find anything that would prevent it, but apparently it’s not something that has been proposed widely before. The quote I got from the still-confused VAR had a note saying that you could not do what I wanted to do, but it does work. Yes, you basically throw out the “free” vCenter “Foundation” edition, but it’s still a lot cheaper than going with vSphere Standard:

- vSphere Essentials, 6 CPUs: $999 for year 1, with a 1-year subscription; support on a per-incident basis
- vSphere Standard, 6 CPUs: $6,408 for year 1, with a 1-year subscription and gold support

Unless you expect to file a lot of support requests, that is.

It is true that you get a few extra things with vSphere Standard over Essentials, such as “thin provisioning” and “high availability”. In my case thin provisioning is built into the storage array, so I don’t need it. And high availability isn’t that important either, since for most things we run more than one VM per app and load balance across them with real load balancers for fault tolerance (there are exceptions, like DB servers).

Something that is kind of interesting is that the “free” vSphere license supports thin provisioning; I have 11 hosts running that version with local storage at remote sites. Odd that they throw that in with the free license but not with Essentials!

The main reason for going this route, at least for me, is that you get a real vCenter server with your systems managed by it, read-write access to the remote APIs, and of course the option of running the full, hefty ESX instead of the “thin” ESXi. Myself, I prefer the big service console; I know it’s going away at some point, but I’ll use it while it’s there, and I have plenty of memory to spare. A good chunk of my production ESX infrastructure is older, re-purposed HP DL585 G1s with 64GB of memory; they are quad-processor, dual-core, which makes this licensing option even more attractive for them.
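
The read-write API point is easy to verify yourself: on the free ESXi license the API is read-only, so write operations get rejected, while on a licensed host they succeed. A quick sketch with the pyVmomi Python bindings (the address, credentials and VM name are placeholders), using a throwaway snapshot as the write probe:

    from pyVim.connect import SmartConnect, Disconnect
    from pyVim.task import WaitForTask
    from pyVmomi import vim

    si = SmartConnect(host="vcenter.example.com", user="admin", pwd="secret")
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.VirtualMachine], True)
    vm = next(v for v in view.view if v.name == "my-vm")  # placeholder name
    view.Destroy()

    # Any write call works as the probe; a snapshot is easy to undo.
    WaitForTask(vm.CreateSnapshot_Task(name="api-write-test",
                                       description="proving write access",
                                       memory=False, quiesce=False))
    WaitForTask(vm.snapshot.currentSnapshot.RemoveSnapshot_Task(removeChildren=False))
    Disconnect(si)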

My next goal is to upgrade the infrastructure to HP c-Class blades with either 6-core Opterons, or perhaps 12-core ones when they are out (assuming availability for 2-socket systems), 64GB of memory (the latest HP Istanbul blades have 16 memory slots), 10GbE VirtualConnect and 4Gbps Fibre Channel VirtualConnect, and to upgrade to vSphere Advanced. That’ll be sometime in 2010, though. There’s no software upgrade path from Essentials to Advanced, so I’ll just re-purpose the Essentials licenses onto other systems; I have at least 46 sockets in servers running the “free” license as it is.

(I still remember how happy I was to pay the $3,500 two-socket fee for ESX “Standard” edition a couple of years ago; now the same abilities cost about 90% less per socket.)

UPDATE – I haven’t done extensive testing yet, but during my quick tests before a more recent entry that I posted, I wanted to check whether Essentials could boot a VM that was thin provisioned; since I used Storage vMotion to move some VMs over, it would be annoying if it could not. It just so happens that I already have a thin provisioned VM running on my one Essentials ESX host! So it appears the license only limits the creation of thinly provisioned virtual disks, not the use of them, which makes sense; it would be an Oracle-like tactic to do otherwise. And yes, I did power the VM off and back on today to verify.

But that’s not all: I noticed what seems to be a loophole in vSphere’s licensing. I mention above that vSphere Essentials does not support thin provisioning, as you can see in their pricing PDF (and there is no mention of the option on the license configuration page on the host). When I create VMs I always use the Custom option rather than the Typical configuration. It turns out that if you use Typical when creating a VM under the Essentials license, you CAN USE THIN PROVISIONING. I created the disk, enabled the option, and even started the VM (didn’t go beyond that). If you use Custom, the thin provisioning option flat out isn’t offered. I wasn’t expecting the VM to be able to power on; I recall testing another unrelated but still premium option (I forget which one), and when I tried to either save the configuration or power up the VM, the system stopped me, saying the license didn’t permit it.
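
If you want to audit which of your existing disks are actually thin, the flag is exposed through the API. A small sketch with the pyVmomi Python bindings, pointed straight at a host (the hostname and credentials are placeholders):

    from pyVim.connect import SmartConnect, Disconnect
    from pyVmomi import vim

    si = SmartConnect(host="esx-host.example.com", user="root", pwd="secret")
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.VirtualMachine], True)
    for vm in view.view:
        if vm.config is None:  # skip VMs with no readable config
            continue
        for dev in vm.config.hardware.device:
            if isinstance(dev, vim.vm.device.VirtualDisk):
                # Only flat VMFS-backed disks carry the thinProvisioned flag;
                # RDMs and other backings just show up as "n/a" here.
                thin = getattr(dev.backing, "thinProvisioned", None)
                state = "thin" if thin else ("thick" if thin is False else "n/a")
                print(vm.name, dev.deviceInfo.label, state)
    view.Destroy()
    Disconnect(si)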

August 19, 2009

Does size matter?

Filed under: Storage,Virtualization — Nate @ 10:30 am

UPDATED – I’ve been a fan of VMware for what seems like more than a decade; I still have my VMware 1.0.2 for Linux CD, even. I just wanted to dispel the myth that ESXi has a small disk footprint. On VMware’s own site they mention the footprint being 32MB, and I believe I saw another number in the ~75MB range at a vSphere launch event I attended a few months ago.

Not that it’s a big deal to me, but it annoys me when companies spout bullshit like that. My storage array has thin provisioning technology and dedicates space in 16kB increments as it is written, so I can get a clear view of how big ESXi actually is.

And the number is: ~900 megabytes for ESXi v4. I confronted a VMware rep with this number at the event I mentioned earlier and he brushed me off, saying the extra space was other required components, not just the hypervisor. In the link above they compare against MS Hyper-V: they take MS’s “full stack” and compare it to their bare “hypervisor” (which by itself is unusable; you need those other required components), hence my claim that their number is complete and total bullshit.

That is still significantly smaller than the full ESX, which across the range of systems I have installed uses between 3 and 5 gigabytes. When I was setting up the network installer for vSphere, I believe it required at least 25GB, which is slightly more than ESX 3.5. Again, thanks to the array technology, despite my allocating 25GB to the volume, vSphere has only written between 3 and 5GB of it, so that is all the space that is actually used. In both cases I get an accurate representation of how much real space each system requires.

ESXi v3.5 was unable to boot directly from SAN, so I can’t tell with the same level of accuracy how big it is (“df” says about 200MB), but I can say that our ESXi v3.5 systems are installed on 1GB USB sticks, and the image I decompressed onto those USB sticks is 750MB (VMware-VMvisor-big-3.5.0_Update_4-153875.i386.dd). Regardless, it’s FAR from 32MB or even 75MB; at best it’s 10x larger than what they claim.
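
If you don’t have an array that reports written space, you can approximate the same measurement yourself by counting how many 16kB chunks of the raw image contain any non-zero data, which is roughly what a dedicate-on-write array ends up allocating. A quick Python sketch (it assumes the dd image named above is sitting in the current directory):

    # Count 16kB chunks that contain any non-zero byte, mimicking how a
    # dedicate-on-write array allocates space in 16kB increments.
    CHUNK = 16 * 1024
    written = 0
    with open("VMware-VMvisor-big-3.5.0_Update_4-153875.i386.dd", "rb") as img:
        while True:
            chunk = img.read(CHUNK)
            if not chunk:
                break
            if chunk.strip(b"\x00"):  # any non-zero byte counts the whole chunk
                written += CHUNK
    print("approx. written space: %.0f MB" % (written / 1024.0 / 1024.0))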

So let this one rest, VMware. Give it up, stop telling people ESXi has such a tiny disk footprint, because it’s NOT TRUE.

You can pry VMware from my cold, dead hands, but I still want this myth about ESXi’s size dispelled.

UPDATED – I went back to my storage array again and found something that didn’t make sense (the array is pretty heavily virtualized itself), but after consulting with the vendor it turns out the volume is in fact 900MB of written space, rather than the 1.5GB that I originally posted. If you really want to know, I could share the details, but I don’t think they’re too important, and without knowing the vendor’s terminology it wouldn’t make much sense to anyone anyway!

The first comment I got (thanks!) mentions a significant difference in size between the embedded version of ESXi and the installable version (what I’m using). This could be where the confusion lies; I have not used any systems with embedded ESXi yet (my company is mostly a Dell shop, and they charge a significant premium for embedded ESXi and force you onto a high-end support contract, so we decided to install it ourselves for free).

August 18, 2009

It’s not a bug, it’s a feature!

Filed under: Storage,Uncategorized,Virtualization — Nate @ 5:01 pm

I must be among a tiny minority of people who have automated database snapshots moving between systems on a SAN.

Earlier this year I set up an automated snapshot process to snapshot a production MySQL database and bring it over to QA. This runs every day, and runs fine as-is. There is another on-demand process that copies the same production MySQL DB byte-for-byte to another QA MySQL server (typically run once every month or two, and it runs fine too!).

I also set up a job to snapshot all of the production MySQL DBs (three currently) and bring them to a dedicated “backup” VM, which then backs up the data and compresses it onto our NFS cluster. This also runs every day, and runs fine as-is.
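
For flavor, the core of that job is only a few steps. Here’s a stripped-down sketch in Python using the MySQLdb bindings; the “array-cli” snapshot command is a made-up placeholder for whatever your array vendor actually provides, and the host names, credentials and paths are invented too:

    import subprocess
    import MySQLdb  # the classic MySQL-python bindings

    conn = MySQLdb.connect(host="prod-db1", user="backup", passwd="secret")
    cur = conn.cursor()
    try:
        # Hold a global read lock just long enough to take the array-side
        # snapshot, so the on-disk copy is consistent.
        cur.execute("FLUSH TABLES WITH READ LOCK")
        # "array-cli createsnapshot" is a HYPOTHETICAL placeholder; every
        # array vendor has its own snapshot command or API for this step.
        subprocess.check_call(["array-cli", "createsnapshot",
                               "prod-mysql-lun", "backup-snap"])
    finally:
        cur.execute("UNLOCK TABLES")
        conn.close()

    # The snapshot is then presented to the backup VM, which compresses the
    # data files onto the NFS cluster (paths are placeholders).
    subprocess.check_call(["ssh", "backup-vm",
                           "tar czf /nfs/backups/prod-db1.tgz /mnt/snap/mysql"])

The read lock is only held for the second or two the snapshot takes, so production barely notices.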

ENTER VMWARE VSPHERE.

Apparently they introduced new “intelligence” into vSphere’s storage stack that tries to be smarter about what storage devices are present. This totally breaks these automated processes. Because the data on the LUN is different after I remove the LUN, delete the snapshot, create a new one, and re-present the LUN, vSphere says HEY, THERE IS DIFFERENT DATA, SO I’LL GIVE IT A NEW UUID (never mind that it is the SAME LUN). During that process the guest VM loses connectivity to the original storage (of course) and never regains it, because vSphere thinks the LUN is different and won’t give the VM access to it. The only fix at that point is to power off the VM, delete all of the raw device maps, re-create them, and power the VM back on. @#)!#$ And no, you can’t gracefully halt the guest OS, because with LUNs missing the guest hangs on shutdown.
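
With the VM powered off, the delete-and-recreate dance boils down to two vmkfstools calls per mapping, which at least can be scripted from the service console. A sketch (the pointer-file path and device name are placeholders for your own):

    import subprocess

    # Placeholders: your RDM pointer file and the re-presented LUN's device path.
    rdm = "/vmfs/volumes/datastore1/backup-vm/prod-db1-rdm.vmdk"
    lun = "/vmfs/devices/disks/naa.60000000000000000000000000000001"

    subprocess.check_call(["vmkfstools", "-U", rdm])       # delete the stale mapping file
    subprocess.check_call(["vmkfstools", "-z", lun, rdm])  # recreate as a physical-mode RDM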

So I filed a ticket with VMware. The support team worked on it for a couple of weeks, escalating it everywhere, but as far as anyone could tell it’s “doing what it’s supposed to do”. And nobody could explain how this process works in ESX 3.5, except that ESX 3.5 was more “dumb” about this sort of thing.

IT’S RAW FOR A REASON. DON’T TRY TO BE SMART WITH A RAW DEVICE MAP; THAT’S WHY IT’S RAW.

http://www.vmware.com/pdf/esx25_rawdevicemapping.pdf

With ESX Server 2.5, VMware is encouraging the use of raw device mapping in the following situations:

• When SAN snapshot or other layered applications are run in the virtual machine. Raw device mapping better enables scalable backup offloading systems using the features inherent to the SAN.

[..]

HELLO! SAN USER HERE TRYING TO OFFLOAD BACKUPS!

Anyways, there are a few workarounds for these processes going forward:
- Migrate these LUNs to software iSCSI instead of Fibre Channel; there is a performance hit (not sure how much)
- Keep one or more ESX 3.5 systems around for this type of work
- Use physical servers for things that need automated snapshots

The VMware support rep sounded about as frustrated with the situation as I was (and am). He did appear to try his best, but this behavior by vSphere is just unacceptable. After all, it works flawlessly in ESX 3.5!

WAIT! This brokenness extends to NFS as well!

I filed another support request on a kinda-sorta-similar issue a couple of weeks ago, this one regarding NFS data stores. Our NFS cluster operates with multiple IP addresses; many (all?) active-active NFS clusters have at least two IPs, one per controller. Once again, vSphere assigns a unique ID based on the IP address rather than the host name to identify the NFS system. As a result, if I use the host name on multiple ESX servers, it is pretty much guaranteed that I will not be able to migrate a VM on NFS from one host to another, because the hosts resolve the name to different IPs and vSphere therefore identifies the volumes differently. And if I try to rename the volume to match what is on the other system, it tells me there is already a volume with that name (there is not), so I cannot rename it. The only workaround is to hard-code the same IP on every host, which is not a good solution because you lose multi-node load balancing at that point. Fortunately I have a Fibre Channel SAN as well and have migrated all of my VMs off of NFS onto Fibre Channel, so this particular issue doesn’t impact me. But I wanted to illustrate that this same sort of UUID behavior is not unique to SAN; it can easily affect NAS as well.
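
If you do have to live with the hard-coded-IP workaround, it can at least be applied consistently through the API by mounting the export on every host with the exact same remoteHost string, so they all derive the same datastore identity. A sketch with the pyVmomi Python bindings (names and addresses are placeholders):

    from pyVim.connect import SmartConnect, Disconnect
    from pyVmomi import vim

    si = SmartConnect(host="vcenter.example.com", user="admin", pwd="secret")
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.HostSystem], True)
    for host in view.view:
        spec = vim.host.NasVolume.Specification(
            remoteHost="10.0.0.10",   # the SAME fixed IP on every host; this
                                      # is exactly where the multi-node load
                                      # balancing gets lost
            remotePath="/vol/vmstore",
            localPath="nfs-vmstore",
            accessMode="readWrite",
        )
        host.configManager.datastoreSystem.CreateNasDatastore(spec)
    view.Destroy()
    Disconnect(si)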

You may not be impacted by the NFS issue if your NFS system is unable to serve the same file system from multiple controllers simultaneously; I believe most fall into that category, being limited to one file system per controller at any given time. Our NFS cluster does not have that limitation.
