TechOpsGuys.com Diggin' technology every day

20 Jul 2011

I called it! – Force10 bought by Dell

TechOps Guy: Nate

Not that it matters to me too much either way, but Dell just bought Force10. I called it! Well, it matters to me in that I didn't want Dell near my Extreme Networks :)

It is kind of sad that Force10 was never able to pull off their IPO. I have heard that they have been losing quite a bit of talent recently, but I don't know to what degree. It's also unfortunate they weren't able to fully capitalize on their early leadership in the 10 gigabit arena. Arista seems to be the new Force10 in some respects, though it wouldn't surprise me if they have a hard time growing too, barring some revolutionary next-gen product.

I wonder if anyone will scoop up BlueArc. They have been trying to IPO as well for a couple of years now, and I'd be surprised if they can pull it off in this market. They have good technology, just a whole lot of debt. Though recently I read they started turning a profit.

 

20 Jul 2011

VMware Licensing models

TechOps Guy: Nate

[ was originally combined with another post but I decided to split out ]

VMware has provided its own analysis of its customers' hardware deployments and is telling folks that ~95% of its customers won't be impacted by the licensing changes. I feel pretty confident that most of those customers are massively underutilizing their hardware. I feel confident because I went through that phase as well. Very, very few workloads are truly CPU bound, especially with 8-16+ cores per socket.

It wouldn't surprise me at all if many of those customers, when they go to refresh their hardware, change their strategy pretty dramatically - provided the licensing permits it. The new licensing makes me think we should bring back 4GB memory sticks and 1 GbE. It is very wasteful to assign 11 CPU licenses to a quad socket system with 512GB of memory; at the absolute minimum, memory-only licenses should be available at a significant discount over CPU+memory licenses. Not only that, but large amounts of memory are actually affordable now. It's not hard for me to imagine having a machine with at least a TB of memory in it for around $100k; it wasn't TOO long ago that it would have run you 10 times that.
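To make the 11-license figure concrete, here is a rough sketch of the arithmetic. It assumes each CPU license carries a 48GB memory entitlement (that number is my assumption, chosen because it is consistent with 11 licenses for 512GB) and that all of the host's memory ends up counted. The function name is made up for illustration.

```python
import math

def licenses_needed(sockets, memory_gb, gb_per_license=48):
    """Licenses required when each per-CPU license also carries a memory entitlement.

    gb_per_license=48 is an assumed entitlement, not a quoted price-list value.
    """
    by_socket = sockets                                   # one license per socket, minimum
    by_memory = math.ceil(memory_gb / gb_per_license)     # enough licenses to cover the memory
    return max(by_socket, by_memory)

quad_socket = licenses_needed(sockets=4, memory_gb=512)
print(quad_socket)       # 11
print(quad_socket - 4)   # 7 licenses bought purely to cover memory
```

Under those assumptions, 7 of the 11 licenses exist only to cover memory, which is exactly the part a cheaper memory-only license would address.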

And as to VMware's own claims that this new scheme will help align ANYTHING better, by using memory pools across the cluster - just keep this in mind. Before this change we didn't have to care about memory at all, whether we used 1% or 95%, whether some hosts used all of their RAM and others used hardly any. It didn't matter. VMware is not making anything simpler. I read somewhere about them saying some crap about aligning more with IT as a service. Are you kidding me? How many buzzwords do we need here?

The least VMware can do is license based on usage. Remember: pay for what you use, not what you provision. When I say usage I mean actual usage, not charging me for the memory my Linux systems are allocating towards (frequently) empty disk buffers (which goes to the memory balloon argument). If I allocate 32GB of RAM to a VM that is only using 1GB of memory I should be charged for 1GB, not 32GB. Using vSphere's own active memory monitor would be an OK start.
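A minimal sketch of the difference between the two ways of counting, using the 32GB/1GB example. The `VM` class and `billable_gb` function are hypothetical; in practice the active figure would have to come from something like vSphere's active memory counter.

```python
from dataclasses import dataclass

@dataclass
class VM:
    name: str
    provisioned_gb: float   # what was allocated to the VM
    active_gb: float        # what the guest is actually touching

def billable_gb(vms, by="active"):
    """Total memory to charge for, counted by provisioned or by active memory."""
    if by == "provisioned":
        return sum(vm.provisioned_gb for vm in vms)
    # Active memory can never exceed what was provisioned, so cap it there.
    return sum(min(vm.active_gb, vm.provisioned_gb) for vm in vms)

vms = [VM("mostly-idle", provisioned_gb=32, active_gb=1)]
print(billable_gb(vms, by="provisioned"))  # 32
print(billable_gb(vms, by="active"))       # 1
```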

Want to align better and be more dynamic? Align based on memory usage and CPU usage. Let me run unlimited cores on the cluster, and you can monitor actual usage on a per-socket basis, so if on average (say you bill based on the 95th percentile, similar to bandwidth) you're using 40% of your CPU then you only need 40% licensing. I still much prefer the flat licensing model in almost any arrangement over usage-based licensing, but if you're going to make it usage based, really make it usage based.
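Here is a rough sketch of what 95th percentile billing on CPU could look like, the same way bandwidth is commonly billed: sort the samples, throw away the top 5%, and charge on the highest remaining value. The function names and the sample data are made up for illustration; this is not how any VMware product bills today.

```python
import math

def percentile_95(samples):
    """Nearest-rank 95th percentile: smallest value with at least 95% of samples at or below it."""
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100   # integer ceil(0.95 * n)
    return ordered[rank - 1]

def licenses_for_usage(sockets, cpu_percent_samples):
    """Licenses needed if you only pay for the 95th percentile share of your sockets."""
    p95 = percentile_95(cpu_percent_samples)
    return math.ceil(sockets * p95 / 100)

# 100 samples of cluster CPU utilization (percent): mostly 40%, with a few short spikes.
samples = [40] * 95 + [100] * 5
print(percentile_95(samples))            # 40
print(licenses_for_usage(10, samples))   # 4 licenses for a 10-socket cluster
```

The short spikes to 100% fall in the discarded top 5%, so the cluster is billed at 40% rather than at its peak.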

Oh yeah - and forget about anything that charges you per VM too (hello SRM). That's another bogus licensing scheme. It goes completely against the trend of splitting workloads up into more isolated VMs and instead favors fewer, much larger VMs that are doing a lot of things at the same time. Even on my own personal co-located ESXi server, I have 5 VMs. I could consolidate them down to two and provide similar end user services, but it's much cleaner to do it in 5 for my own sanity.

All of this new licensing stuff also makes me think back to a project I was working on about a year ago, trying to find some way of doing DR in the cloud. The ROI for doing it in house vs. any cloud on the market (I looked at about 5 different ones at the time) was never more than 3 months. In one case the up front costs for the cloud were 4 times the cost of doing it internally. The hardware needs were modest in my opinion, with the physical hardware not even requiring two full racks of equipment. The #1 cost driver was memory, #2 was CPU, and storage was a distant third. Assuming the storage that the providers spec'd could meet the IOPS and throughput requirements, storage came in at about 10-15% of the total cost of the cloud solution.

Since most of my VMware deployments have been in performance sensitive situations (lots of Java) I run the systems with zero swapping; everything in memory has to stay in physical RAM.

20 Jul 2011

Cluster DRS

TechOps Guy: Nate

Given the recent price hikes that VMware is imposing on its customers (because they aren't making enough money, obviously), and looking at the list of new things in vSphere 5 and being, well, underwhelmed (compared to vSphere 4), I brainstormed a bit and thought about what kind of things I'd like to see VMware add.

VMware seems to be getting more aggressive in going after service providers (their early attempts haven't been successful, it seems they have fewer partners now than a year ago - btw I am a vCloud express end-user at the moment). An area that VMware has always struggled in is scalability in their clusters (granted such figures have not been released for vSphere 5 but I am not holding my breath for a 10-100x+ increase in scale).

That goes for the number of virtual machines in a cluster, the number of nodes, the scalability of the VMFS file system itself (assuming that's what you're using), etc.

For the most part, of course, a cluster is like a management domain, which means it is, in a way, a single point of failure. So it's pretty common for people to build multiple clusters when they have a decent number of systems. If someone has 32 servers, it is unlikely they are going to build a single 32-node cluster.

A feature I would like to see is Cluster DRS, and Cluster HA. Say for example you have several clusters. Some clusters are very memory heavy for loading a couple hundred VMs/host (typically 4-8 socket with several hundred gigs of RAM), others are compute heavy with very low CPU consolidation ratios (probably dual socket with 128GB or less of memory). Each cluster by itself is a stand alone cluster, but there is loose logic that binds them together to allow the seamless transport of VMs between clusters for either load balancing or fault tolerance. Combine and extend regular DRS to span clusters. On top of that you may need to do transparent storage vMotion (if required) as well, along with the possibility of mapping storage on the target host (on the fly) in order to move the VM over (the forthcoming storage federation technologies could really help make hypervisor life simpler here I think).
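To illustrate the kind of loose logic I have in mind, here is a toy sketch of picking a target cluster for a VM. Everything in it (the `Cluster` and `VM` classes, `pick_target`, the scoring rule) is hypothetical and has nothing to do with how DRS actually works internally. It just prefers clusters that can already see the VM's datastore, and flags when a storage vMotion would be needed.

```python
from dataclasses import dataclass, field

@dataclass
class Cluster:
    name: str
    free_mem_gb: float
    free_cpu_ghz: float
    datastores: set = field(default_factory=set)   # datastores this cluster can see

@dataclass
class VM:
    name: str
    mem_gb: float
    cpu_ghz: float
    datastore: str

def pick_target(vm, clusters, source):
    """Return (cluster, needs_storage_move), or None if no other cluster has room."""
    candidates = [
        c for c in clusters
        if c.name != source
        and c.free_mem_gb >= vm.mem_gb
        and c.free_cpu_ghz >= vm.cpu_ghz
    ]
    if not candidates:
        return None
    # Prefer clusters that already see the VM's datastore, then the most memory headroom.
    best = max(candidates,
               key=lambda c: (vm.datastore in c.datastores, c.free_mem_gb - vm.mem_gb))
    return best, vm.datastore not in best.datastores

clusters = [
    Cluster("mem-heavy", free_mem_gb=400, free_cpu_ghz=20, datastores={"ds1", "ds2"}),
    Cluster("cpu-heavy", free_mem_gb=64, free_cpu_ghz=200, datastores={"ds3"}),
    Cluster("source", free_mem_gb=10, free_cpu_ghz=10, datastores={"ds1"}),
]

target, needs_move = pick_target(VM("java-app", 32, 8, "ds1"), clusters, source="source")
print(target.name, needs_move)   # mem-heavy False

target, needs_move = pick_target(VM("cruncher", 16, 50, "ds1"), clusters, source="source")
print(target.name, needs_move)   # cpu-heavy True
```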

Maybe a lot of this could be done using yet another management cluster of some kind, a sort of independent proxy of things (running on independent hardware and perhaps even dedicated storage). In the unlikely event of a catastrophic cluster failure, the management cluster would pick up on this, move the VMs to other clusters and restart them (provided there are sufficient resources of course!). In very large environments it may not be possible to map everything to everywhere, which would require multiple storage vMotions in order to get the VM from the source to a destination that the target host can access. If this can be done at the storage layer via the block level replication stuff first introduced in VAAI, that could of course greatly speed up what otherwise might be a lengthy process.
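The multiple storage vMotions problem is basically a shortest path search. Here is a rough sketch, assuming a move between two datastores is possible whenever some cluster can see both of them (that rule, the `visibility` map and the function name are my own simplifications for illustration).

```python
from collections import deque

def storage_hops(visibility, source_ds, target_cluster):
    """Breadth-first search for the shortest chain of datastores from source_ds
    to any datastore the target cluster can see.

    visibility maps cluster name -> set of datastores that cluster can access.
    Returns the list of datastores along the way, or None if there is no path.
    """
    targets = visibility[target_cluster]
    queue = deque([[source_ds]])
    seen = {source_ds}
    while queue:
        path = queue.popleft()
        current = path[-1]
        if current in targets:
            return path
        # Any cluster that sees the current datastore can copy to its other datastores.
        for stores in visibility.values():
            if current in stores:
                for nxt in sorted(stores - seen):
                    seen.add(nxt)
                    queue.append(path + [nxt])
    return None

visibility = {
    "A": {"ds1", "ds2"},
    "B": {"ds2", "ds3"},
    "C": {"ds3", "ds4"},
}
path = storage_hops(visibility, "ds1", "C")
print(path)            # ['ds1', 'ds2', 'ds3']
print(len(path) - 1)   # 2 storage moves before a host in C can run the VM
```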

Since it is unlikely anyone is going to be able to build a single cluster with shared storage that spans a great many systems (100s+) and have it be bulletproof enough to provide 99.999% uptime, this kind of capability would be a stop gap. It would provide the flexibility and availability of a single massive cluster, while at the same time reducing the complexity of trying to build software that can actually pull the impossible (or what seems impossible today) off.

On the topic of automated cross cluster migrations, having global spare hardware would be nice too, much like most storage arrays have global hot spares, which can be assigned to any degraded RAID group on the system regardless of what shelf it may reside on. Global spare servers would be shared across clusters, and assigned on demand. A high end VM host is likely to cost upwards of $50,000 in hardware these days. Multiply by X number of clusters and well.. you get the idea.
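A quick sketch of the math and of handing out spares on demand. The $50,000 host cost is the figure above; the cluster count, pool size and function names are made up for illustration.

```python
HOST_COST_USD = 50_000   # rough cost of a high end VM host

def dedicated_spare_cost(num_clusters, spares_per_cluster=1):
    """Cost of parking idle spare hosts in every cluster."""
    return num_clusters * spares_per_cluster * HOST_COST_USD

def global_spare_cost(pool_size):
    """Cost of one shared pool of spare hosts."""
    return pool_size * HOST_COST_USD

def assign_spares(pool, degraded_clusters):
    """Hand out spare hosts first come, first served. Returns {cluster: host}."""
    available = list(pool)
    assignments = {}
    for cluster in degraded_clusters:
        if not available:
            break                     # pool exhausted, remaining clusters stay degraded
        assignments[cluster] = available.pop(0)
    return assignments

print(dedicated_spare_cost(6))   # 300000
print(global_spare_cost(2))      # 100000
print(assign_spares(["spare1", "spare2"], ["cluster3", "cluster5"]))
# {'cluster3': 'spare1', 'cluster5': 'spare2'}
```

The trade-off is the same as with storage hot spares: a small shared pool is much cheaper, but it only covers as many simultaneous failures as there are spares in it.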

While I'm here, I might as well say I'd like the ability to hot remove memory. Hyper-V has dynamic memory, which seems to provide this functionality. I'm sure the guest OSs would need to be re-worked a bit too in order to support this, since in the physical world it's not too common to need to yank live memory from a system. In the virtual world it can be very handy.

Oh and I won't forget - give us an ability to manually control the memory balloon.

Another area that could use some improvement is vMotion compatibility. There is EVC, but last I read you still couldn't cross processor manufacturers when doing vMotion with EVC. KVM can apparently do it today.