TechOpsGuys.com Diggin' technology every day

28Feb/10Off

VMware dream machine

TechOps Guy: Nate

(Originally titled fourty eight all round, I like VMware dream machine more)

UPDATED I was thinking more about the upcoming 12-core Opterons and the next generation of HP c Class blades, and thought of a pretty cool configuration to have, hopefully it becomes available.

Imagine a full height blade that is quad socket, 48 cores (91-115Ghz), 48 DIMMs (192GB with 4GB sticks), 4x10Gbps Ethernet links and 2x4Gbps fiber channel links (total of 48Gbps of full duplex bandwidth). The new Opterons support 12 DIMMs per socket, allowing the 48 DIMM slots.

Why 4x10Gbps links? Well I was thinking why not.. with full height blades you can only fit 8 blades in a c7000 chassis. If you put a pair of 2x10Gbps switches in that gives you 16 ports. It's not much more $$ to double up on 10Gbps ports. Especially if your talking about spending upwards of say $20k on the blade(guesstimate) and another $9-15k blade on vSphere software per blade. And 4x10Gbps links gives you up to 16 virtual NICs using VirtualConnect per blade, each of them adjustable in 100Mbps increments.

Also given the fact that it is a full height blade, you have access to two slots worth of I/O, which translates into 320Gbps of full duplex fabric available to a single blade.

That kind of blade ought to handle just about anything you can throw at it. It's practically a super computer in of itself. Right now HP holds the top spot for VMark scores, with a 8 socket 6 core system(48 total cores) out pacing even a 16 socket 4 core system(64 total cores).

The 48 CPU cores will give the hypervisor an amazing number of combinations for scheduling vCPUs. Here's a slide from a presentation I was at last year which illustrates the concept behind the hypervisor scheduling single and multi vCPU VMs:

There is a PDF out there from VMware that talks about the math formulas behind it all, it has some interesting commentary on CPU scheduling with hypervisors:

[..]Extending this principle, ESX Server installations with a greater number of physical CPUs offer a greater chance of servicing competing workloads optimally. The chance that the scheduler can find room for a particular workload without much reshuffling of virtual machines will always be better when the scheduler has more CPUs across which it can search for idle time.

This is even cooler though, honestly I can't pretend to understand the math myself! -

Scheduling a two-VCPU machine on a two-way physical ESX Server hosts provides only one possible allocation for scheduling the virtual machine. The number of possible scheduling opportunities for a two-VCPU machine on a four-way or eight-way physical ESX Server host is described by combinatorial mathematics using the formula N! / (R!(N-R)!) where N=the number of physical CPUs on the ESX Server host and R=the number of VCPUs on the machine being scheduled.1 A two-VCPU virtual machine running on a four-way ESX Server host provides (4! / (2! (4-2)!) which is (4*3*2 / (2*2)) or 6 scheduling possibilities. For those unfamiliar with combinatory mathematics, X! is calculated as X(X-1)(X-2)(X-3)…. (X- (X-1)). For example 5! = 5*4*3*2*1.

Using these calculations, a two-VCPU virtual machine on an eight-way ESX Server host has (8! / (2! (8-2)!) which is (40320 / (2*720)) or 28 scheduling possibilities. This is more than four times the possibilities a four-way ESX Server host can provide. Four-vCPU machines demonstrate this principle even more forcefully. A four-vCPU machine scheduled on a four-way physical ESX Server host provides only one possibility to the scheduler whereas a four-VCPU virtual machine on an eight-CPU ESX Server host will yield (8! / (4!(8-4)!) or 70 scheduling possibilities, but running a four-vCPU machine on a sixteen-way ESX Server host will yield (16! / (4!(16-4)!) which is (20922789888000 / ( 24*479001600) or 1820 scheduling possibilities. That means that the scheduler has 1820 unique ways in which it can place the four-vCPU workload on the ESX Server host. Doubling the physical CPU count from eight to sixteen results in 26 times the scheduling flexibility for the four-way virtual machines. Running a four-way virtual machine on a Host with four times the number of physical processors (16-way ESX Server host) provides over six times more flexibility than we saw with running a two-way VM on a Host with four times the number of physical processors (8-way ESX Server host).

Anyone want to try to extrapolate that and extend it to a 48-core system? :)

It seems like only yesterday that I was building DL380G5 ESX 3.5 systems with 8 CPU cores and 32GB of ram, with 8x1Gbps links thinking of how powerful they were. This would be six of those in a single blade. And only seems like a couple weeks ago I was building VMware GSX systems with dual socket single core systems and 16GB ram..

So, HP do me a favor and make a G7 blade that can do this, that would make my day! I know fitting all of those components on a single full height blade won't be easy. Looking at the existing  BL685c blade, it looks like they could do it, remove the internal disks(who needs em, boot from SAN or something), and put an extra 16 DIMMs for a total of 48.

I thought about using 8Gbps fiber channel but then it wouldn't be 48 all round :)

UPDATE Again I was thinking about this and wanted to compare the costs vs existing technology. I'm estimating roughly a $32,000 price tag for this kind of blade and vSphere Advanced licensing (note you cannot use Enterprise licensing on a 12-core CPU, hardware pricing extrapolated from existing HP BL685G6 quad socket 6 core blade system with 128GB ram). The approximate price of an 8-way 48-core HP DL785 with 192GB, 4x10GbE and 2x4Gb Fiber with vSphere licensing comes to about roughly $70,000 (because VMWare charges on a per socket basis the licensing costs go up fast). Not only that but you can only fit 6 of these DL785 servers in a 42U rack, and you can fit 32 of these blades in the same rack with room to spare. So less than half the cost, and 5 times the density(for the same configuration). The DL785 has an edge in memory slot capacity, which isn't surprising given its massive size, it can fit 64 DIMMs vs 48 on my VMware dream machine blade.

Compared to a trio of HP BL495c blades each with 12 cores, and 64GB of memory, approximate pricing for that plus advanced vSphere is $31,000 for a total of 36 cores and 192GB of memory. So for $1,000 more you can add an extra 12 cores, cut your server count by 66%, probably cut your power usage by some amount and improve consolidation ratios.

So to summarize, two big reasons for this type of solution are:

  • More efficient consolidation on a per-host basis by having less "stranded" resources
  • More efficient consolidation on a per-cluster basis because you can get more capacity in the 32-node limit of a VMware cluster(assuming you want to build a cluster that big..) Again addressing the "stranded capacity" issue. Imagine what a resource pool could do with 3.3 Thz of compute capacity and 9.2TB of memory? All with line rate 40Gbps networking throughout? All within a single cabinet ?

Pretty amazing stuff to me anyways.

[For reference - Enterprise Plus licensing would add an extra $1250/socket plus more in support fees. VMware support costs not included in above pricing.]

END UPDATE