TechOpsGuys.com Diggin' technology every day

11Nov/100

Extreme VMware

TechOps Guy: Nate

So I was browsing some of the headlines of the companies I follow during lunch and came across this article (seems available on many outlets), which I thought was cool.

I've known VMware has been a very big happy user of Extreme Networks gear for a good long time now though I wasn't aware of anything that was public about it, at least until today. It really makes me feel good that despite VMware's partnerships with EMC and NetApp that include Cisco networking gear, at the end of the day they chose not to run Cisco for their own business.

But going beyond even that it makes me feel good that politics didn't win out here, obviously the people running the network have a preference, and they were either able to fight, or didn't have to fight to get what they wanted. Given VMware is a big company and given their big relationship with Cisco I would kind of think that Cisco would try to muscle their way in. Many times they can succeed depending on the management at the client company, but fortunately for the likes of VMware they did not.

SYDNEY, November 12. Extreme Networks, Inc., (Nasdaq: EXTR) today announced that VMware, the global leader in virtualisation and cloud infrastructure, has deployed its innovative enterprise, data centre and Metro Ethernet networking solutions.

VMware’s network features over 50,000 Ethernet ports that deliver connectivity to its engineering lab and supports the IT infrastructure team for its converged voice implementation.

Extreme Networks met VMware’s demanding requirements for highly resilient and scalable network connectivity. Today, VMware’s thousands of employees across multiple campuses are served by Extreme Networks’ leading Ethernet switching solutions featuring 10 Gigabit Ethernet, Gigabit Ethernet and Fast Ethernet, all powered by the ExtremeXOS® modular operating system.

[..]

“We required a robust, feature rich and energy efficient network to handle our data, virtualised applications and converged voice, and we achieved this through a trusted vendor like Extreme Networks, as they help it to achieve maximum availability so that we can drive continuous development,” said Drew Kramer, senior director of technical operations and R&D for VMware. “Working with Extreme Networks, from its high performance products to its knowledgeable and dedicated staff, has resulted in a world class infrastructure.”

Nice to see technology win out for once instead of back room deals which often end up screwing the customer over in the long run.

Since I'm here I guess I should mention the release of the X460 series of switches which came out a week or two ago, intended to replace the now 4-year-old X450 series (both "A" and "E"). Notable differences & improvements include:

  • Dual hot swap internal power supplies
  • User swappable fan tray
  • Long distance stacking over 10GbE - up to 40 kilometers
  • Clear-Flow now available when the switches are stacked (prior hardware switches could not be stacked to use Clear-Flow)
  • Stacking module is now optional (X450 it was built in)
  • Standard license is Edge license (X450A was Advanced Edge) - still software upgradable all the way to Core license (BGP etc). My favorite protocol ESRP requires Advanced Edge and not Core licensing.
  • Hardware support for IPFIX, which they say is complementary to sFlow
  • Lifetime hardware warranty with advanced hardware replacement (X450E had lifetime, X450A did not)
  • Layer 3 Virtual Switching (yay!) - I first used this functionality on the Black Diamond 10808 back in 2005, it's really neat.

The X460 seems to be aimed at the mid to upper range of GbE switches, with the X480 being the high end offering.

4Nov/100

Chicken and the egg

TechOps Guy: Nate

Random thought time! --  came across an interesting headline on Chuck's Blog - Attack of the Vblock Clones.

Now I'm the first to admit I didn't read the whole thing, but the basic gist of what he is saying is that if you want a fully tested, integrated stack, you should go with their VBlock because it's there now, tested, deployed, etc. (Of course you know I don't like these stacks, they restrict you too much; the point of open systems is that you can connect many different types of systems together and have them work, but anyways.) Other recently announced initiatives are responses to the VBlock and VCE, Arcadia(sp?) etc.

I've brought up 3cV before, something that 3PAR coined back almost 3 years ago now. Which is, in their words "Validated Blueprint of 3PAR, HP, and VMware Products Can Halve Costs and Floor Space".

And for those that don't know what 3cV is, a brief recap -

The Elements of 3cV
3cV combines the following products from 3PAR, HP, and VMware to deliver the virtual data center:

  • 3PAR InServ Storage Server featuring Virtual Domains and thin technologies—The leading utility storage platform, the 3PAR InServ is a highly virtualized tiered-storage array built for utility computing. Organizations creating virtualized IT infrastructures for workload consolidation use the 3PAR InServ to reduce the cost of allocated storage capacity, storage administration, and the SAN infrastructure.
  • HP BladeSystem c-Class—The No. 1 blade infrastructure on the market for datacenters of all sizes, the HP BladeSystem c-Class minimizes energy and space requirements and increases administrative productivity through advantages in I/O virtualization, power and cooling, and manageability. (1)
  • VMware Infrastructure—Infrastructure virtualization suite for industry-standard servers. VMware Infrastructure delivers the production-proven efficiency, availability, and dynamic management needed to build the responsive data center.

Sounds to me that 3cV beat VBlock to the punch by quite a ways. It would have been interesting to see how Dell would have handled the 3cV solution had they managed to win the bidding war, given they don't have anything that competes effectively with c-Class. But fortunately HP won out so 3cV can be just that much more official.

It's not sold as a pre-packaged product I guess you could say, but I mean how hard is it to say I need this much CPU, this much ram, this much storage HP go get it for me. Really it's not hard. The hard part is all the testing and certification. Even if 3cV never existed you can bet your ass that it would work regardless. It's not that complicated, really. Even if Dell managed to buy 3PAR and kill off the 3cV program because they wouldn't want to directly promote HP's products, you could still buy the 3PAR from Dell and the blades from HP and have it work. But of course you know that.

The only thing missing from 3cV is I'd like a more powerful networking stack, or at least sFlow support. I'll take Flex10 (or Flexfabric) over Cisco any day of the week but I'd still like more.

I don't know why this thought didn't pop into my head until I read that headline, but it gave me something to write about.

But whatever, that's my random thought of the day/week.

8Oct/102

Manually inflating the memory balloon

TechOps Guy: Nate

As I'm sure you all know, one of the key technologies that VMware has offered for a long time is memory ballooning to free memory from idle guest OSs in order to return that memory to the pool.

My own real world experience managing hundreds of VMs in VMware has really made me want to do one thing more than anything else:

Manually inflate that damn memory balloon

I don't want to have to wait until there is real memory pressure on the system to reclaim that memory. I don't use windows so can't speak for it there, but Linux is very memory greedy. It will use all the memory it can for disk cache and the like.

What I'd love to see is a daemon (maybe vmware-tools even) run on the system, monitoring system load as well as how much memory is actually used. Many Linux newbies do not know how to calculate that; taking the amount of memory reported as available by the "free" command or the "top" command at face value is wrong. True memory usage on Linux is best calculated as:

  • [Total Memory] - [Free Memory] - [Buffers] - [Cache] = Used memory

I really wish there was an easy way to display that particular stat, because the numbers returned by the stock tools are so misleading. I can't tell you how many times I've had to explain to newbies that even though 'free' is saying there is 10MB available, there is PLENTY of ram on the box because there is 10 gigs of memory in cache. They say, "oh no we're out of memory we will swap soon!". Wrong answer.
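For what it's worth, the formula above is easy enough to script. This is just a minimal sketch, assuming the usual /proc/meminfo field names (MemTotal, MemFree, Buffers, Cached); the function names and the sample numbers are made up for illustration.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: value in kB}."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def true_used_kb(fields):
    """Total - Free - Buffers - Cache, per the formula above."""
    return (fields["MemTotal"] - fields["MemFree"]
            - fields["Buffers"] - fields["Cached"])


# Made-up sample: a 10GB box where "free" looks alarmingly low
# but almost everything is sitting in cache.
SAMPLE = """\
MemTotal:       10485760 kB
MemFree:           10240 kB
Buffers:          204800 kB
Cached:          9437184 kB
"""

info = parse_meminfo(SAMPLE)
print("free as reported: %d MB" % (info["MemFree"] // 1024))
print("truly used:       %d MB" % (true_used_kb(info) // 1024))
```

On a real box you would feed it open("/proc/meminfo").read() instead of the sample text.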

So back to my request. I want a daemon that runs on the system, watches system load, and watches true memory usage, and dynamically inflates that balloon to return that memory to the free pool, before the host runs low on memory. So often VMs that run idle really aren't doing anything, and when you're running on high grade enterprise storage, well you know there is a lot of fancy caching and wide striping going on there, the storage is really fast! Well it should be. Since the memory is not being used (sitting in cache that is not being used) - inflate that balloon and return it.
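The decision part of such a daemon could be sketched roughly like this. To be clear, this is purely hypothetical: the function name, the idle-load threshold and the fraction of cache to keep are all assumptions I made up, and actually inflating the balloon from inside the guest is exactly the piece that doesn't exist, so it isn't shown.

```python
def balloon_target_kb(cached_kb, buffers_kb, load1, ncpus,
                      idle_load_per_cpu=0.1, keep_fraction=0.25):
    """How much cache (in kB) an idle guest could hand back to the host.

    Returns 0 when the guest looks busy, on the theory that a busy
    guest is probably getting real benefit from its cache.
    All thresholds here are illustrative guesses, not tuned values.
    """
    if load1 > idle_load_per_cpu * ncpus:
        return 0
    reclaimable = cached_kb + buffers_kb
    # Keep a slice of the cache so the guest isn't left completely cold.
    return int(reclaimable * (1.0 - keep_fraction))


# Idle 2-vCPU guest sitting on ~9GB of cache: most of it could go back.
print(balloon_target_kb(9437184, 204800, load1=0.02, ncpus=2))
# Same guest under load: leave it alone.
print(balloon_target_kb(9437184, 204800, load1=1.5, ncpus=2))
```

A real daemon would poll the load average and /proc/meminfo on a timer and feed the numbers in.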

There really should be no performance hit. 99% of the time the cache is a read cache, not a write cache, so when you free up the cache the data is just dropped, it doesn't have to be flushed to disk (you can use the 'sync' command in a lot of cases to force a cache flush to see what I mean, typically the command returns instantaneously).

What I'd like even more than that though is to be able to better control how the Linux kernel allocates cache, and how frequently it frees it. I haven't checked in a little while but last I checked there wasn't much to control here.

I suppose that may be the next step in the evolution of virtualization - more intelligent operating systems that can be better aware they are operating in a shared environment, and return resources to the pool so others can play with them.

One approach might be to offload all of storage I/O caching to the hypervisor. I suppose this could be similar to using raw devices (which bypass several file system functions). Aggregating that caching at the hypervisor level should be more efficient.

 

7Oct/102

Testing the limits of virtualization

TechOps Guy: Nate

You know I'm a big fan of the AMD Opteron 6100 series processor, also a fan of the HP c class blade system, specifically the BL685c G7 which was released on June 21st. I was and am very excited about it.

It is interesting to think, it really wasn't that long ago that blade systems still weren't all that viable for virtualization primarily because they lacked the memory density, I mean so many of them offered a paltry 2 or maybe 4 DIMM sockets. That was my biggest complaint with them for the longest time. About a year or year and a half ago that really started shifting. We all know that Cisco bought some small startup a few years ago that had their memory extender ASIC but well you know I'm not a Cisco fan so won't give them any more real estate in this blog entry, I have better places to spend my mad typing skills.

A little over a year ago HP released their Opteron G6 blades, at the time I was looking at the half height BL485c G6 (guessing here, too lazy to check). It had 16 DIMM sockets, that was just outstanding. I mean the company I was with at the time really liked Dell (you know I hate Dell by now I'm sure), I was poking around their site at the time and they had no answer to that(they have since introduced answers), the highest capacity half height blade they had at the time anyways was 8 DIMM sockets.

I had always assumed that due to the more advanced design in the HP blades that you ended up paying a huge premium, but wow I was surprised at the real world pricing, more so at the time because you needed of course significantly higher density memory modules in the Dell model to compete with the HP model.

Anyways fast forward to the BL685c G7 powered by the Opteron 6174 processor, a 12-core 2.2GHz 80W processor.

Load a chassis up with eight of those:

  • 384 CPU cores (roughly 845GHz of compute)
  • 4 TB of memory (512GB/server w/32x16GB each)
  • 6,750 Watts @ 100% load (feel free to use HP dynamic power capping if you need it)
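The chassis totals are simple multiplication, sketched here for anyone who wants to plug in their own blade config. The per-blade figures (48 cores, 2.2GHz, 32x16GB) come from the post; the variable names are just mine.

```python
BLADES_PER_CHASSIS = 8    # full-height BL685c G7 blades
CORES_PER_BLADE = 48      # 12-core Opteron 6174s, 48 cores per server
CLOCK_GHZ = 2.2
DIMMS_PER_BLADE = 32
DIMM_GB = 16

total_cores = BLADES_PER_CHASSIS * CORES_PER_BLADE
total_ghz = round(total_cores * CLOCK_GHZ, 1)
mem_per_blade_gb = DIMMS_PER_BLADE * DIMM_GB
total_mem_tb = BLADES_PER_CHASSIS * mem_per_blade_gb / 1024.0

print("cores per chassis: %d" % total_cores)
print("aggregate clock:   %.1f GHz" % total_ghz)
print("memory per blade:  %d GB" % mem_per_blade_gb)
print("memory per chassis: %.0f TB" % total_mem_tb)
```

Swap DIMM_GB to 8 and the same chassis drops to 2TB, which is the 8GB vs 16GB trade-off in a nutshell.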

I've thought long and hard over the past 6 months on whether or not to go 8GB or 16GB, and all of my virtualization experience has taught me in every case I'm memory(capacity) bound, not CPU bound. I mean it wasn't long ago we were building servers with only 32GB of memory on them!!!

There is indeed a massive premium associated with going with 16GB DIMMs, but if your capacity utilization is anywhere near the industry average then it is well worth investing in those DIMMs for this system. Going from 2TB to 4TB of memory using 8GB chips in this configuration means buying a 2nd chassis and the associated rack/power/cooling + hypervisor licensing. You can easily halve your costs by just taking the jump to 16GB chips and keeping it in one chassis (or at least 8 blades - maybe you want to split them between two chassis, I'm not going to get into that level of detail here).

Low power memory chips aren't available for the 16GB chips so the power usage jumps by 1.2kW/enclosure for 512GB/server vs 256GB/server. A small price to pay, really.

So onto the point of my post - testing the limits of virtualization. When you're running 32, 64, 128 or even 256GB of memory on a VM server that's great, you really don't have much to worry about. But step it up to 512GB of memory and you might just find yourself maxing out the capabilities of the hypervisor. In vSphere 4.1, for example, you are limited to only 512 vCPUs per server or only 320 powered on virtual machines. So it really depends on your memory requirements. If you're able to achieve massive amounts of memory de-duplication (myself I have not had much luck here with Linux, it doesn't de-dupe well, Windows seems to de-dupe a lot though), you may find yourself unable to fully use the memory on the system, because you run out of the ability to fire up more VMs! I'm not going to cover other hypervisor technologies, they aren't worth my time at this point, but like I mentioned I do have my eye on KVM for future use.

Keep in mind 320 VMs is only about 6.7 VMs per CPU core on a 48-core server. That to me is not a whole lot for workloads I have personally deployed in the past. Now of course everybody is different.
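To see which wall you hit first on a given box, something like the following rough sketch works. The 320 VM and 512 vCPU limits are the vSphere 4.1 numbers quoted above; the function name, the overcommit knob and the sample VM sizes are assumptions for illustration only.

```python
MAX_VMS_PER_HOST = 320     # vSphere 4.1 figure quoted in the post
MAX_VCPUS_PER_HOST = 512   # vSphere 4.1 figure quoted in the post


def vm_capacity(host_mem_gb, vm_mem_gb, vcpus_per_vm, overcommit=1.0):
    """Return (max VMs, name of the constraint that binds first)."""
    limits = {
        "memory": int(host_mem_gb * overcommit // vm_mem_gb),
        "vcpu limit": MAX_VCPUS_PER_HOST // vcpus_per_vm,
        "vm limit": MAX_VMS_PER_HOST,
    }
    binding = min(limits, key=limits.get)
    return limits[binding], binding


# 512GB host, small 1GB single-vCPU guests: the VM count limit bites first.
print(vm_capacity(512, vm_mem_gb=1, vcpus_per_vm=1))
# Same guests with 2 vCPUs each: now the vCPU limit bites first.
print(vm_capacity(512, vm_mem_gb=1, vcpus_per_vm=2))
# Fatter 4GB guests: memory is the limit again, like the old days.
print(vm_capacity(512, vm_mem_gb=4, vcpus_per_vm=2))
```

With small guests on a 512GB host you run out of hypervisor before you run out of memory, which is the whole point.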

But it got me thinking, I mean The Register has been touting off and on for the past several months every time a new Xeon 7500-based system launches ooh they can get 1TB of ram in the box. Or in the case of the big new bad ass HP 8-way system you can get 2TB of ram. Setting aside the fact that vSphere doesn't go above 1TB, even if you go to 1TB I bet in most cases you will run out of virtual CPUs before you run out of memory.

It was interesting to see, in the "early" years, the hypervisor technology exploiting hardware really well, and now we see the real possibility of hitting a scalability wall, at least as far as a single system is concerned. I have no doubt that VMware will address these scalability issues, it's only a matter of time.

Are you concerned about running your servers with 512GB of ram? After all that is a lot of "eggs" in one basket(as one expert VMware consultant I know & respect put it). For me at smaller scales I am really not too concerned. I have been using HP hardware for a long time and on the enterprise end it really is pretty robust. I have the most concerns about memory failure, or memory errors. Fortunately HP has had Advanced ECC for a long time now(I think I remember even seeing it in the DL360 G2 back in '03).

HP's Advanced ECC spreads the error correcting over four different ECC chips, and it really does provide quite robust memory protection. When I was dealing with cheap crap white box servers the #1 problem BY FAR was memory, I can't tell you how many memory sticks I had to replace it was sick. The systems just couldn't handle errors (yes all the memory was ECC!).

By contrast, honestly I can't even think of a time an enterprise HP server failed (e.g. crashed) due to a memory problem. I recall many times the little amber status light would come on and I'd log into the iLO and say, oh, memory errors on stick #2, so I'd go replace it. But no crash! There was a firmware bug in the HP DL585G1s I used to use that would cause them to crash if too many errors were encountered, but that was a bug that was fixed years ago, not a fault with the system design. I'm sure there have been other such bugs here and there, nothing is perfect.

Dell introduced their version of Advanced ECC about a year ago, but it doesn't (or at least didn't maybe it does now) hold a candle to the HP stuff. The biggest issue with the Dell version of Advanced ECC was if you enabled it, it disabled a bunch of your memory sockets! I could not get an answer out of Dell support at the time at least why it did that. So I left it disabled because I needed the memory capacity.

So combine Advanced ECC with ultra dense blades with 48 cores and 512GB of memory apiece and you've got yourself a serious compute resource pool.

Power/cooling issues aside (maybe if you're lucky you can get in to SuperNap down in Vegas) you can get up to 1,500 CPU cores and 16TB of memory in a single cabinet. That's just nuts! WAY beyond what you expect to be able to support in a single VMware cluster (being that you're limited to 3,000 powered on VMs per cluster - the density would be only 2 VMs/core and 5GB/VM!)

And if you manage to get a 47U rack, well you can get one of those c3000 chassis in the rack on top of the four c7000 and get another 2TB of memory and 192 cores. We're talking power kicking up into the 27kW range in a single rack! Like I said you need SuperNap or the like!

Think about that for a minute, 1,500 CPU cores and 16TB of memory in a single rack. Multiply that by say 10 racks. 15,000 CPU cores and 160TB of memory. How many tens of thousands of physical servers could be consolidated into that? A conservative number may be 7 VMs/core, you're talking 105,000 physical servers consolidated into ten racks. Well excluding storage of course. Think about that! Insane! I mean that's consolidating multiple data centers into a high density closet! That's taking tens to hundreds of megawatts of power off the grid and consolidating it into a measly 250 kW.
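Here is the rack math spelled out, as a back-of-the-envelope sketch using the figures from the post (four c7000s per rack, 384 cores, 4TB and 6,750 watts per chassis). The post rounds to 1,500 cores per rack and about 250 kW; working from the unrounded per-chassis numbers lands a little higher on both.

```python
CHASSIS_PER_RACK = 4       # c7000 enclosures
CORES_PER_CHASSIS = 384    # 8 blades x 48 cores
MEM_TB_PER_CHASSIS = 4
WATTS_PER_CHASSIS = 6750   # at 100% load
RACKS = 10
VMS_PER_CORE = 7           # the "conservative" ratio from the post
CLUSTER_VM_LIMIT = 3000    # powered-on VMs per cluster, per the post

cores_per_rack = CHASSIS_PER_RACK * CORES_PER_CHASSIS
mem_tb_per_rack = CHASSIS_PER_RACK * MEM_TB_PER_CHASSIS
kw_per_rack = CHASSIS_PER_RACK * WATTS_PER_CHASSIS / 1000.0

# Density if one rack were a single cluster capped at 3,000 VMs.
vms_per_core_at_limit = CLUSTER_VM_LIMIT / float(cores_per_rack)
gb_per_vm_at_limit = mem_tb_per_rack * 1024.0 / CLUSTER_VM_LIMIT

# Ten racks at 7 VMs per core.
total_cores = RACKS * cores_per_rack
total_vms = total_cores * VMS_PER_CORE
total_kw = RACKS * kw_per_rack

print("per rack: %d cores, %d TB, %.1f kW"
      % (cores_per_rack, mem_tb_per_rack, kw_per_rack))
print("at the cluster limit: %.2f VMs/core, %.2f GB/VM"
      % (vms_per_core_at_limit, gb_per_vm_at_limit))
print("ten racks: %d cores, %d VMs, %.0f kW"
      % (total_cores, total_vms, total_kw))
```

Either way you slice it, it's six figures of servers in a few hundred kilowatts.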

I built out, what was to me some pretty beefy server infrastructure back in 2005, around a $7 million project. Part of it included roughly 300 servers in roughly 28 racks. There was 336kW of power provisioned for those servers.

Think about that for a minute. And re-read the previous paragraph.

I have thought for quite a while because of this trend, the traditional network guy or server guy is well, there won't be as many of them around going forward. When you can consolidate that much crap in that small of a space, it's just astonishing.

One reason I really do like the Opteron 6100 is the cpu cores, just raw cores. And they are pretty fast cores too. The more cores you have the more things the hypervisor can do at the same time, and there is no possibility of contention like there is with hyperthreading. CPU processing capacity has gotten to a point, I believe, where raw cpu performance matters much less than getting more cores on the boxes. More cores means more consolidation. After all industry utilization rates for CPUs are typically sub 30%. Though in my experience it's typically sub 10%, and a lot of times sub 5%. My own server sits at less than 1% cpu usage.

Now fast raw speed is still important in some applications of course. I'm not one to promote the usage of a 100 core CPU with each core running at 100MHz (10GHz in aggregate), there is a balance that has to be achieved, and I really do believe the Opteron 6100 has achieved that balance. I look forward to the 6200 (socket compatible 16 core). Ask anyone that has known me this decade, I have not been AMD's strongest supporter for a very long period of time. But I see the light now.

16Sep/102

Fusion IO now with VMware support

TechOps Guy: Nate

About damn time! I read earlier in the year on their forums that they were planning on ESX support for their next release of code, originally expected sometime in March/April or something. But that time came and went and I saw no new updates.

I saw that Fusion IO put on a pretty impressive VDI demonstration at VMworld, so I figured they must have VMware support now, and of course they do.

I would be very interested to see how performance could be boosted and VM density increased by leveraging local Fusion IO storage for swap in ESX. I know of a few 3PAR customers that say they get double the VM density per host vs other storage because of the better I/O they get from 3PAR, though of course Fusion IO is quite a bit snappier.

With VMware's ability to set swap file locations on a per-host basis, it's pretty easy to configure. In order to take advantage of it, though, I think you'd have to disable memory ballooning in the guests in order to force the host to swap. I don't think I would go so far as to try to put individual swap partitions on the local Fusion IO for the guests to swap to directly, at least not when I'm using a shared storage system.

I just checked again, and as far as I can tell, from a blade perspective at least, the only player offering Fusion IO modules for their blades is still the HP c Class in the form of their IO Accelerator. With up to two expansion slots on the half width, and three on the full width blades, there's plenty of room for the 80 or 160GB SLC models or the 320GB MLC model. And if you were really crazy I guess you could use the "standard" Fusion IO cards with the blades by using the PCI Express expansion module, though that seems more geared towards video cards as upcoming VDI technologies leverage hardware GPU acceleration.

HP's Fusion IO-based I/O Accelerator

FusionIO claims to be able to write 5TB per day for 24 years, even if you cut that to 2TB per day for 5 years, it's quite an amazing claim.
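To put those endurance numbers in perspective, here's the arithmetic as a quick sketch. The rates and durations are the ones quoted above; the function name and the 365-day year are my own simplifications.

```python
def lifetime_writes_tb(tb_per_day, years, days_per_year=365):
    """Total data written over the period, in TB (ignores leap days)."""
    return tb_per_day * days_per_year * years


claimed = lifetime_writes_tb(5, 24)     # the vendor claim
discounted = lifetime_writes_tb(2, 5)   # the heavily discounted version

print("claimed:    %d TB (%.1f PB)" % (claimed, claimed / 1000.0))
print("discounted: %d TB (%.2f PB)" % (discounted, discounted / 1000.0))
```

Even the discounted case is several petabytes written to a single card.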

From what I have seen (can't speak with personal experience just yet), the biggest advantage Fusion IO has over more traditional SSDs is write performance, of course to get optimal write performance on the system you do need to sacrifice space.

Unlike drive form factor devices, the ioDrive can be tuned to achieve a higher steady-state write performance than what it is shipped with from the factory.

7Sep/102

vSphere VAAI only in the Enterprise

TechOps Guy: Nate

Beam me up!

Damn those folks at VMware..

Anyways I was browsing around this afternoon looking around at things and while I suppose I shouldn't be I was surprised to see that the new storage VAAI APIs are only available to people running Enterprise or Enterprise Plus licensing.

I think at least the block level hardware based locking for VMFS should be available to all versions of vSphere, after all VMware is offloading the work to a 3rd party product!

VAAI certainly looks like it offers some really useful capabilities, from the documentation on the 3PAR VAAI plugin (which is free) here are the highlights:

  • Hardware Assisted Locking is a new VMware vSphere storage feature designed to significantly reduce impediments to VM reliability and performance by locking storage at the block level instead of the logical unit number (LUN) level, which dramatically reduces SCSI reservation contentions. This new capability enables greater VM scalability without compromising performance or reliability. In addition, with the 3PAR Gen3 ASIC, metadata comparisons are executed in silicon, further improving performance in the largest, most demanding VMware vSphere and desktop virtualization environments.
  • The 3PAR Plug-In for VAAI works with the new VMware vSphere Block Zero feature to offload large, block-level write operations of zeros from virtual servers to the InServ array, boosting efficiency during several common VMware vSphere operations— including provisioning VMs from Templates and allocating new file blocks for thin provisioned virtual disks. Adding further efficiency benefits, the 3PAR Gen3 ASIC with built-in zero-detection capability prevents the bulk zero writes from ever being written to disk, so no actual space is allocated. As a result, with the 3PAR Plug-In for VAAI and the 3PAR Gen3 ASIC, these repetitive write operations now have “zero cost” to valuable server, storage, and network resources—enabling organizations to increase both VM density and performance.
  • The 3PAR Plug-In for VAAI adds support for the new VMware vSphere Full Copy feature to dramatically improve the agility of enterprise and cloud datacenters by enabling rapid VM deployment, expedited cloning, and faster Storage vMotion operations. These administrative tasks are now performed in half the time. The 3PAR plug-in not only leverages the built-in performance and efficiency advantages of the InServ platform, but also frees up critical physical server and network resources. With the use of 3PAR Thin Persistence and the 3PAR Gen3 ASIC to remove duplicated zeroed data, data copies become more efficient as well.

Cool stuff. I'll tell you what. I really never had all that much interest in storage until I started using 3PAR about 3 and a half years ago. I mean I've spread my skills pretty broadly over the past decade, and I only have so much time to do stuff.

About five years ago some co-workers tried to get me excited about NetApp, though for some reason I never could get too excited about their stuff, sure it has tons of features which is nice, though the core architectural limitations of the platform (from a spinning rust perspective at least) I guess is what kept me away from them for the most part. If you really like NetApp, put a V-series in front of a 3PAR and watch it scream. I know of a few 3PAR/NetApp users that are outright refusing to entertain the option of running NetApp storage, they like the NAS, and keep the V-series but the back end doesn't perform.

On the topic of VMFS locking - I keep seeing folks pimping the NFS route attack the VMFS locking as if there was no locking in NFS with vSphere. I'm sure prior to block level locking the NFS file level locking (assuming it is file level) is more efficient than LUN level. Though to be honest I've never encountered issues with SCSI reservations in the past few years I've been using VMFS. Probably because of how I use it. I don't do a lot of activities that trigger reservations short of writing data.

Another graphic which I thought was kind of funny is the current Gartner group "magic quadrant". Someone posted a link to it for VMware in a somewhat recent post. Myself I don't rely on Gartner, but I did find the lopsidedness of the situation for VMware quite amusing -

I've been using VMware since before 1.0, I still have my VMware 1.0.2 CD for Linux. I deployed VMware GSX to production for an e-commerce site in 2004, I've been using it for a while, I didn't start using ESX until 3.0 came out (from what I've read about the capabilities of previous versions I'm kinda glad I skipped them :) ). It's got to be the most solid piece of software I've ever used, besides Oracle I suppose. I mean I really, honestly can not remember it ever crashing. I'm sure it has, but it's been so rare that I have no memory of it. It's not flawless by any means, but it's solid. And VMware has done a lot to build up my loyalty to them over the past, what is it now eleven years? Like most everyone else at the time, I had no idea back then that we'd be doing the stuff with virtualization that we are today.

I've kept my eyes on other hypervisors as they come around, though even now none of the rest look very compelling. About two and a half years ago my new boss at the time was wanting to cut costs, and was trying to pressure me into trying the "free" Xen that came with CentOS at the time. He figured a hypervisor is a hypervisor. Well it's not. I refused. Eventually I left the company and my two esteemed colleagues were forced into trying it after I left (hey Dave and Tycen!), they worked on it for a month before giving up and going back to VMware. What a waste of time..

I remember Tycen at about the same time being pretty excited about Hyper-V. Well at a position he recently held he got to see Hyper-V in all its glory, and well he was happy to get out of that position and not have to use Hyper-V anymore.

Though I do think KVM has a chance, I think it's too early to use it for anything too serious at this point, though I'm sure that's not stopping tons of people from doing it anyways, just like it didn't stop me from running production on GSX way back when. But I suspect by the time vSphere 5.0 comes out, which I'm just guessing here will be in the 2012 time frame, KVM as a hypervisor will be solid enough to use in a serious capacity. VMware will of course have a massive edge on management tools and fancy add ons, but not everyone needs all that stuff (me included). I'm perfectly happy with just vSphere and vCenter (be even happier if there was a Linux version of course).

I can't help but laugh at the grand claims Red Hat is making for KVM scalability though. Sorry I just don't buy that the Linux kernel itself can reach such heights and be solid & scalable, let alone a hypervisor running on top of Linux (and before anyone asks, NO ESX does NOT run on Linux).

I love Linux, I use it every day on my servers and my desktops and laptops, have been for more than a decade. Despite all the defectors to the Mac platform I still use Linux :) (I actually honestly tried a MacBook Pro for a couple weeks recently and just couldn't get it to a usable state).

Just because the system boots with X number of CPUs and X amount of memory doesn't mean it's going to be able to effectively scale to use it right. I'm sure Linux will get there some day, but believe it is a ways off.

23Aug/102

HP to the rescue

TechOps Guy: Nate

Knock knock.. HP is kicking down your back door 3PAR..

Well that's more like it, HP offered $1.6 Billion to acquire 3PAR this morning topping Dell's offer by 33%. Perhaps the 3cV solution can finally be fully backed by HP. More info from The Register here. And more info on what this could mean to HP and 3PAR products from the same source here.

3PAR's website is having serious issues, this obviously has spawned a ton of interest in the company, I get intermittent blank pages and connection refused messages.

I didn't wake my rep up for this one.

The 3cV solution was announced about three years ago -

Elements of the 3cV solution include:

  • 3PAR InServ® Storage Servers—highly virtualized, tiered-storage arrays built for utility computing. Organizations creating virtualized IT infrastructures for workload consolidation use InServ arrays to reduce the cost of allocated storage capacity, storage administration, and SAN infrastructure.
  • HP BladeSystem c-Class Server Blades—the leading blade server infrastructure on the market for datacenters of all sizes. HP BladeSystem c-Class server blades minimize energy and space requirements and increase administrative productivity through advantages in I/O virtualization, powering and cooling, and manageability.
  • VMware vSphere—the leading virtualization platform for industry-standard servers. VMware vSphere helps customers reduce capital and operating expenses, improve agility, ensure business continuity, strengthen security, and go green.

While I could not find the image that depicts the 3cV solution(not sure how long it's been gone for), here is more info on it for posterity.

The Advantages of 3cV
3cV offers combined benefits that enable customers to manage and scale their server and storage environments simply, allowing them to halve server, storage and operational costs while lowering the environmental impact of the datacenter.

  • Reduces storage and server costs by 50%—The inherently modular architectures of the HP BladeSystem c-Class and the 3PAR InServ Storage Server—coupled with the increased utilization provided by VMware Infrastructure and 3PAR Thin Provisioning—allow 3cV customers to do more with less capital expenditure. As a result, customers are able to reduce overall storage and server costs by 50% or more. High levels of availability and disaster recovery can also be affordably extended to more applications through VMware Infrastructure and 3PAR thin copy technologies.
  • Cuts operational costs by 50% and increases business agility—With 3cV, customers are able to provision and change server and storage resources on demand. By using VMware Infrastructure's capabilities for rapid server provisioning and the dynamic optimization provided by VMware VMotion and Distributed Resource Scheduler (DRS), HP Virtual Connect and Insight Control management software, and 3PAR Rapid Provisioning and Dynamic Optimization, customers are able to provision and re-provision physical servers, virtual hosts, and virtual arrays with tailored storage services in a matter of minutes, not days. These same technologies also improve operational simplicity, allowing overall server and storage administrative efficiency to increase by 3x or more.
  • Lowers environmental impact—With 3cV, customers are able to cut floor space and power requirements dramatically. Server floor space is minimized through server consolidation enabled by VMware Infrastructure (up to 70% savings) and HP BladeSystem density (up to 50% savings). Additional server power requirements are cut by 30% or more through the unique virtual power management capabilities of HP Thermal Logic technology. Storage floor space is reduced by the 3PAR InServ Storage Server, which delivers twice the capacity per floor tile as compared to alternatives. In addition, 3PAR thin technologies, Fast RAID 5, and wide striping allow customers to power and cool as much as 75% less disk capacity for a given project without sacrificing performance.
  • Delivers security through virtualization, not dedicated hardware silos—Whereas traditional datacenter architectures force tradeoffs between high resource utilization and the need for secure segregation of application resources for disparate user groups, 3cV resolves these competing needs through advanced virtualization. For instance, just as VMware Infrastructure securely isolates virtual machines on shared servers, 3PAR Virtual Domains provides secure "virtual arrays" for private, autonomous storage provisioning from a single, massively-parallel InServ Storage Server.

Due to the recent stack wars, though, it's been hard for 3PAR to partner with HP to promote this solution, since I'm sure HP would rather push their own full stack. Well, hopefully now they can. The best of both worlds, technology wise, can come together.

More details from 3PAR's VMware products site.

From HP's offer letter -

We propose to increase our offer to acquire all of 3PAR outstanding common stock to $24.00 per share in cash. This offer represents a 33.3% premium to Dell’s offer price and is a “Superior Proposal” as defined in your merger agreement with Dell. HP’s proposal is not subject to any financing contingency. HP’s Board of Directors has approved this proposal, which is not subject to any additional internal approvals. If approved by your Board of Directors, we expect the transaction would close by the end of the calendar year.

In addition to the compelling value offered by our proposal, there are unparalleled strategic benefits to be gained by combining these two organizations. HP is uniquely positioned to capitalize on 3PAR’s next-generation storage technology by utilizing our global reach and superior routes to market to deliver 3PAR’s products to customers around the world. Together, we will accelerate our ability to offer unmatched levels of performance, efficiency and scalability to customers deploying cloud or scale-out environments, helping drive new growth for both companies.
As a Silicon Valley-based company, we share 3PAR’s passion for innovation.
[..]

We understand that you will first need to communicate this proposal and your Board’s determinations to Dell, but we are prepared to execute the merger agreement immediately following your termination of the Dell merger agreement.

Music to my ears.

[tangent -- begin]

My father worked for HP in the early days, back when they were even more innovative than they are today; he recalled their first $50M revenue year. He retired from HP in the early 90s after something like 25-30 years.

I attended my freshman year at Palo Alto Senior High School, and one of my classmates/friends (actually I don't think I shared any classes with him, now that I think about it) was Ben Hewlett, grandson of one of the founders of HP. Along with a couple of other friends, Ryan and Jon, we played a bunch of RPGs (I think the main one was Twilight 2000, something one of my other friends, Brian, introduced me to in 8th grade).

I remember asking Ben one day why he took Japanese as his second language course when it was significantly more difficult than Spanish (which was the easy route, probably still is?). I don't think I'll ever forget his answer. He said "because my father says it's the business language of the future.."

How times have changed.. Now it seems everyone is busy teaching their children Chinese. I'm happy knowing English, and a touch of bash and perl.

I never managed to keep in touch with my friends from Palo Alto; after one short year there I moved back to Thailand for two more years of high school.

[tangent -- end]

HP could do some cool stuff with 3PAR; they have much better technology overall. I have no doubt HP has their eyes on their HDS partnership, and the possibility of replacing their XP line with 3PAR technology in the future has got to be pretty enticing. HDS hasn't done a whole lot recently, and I read not long ago that regardless of what HP says, they don't have much (if any) input into the HDS product line.

The HP USP-V OEM relationship is with Hitachi SSG. The Sun USP-V reseller deal was struck with HDS. Mikkelsen said: "HP became a USP-V OEM in 2004 when the USP-V was already done. HP had no input to the design and, despite what they say, very little input since." HP has been a Hitachi OEM since 1999.

Another interesting tidbit of information from the same article:

It [HDS] cannot explain why it created the USP-V - because it didn't, Hitachi SSG did, in Japan, and its deepest thinking and reasons for doing so are literally lost in translation.

The loss of HP as an OEM customer, so soon after losing Sun as an OEM customer, would be a really serious blow to HDS (one person I know claimed it accounts for ~50% of their business), which seems to have a difficult time selling stuff in western countries; I've read it's mostly because of their culture. Similarly, it seems Fujitsu has issues selling stuff in the U.S. at least; they seem to have some good storage products, but not much attention is paid to them outside of Asia (and maybe Europe). Will HDS end up like Fujitsu as a result of HP buying 3PAR? Not right away for sure, but longer term they stand to lose a ton of market share in my opinion.

And with the USP getting a little stale (rumor has it they are near to announcing a technology refresh for it), it would be good timing for HP to get 3PAR, to cash in on the upgrade cycle by getting customers to go with the T class arrays instead of the updated USP whenever possible.

I read on an HP blog earlier in the year an interesting comment -

The 3PAR is drastically less expensive than an XP, but is an active/active concurrent design, can scale up to 8 clustered controllers, highly virtualized, customers can self-install, self-maintain, and requires no professional services. It's on par with the XP in terms of raw performance, but has the ease of use of the EVA. Like the XP, the 3PAR can be carved up into virtual domains so that service providers or multi-tenant arrays can have delegated administration.

I still think 3PAR is worth more, and should stay independent, but given the current situation would much rather have them in the arms of HP than Dell.

Obviously those analysts that said Dell paid too much for 3PAR were wrong, and didn't understand the value of the 3PAR technology. HP does, otherwise they wouldn't be offering 33% more cash.

After the collapse of so many of 3PAR's NAS partners over the past couple of years, the possibility of having Ibrix available again for a longer term solution is pretty good. Dell bought Exanet's IP earlier in the year. LSI owns Onstor, HP bought Polyserve and Ibrix. Really just about no "open" NAS players left. Isilon seems to be among the biggest NAS players left but of course their technology is tightly integrated into their disk drive systems, same with Panasas.

Maybe that recent legal investigation into the board at 3PAR had some merit after all.

Dell should take their $billion and shove it in Pillar's (or was it Compellent? I forgot) face, so the CEO there can make his dream of being a billion dollar storage company come true, if only for a short time.

I'm not a stockholder or anything; I don't buy stocks (or bonds).

10Mar/100

Save 50% off vSphere essentials for the next 90 days

TechOps Guy: Nate

Came across this today, which mentions you can save about 50% when licensing vSphere Essentials for the next ~90 days. As you may know, Essentials is a really cheap way to get your vSphere stuff managed by vCenter. For your average dual socket 16-blade system, as an example, it is 91% cheaper (savings of ~$26,000) than going with vSphere Standard edition. Note that the vCenter included with Essentials needs to be thrown away if you're managing more than three hosts with it. You'll still need to buy vCenter Standard (regardless of what version of vSphere you buy).

28Feb/104

VMware dream machine

TechOps Guy: Nate

(Originally titled forty eight all round, I like VMware dream machine more)

UPDATED I was thinking more about the upcoming 12-core Opterons and the next generation of HP c-Class blades, and thought of a pretty cool configuration to have; hopefully it becomes available.

Imagine a full height blade that is quad socket, 48 cores (91-115GHz), 48 DIMMs (192GB with 4GB sticks), 4x10Gbps Ethernet links and 2x4Gbps Fibre Channel links (total of 48Gbps of full duplex bandwidth). The new Opterons support 12 DIMMs per socket, allowing the 48 DIMM slots.
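
The "48 all round" arithmetic can be checked with a quick sketch. The socket, core, DIMM and link counts come from the paragraph above; the 1.9 to 2.4 GHz per-core clock range is an assumption (it is not stated in the post), picked only because it reproduces the 91-115GHz aggregate figure.

```python
# Back-of-the-envelope totals for the hypothetical quad socket blade.
SOCKETS = 4
CORES_PER_SOCKET = 12            # upcoming 12-core Opterons
DIMMS_PER_SOCKET = 12            # 12 DIMM slots per socket
DIMM_SIZE_GB = 4                 # 4GB sticks
ETHERNET_LINKS_GBPS = [10, 10, 10, 10]
FC_LINKS_GBPS = [4, 4]
# ASSUMPTION: per-core clock range in GHz, not taken from the post.
ASSUMED_CLOCK_RANGE_GHZ = (1.9, 2.4)


def blade_totals():
    """Return the headline totals for one blade as a dict."""
    cores = SOCKETS * CORES_PER_SOCKET
    dimms = SOCKETS * DIMMS_PER_SOCKET
    return {
        "cores": cores,
        "dimms": dimms,
        "memory_gb": dimms * DIMM_SIZE_GB,
        "bandwidth_gbps": sum(ETHERNET_LINKS_GBPS) + sum(FC_LINKS_GBPS),
        # Aggregate clock is just cores x clock, rounded to whole GHz.
        "aggregate_ghz": tuple(round(cores * c) for c in ASSUMED_CLOCK_RANGE_GHZ),
    }


if __name__ == "__main__":
    for name, value in blade_totals().items():
        print(f"{name}: {value}")
```

Swapping the 4Gbps Fibre Channel links for 8Gbps ones in `FC_LINKS_GBPS` pushes the bandwidth total to 56, which is why the 4Gbps links stay.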

Why 4x10Gbps links? Well, I was thinking why not.. with full height blades you can only fit 8 blades in a c7000 chassis. If you put a pair of 2x10Gbps switches in that gives you 16 ports. It's not much more $$ to double up on 10Gbps ports, especially if you're talking about spending upwards of say $20k on the blade (guesstimate) and another $9-15k on vSphere software per blade. And 4x10Gbps links gives you up to 16 virtual NICs using Virtual Connect per blade, each of them adjustable in 100Mbps increments.

Also given the fact that it is a full height blade, you have access to two slots worth of I/O, which translates into 320Gbps of full duplex fabric available to a single blade.

That kind of blade ought to handle just about anything you can throw at it. It's practically a supercomputer in and of itself. Right now HP holds the top spot for VMmark scores, with an 8-socket 6-core system (48 total cores) outpacing even a 16-socket 4-core system (64 total cores).

The 48 CPU cores will give the hypervisor an amazing number of combinations for scheduling vCPUs. Here's a slide from a presentation I was at last year which illustrates the concept behind the hypervisor scheduling single and multi vCPU VMs:

There is a PDF out there from VMware that talks about the math formulas behind it all; it has some interesting commentary on CPU scheduling with hypervisors:

[..]Extending this principle, ESX Server installations with a greater number of physical CPUs offer a greater chance of servicing competing workloads optimally. The chance that the scheduler can find room for a particular workload without much reshuffling of virtual machines will always be better when the scheduler has more CPUs across which it can search for idle time.

This is even cooler though, honestly I can't pretend to understand the math myself! -

Scheduling a two-VCPU machine on a two-way physical ESX Server host provides only one possible allocation for scheduling the virtual machine. The number of possible scheduling opportunities for a two-VCPU machine on a four-way or eight-way physical ESX Server host is described by combinatorial mathematics using the formula N! / (R!(N-R)!) where N=the number of physical CPUs on the ESX Server host and R=the number of VCPUs on the machine being scheduled. A two-VCPU virtual machine running on a four-way ESX Server host provides (4! / (2! (4-2)!) which is (4*3*2 / (2*2)) or 6 scheduling possibilities. For those unfamiliar with combinatory mathematics, X! is calculated as X(X-1)(X-2)(X-3)…. (X- (X-1)). For example 5! = 5*4*3*2*1.

Using these calculations, a two-VCPU virtual machine on an eight-way ESX Server host has (8! / (2! (8-2)!) which is (40320 / (2*720)) or 28 scheduling possibilities. This is more than four times the possibilities a four-way ESX Server host can provide. Four-vCPU machines demonstrate this principle even more forcefully. A four-vCPU machine scheduled on a four-way physical ESX Server host provides only one possibility to the scheduler whereas a four-VCPU virtual machine on an eight-CPU ESX Server host will yield (8! / (4!(8-4)!) or 70 scheduling possibilities, but running a four-vCPU machine on a sixteen-way ESX Server host will yield (16! / (4!(16-4)!) which is (20922789888000 / ( 24*479001600) or 1820 scheduling possibilities. That means that the scheduler has 1820 unique ways in which it can place the four-vCPU workload on the ESX Server host. Doubling the physical CPU count from eight to sixteen results in 26 times the scheduling flexibility for the four-way virtual machines. Running a four-way virtual machine on a Host with four times the number of physical processors (16-way ESX Server host) provides over six times more flexibility than we saw with running a two-way VM on a Host with four times the number of physical processors (8-way ESX Server host).

Anyone want to try to extrapolate that and extend it to a 48-core system? :)
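
For anyone who does want to extrapolate, here is a minimal sketch that applies the quoted N! / (R!(N-R)!) formula using Python's standard `math.comb` (Python 3.8+). The function name `scheduling_possibilities` is made up for illustration. It only counts combinations the way the VMware paper does, treating every core as interchangeable; it says nothing about how the ESX scheduler actually places vCPUs.

```python
from math import comb  # available in Python 3.8 and later


def scheduling_possibilities(physical_cpus, vcpus):
    """Count the ways to choose `vcpus` CPUs out of `physical_cpus`.

    This is the N! / (R!(N-R)!) formula from the quoted paper, with
    N = physical_cpus and R = vcpus.
    """
    return comb(physical_cpus, vcpus)


if __name__ == "__main__":
    for host_cpus in (4, 8, 16, 48):
        for vm_vcpus in (2, 4):
            count = scheduling_possibilities(host_cpus, vm_vcpus)
            print(f"{vm_vcpus}-vCPU VM on a {host_cpus}-way host: {count:,}")
```

This reproduces the 6, 28, 70 and 1820 figures from the quote. On a 48-core host the same formula gives 1,128 possibilities for a two-vCPU VM and 194,580 for a four-vCPU VM.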

It seems like only yesterday that I was building DL380G5 ESX 3.5 systems with 8 CPU cores and 32GB of RAM, with 8x1Gbps links, thinking of how powerful they were. This would be six of those in a single blade. And it only seems like a couple of weeks ago I was building VMware GSX systems with dual socket single core systems and 16GB RAM..

So, HP, do me a favor and make a G7 blade that can do this, that would make my day! I know fitting all of those components on a single full height blade won't be easy. Looking at the existing BL685c blade, it looks like they could do it: remove the internal disks (who needs em, boot from SAN or something), and add an extra 16 DIMMs for a total of 48.

I thought about using 8Gbps Fibre Channel but then it wouldn't be 48 all round :)

UPDATE Again I was thinking about this and wanted to compare the costs vs existing technology. I'm estimating roughly a $32,000 price tag for this kind of blade and vSphere Advanced licensing (note you cannot use Enterprise licensing on a 12-core CPU; hardware pricing is extrapolated from the existing HP BL685c G6 quad socket 6-core blade system with 128GB RAM). The approximate price of an 8-way 48-core HP DL785 with 192GB, 4x10GbE and 2x4Gb Fibre Channel with vSphere licensing comes to roughly $70,000 (because VMware charges on a per socket basis the licensing costs go up fast). Not only that, but you can only fit 6 of these DL785 servers in a 42U rack, while you can fit 32 of these blades in the same rack with room to spare. So less than half the cost, and 5 times the density (for the same configuration). The DL785 has an edge in memory slot capacity, which isn't surprising given its massive size; it can fit 64 DIMMs vs 48 on my VMware dream machine blade.

Compared to a trio of HP BL495c blades, each with 12 cores and 64GB of memory: approximate pricing for that plus vSphere Advanced is $31,000 for a total of 36 cores and 192GB of memory. So for $1,000 more you can add an extra 12 cores, cut your server count by 66%, probably cut your power usage by some amount, and improve consolidation ratios.
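
The comparison above boils down to a few ratios, sketched here. Every price is the rough estimate quoted in the text (hardware plus vSphere Advanced, support not included), not a list price, and the dictionary keys and function names are made up for illustration.

```python
# Rough estimates quoted in the post; none of these are list prices.
CONFIGS = {
    "dream blade": {"price": 32_000, "cores": 48, "memory_gb": 192, "servers": 1},
    "DL785": {"price": 70_000, "cores": 48, "memory_gb": 192, "servers": 1},
    "3x BL495c": {"price": 31_000, "cores": 36, "memory_gb": 192, "servers": 3},
}
BLADES_PER_RACK = 32   # full height blades in a 42U rack, per the post
DL785_PER_RACK = 6     # DL785 servers in a 42U rack, per the post


def cost_per_core(name):
    """Estimated dollars per CPU core for one configuration."""
    cfg = CONFIGS[name]
    return cfg["price"] / cfg["cores"]


def price_ratio(name, baseline):
    """Price of `name` as a fraction of the price of `baseline`."""
    return CONFIGS[name]["price"] / CONFIGS[baseline]["price"]


def density_ratio():
    """How many times more blades than DL785s fit in one rack."""
    return BLADES_PER_RACK / DL785_PER_RACK


if __name__ == "__main__":
    for name in CONFIGS:
        print(f"{name}: ${cost_per_core(name):,.0f} per core")
    print(f"blade vs DL785 price: {price_ratio('dream blade', 'DL785'):.0%}")
    print(f"rack density advantage: {density_ratio():.1f}x")
```

With these inputs the blade works out to roughly $667 per core against about $1,458 for the DL785 and $861 for the BL495c trio, at about 46% of the DL785 price and a bit over 5 times the rack density.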

So to summarize, two big reasons for this type of solution are:

  • More efficient consolidation on a per-host basis by having fewer "stranded" resources
  • More efficient consolidation on a per-cluster basis because you can get more capacity in the 32-node limit of a VMware cluster (assuming you want to build a cluster that big..), again addressing the "stranded capacity" issue. Imagine what a resource pool could do with 3.3 THz of compute capacity and 9.2TB of memory? All with line rate 40Gbps networking throughout? All within a single cabinet?

Pretty amazing stuff to me anyways.

[For reference - Enterprise Plus licensing would add an extra $1250/socket plus more in support fees. VMware support costs not included in above pricing.]

END UPDATE

4Feb/100

Is Virtualisation ready for prime time?

TechOps Guy: Nate

The Register asked that question and some people responded, anyone familiar?

When was your first production virtualisation deployment and what did it entail? My brief story is below (copied from the comments of the first article, easier than re-writing it).

My first real production virtualization deployment was back in mid 2004, I believe, using VMware GSX, I think v3.0 at the time (now called VMware Server).

The deployment was an emergency decision that followed a failed software upgrade to a cluster of real production servers that was shared by many customers. The upgrade was supposed to add support for a new customer that was launching within the week (they had already started a TV advertising campaign). Every attempt was made to make the real deployment work but there were critical bugs and it had to get rolled back. After staying up all night working on it, people started asking what we were going to do next.

One idea (I forget whose, maybe it was mine) was to build a new server with VMware and transfer the QA VM images to it (1 Tomcat web server, 1 BEA WebLogic app server, 1 win2k SQL/IIS server; the main DB was on Oracle and we used another schema for that cluster on our existing DB) and use it for production, since that would be the fastest turnaround to get something working. The expected load was supposed to be really low so we went forward. I spent what felt like 60 of the next 72 hours getting the systems ready and tested over the weekend with some QA help, and we launched on schedule the following Monday.

Why VMs and not real servers? Well we already had the VM images, and we were really short on physical servers, at least good ones anyways. Back then building a new server from scratch was a fairly painful process, though not as painful as integrating a brand new environment. What would usually take weeks of testing we pulled off in a couple of days. I remember one of the tough/last issues to track down was a portion of the application failing due to a missing entry in /etc/hosts (a new portion of functionality that not many were aware of).

The second time I've managed to make The Register (yay!); the first would be a response to my Xiotech speculations a few months back.
