TechOpsGuys.com Diggin' technology every day

7 Oct 2010

Testing the limits of virtualization

TechOps Guy: Nate

You know I'm a big fan of the AMD Opteron 6100 series processor, also a fan of the HP c class blade system, specifically the BL685c G7 which was released on June 21st. I was and am very excited about it.

It is interesting to think, it really wasn't that long ago that blade systems still weren't all that viable for virtualization primarily because they lacked the memory density, I mean so many of them offered a paltry 2 or maybe 4 DIMM sockets. That was my biggest complaint with them for the longest time. About a year or year and a half ago that really started shifting. We all know that Cisco bought some small startup a few years ago that had their memory extender ASIC but well you know I'm not a Cisco fan so won't give them any more real estate in this blog entry, I have better places to spend my mad typing skills.

A little over a year ago HP released their Opteron G6 blades, at the time I was looking at the half height BL485c G6 (guessing here, too lazy to check). It had 16 DIMM sockets, that was just outstanding. The company I was with at the time really liked Dell (you know I hate Dell by now I'm sure). I was poking around their site at the time and they had no answer to that (they have since introduced answers); the highest capacity half height blade they had at the time had 8 DIMM sockets.

I had always assumed that due to the more advanced design in the HP blades that you ended up paying a huge premium, but wow I was surprised at the real world pricing, more so at the time because you needed of course significantly higher density memory modules in the Dell model to compete with the HP model.

Anyways fast forward to the BL685c G7 powered by the Opteron 6174 processor, a 12-core 2.2GHz 80W processor.

Load a chassis up with eight of those:

  • 384 CPU cores (roughly 845GHz of compute)
  • 4 TB of memory (512GB/server w/32x16GB each)
  • 6,750 Watts @ 100% load (feel free to use HP dynamic power capping if you need it)
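
The chassis totals are simple multiplication. Here is a quick sketch in Python using only the figures from this post (eight full height blades per chassis, four 12-core 2.2GHz sockets and 32 x 16GB DIMMs per blade). The constant names are my own, and "aggregate GHz" is just cores times clock, not a real performance measure.

```python
# Per-blade figures for the BL685c G7 as described in this post.
BLADES_PER_CHASSIS = 8      # full height blades in one chassis
SOCKETS_PER_BLADE = 4
CORES_PER_SOCKET = 12       # Opteron 6174
CLOCK_GHZ = 2.2
DIMMS_PER_BLADE = 32
DIMM_GB = 16

cores = BLADES_PER_CHASSIS * SOCKETS_PER_BLADE * CORES_PER_SOCKET
aggregate_ghz = cores * CLOCK_GHZ          # cores x clock, a rough yardstick only
memory_gb = BLADES_PER_CHASSIS * DIMMS_PER_BLADE * DIMM_GB

print(f"{cores} cores, about {aggregate_ghz:.0f}GHz aggregate, "
      f"{memory_gb // 1024}TB of memory")
```

That gives 384 cores, a little under 845GHz of aggregate clock, and 4TB of memory per chassis.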

I've thought long and hard over the past 6 months on whether to go 8GB or 16GB, and all of my virtualization experience has taught me that in every case I'm memory (capacity) bound, not CPU bound. I mean it wasn't long ago we were building servers with only 32GB of memory on them!!!

There is indeed a massive premium associated with going with 16GB DIMMs, but if your capacity utilization is anywhere near the industry average then it is well worth investing in those DIMMs for this system. Going from 2TB to 4TB of memory using 8GB DIMMs in this configuration means buying a 2nd chassis and the associated rack/power/cooling plus hypervisor licensing. You can easily halve your costs by just taking the jump to 16GB DIMMs and keeping it in one chassis (or at least 8 blades - maybe you want to split them between two chassis, I'm not going to get into that level of detail here).

Low power versions of the 16GB DIMMs aren't available, so the power usage jumps by 1.2kW/enclosure for 512GB/server vs 256GB/server. A small price to pay, really.

So onto the point of my post - testing the limits of virtualization. When you're running 32, 64, 128 or even 256GB of memory on a VM server that's great, you really don't have much to worry about. But step it up to 512GB of memory and you might just find yourself maxing out the capabilities of the hypervisor. In vSphere 4.1, for example, you are limited to 512 vCPUs per server and 320 powered on virtual machines. So it really depends on your memory requirements. If you're able to achieve massive amounts of memory deduplication (myself I have not had much luck here, Linux doesn't dedupe well, Windows seems to dedupe a lot though), you may find yourself unable to fully use the memory on the system, because you run out of the ability to fire up more VMs! I'm not going to cover other hypervisor technologies, they aren't worth my time at this point, but like I mentioned I do have my eye on KVM for future use.

Keep in mind 320 VMs is only about 6.7 VMs per CPU core on a 48-core server. That to me is not a whole lot for workloads I have personally deployed in the past. Now of course everybody is different.
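
To see which limit you hit first, here is a rough sketch. It uses the two vSphere 4.1 per-host limits quoted above and ignores hypervisor overhead and page sharing entirely; the function name and the VM sizes are hypothetical, picked only for illustration.

```python
# vSphere 4.1 per-host maximums as quoted in this post.
MAX_VMS_PER_HOST = 320
MAX_VCPUS_PER_HOST = 512

def max_vms(host_memory_gb, vm_memory_gb, vcpus_per_vm):
    """Return (vm_count, limiting_factor) for one host.

    Simplification: no memory overhead, no overcommit, no page sharing.
    """
    limits = {
        "memory": host_memory_gb // vm_memory_gb,
        "vCPU limit": MAX_VCPUS_PER_HOST // vcpus_per_vm,
        "VM limit": MAX_VMS_PER_HOST,
    }
    factor = min(limits, key=limits.get)   # first key wins on a tie
    return limits[factor], factor

# Hypothetical guest sizes on a 512GB host.
for vm_gb, vcpus in [(1, 1), (2, 2), (4, 1)]:
    count, factor = max_vms(512, vm_gb, vcpus)
    print(f"{vm_gb}GB / {vcpus} vCPU guests: {count} per host, bound by {factor}")
```

With 1GB single-vCPU guests the host stops at 320 VMs with roughly 192GB of memory still unused, which is exactly the wall described above.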

But it got me thinking, I mean The Register has been touting off and on for the past several months every time a new Xeon 7500-based system launches ooh they can get 1TB of ram in the box. Or in the case of the big new bad ass HP 8-way system you can get 2TB of ram. Setting aside the fact that vSphere doesn't go above 1TB, even if you go to 1TB I bet in most cases you will run out of virtual CPUs before you run out of memory.

It was interesting to see, in the "early" years, the hypervisor technology really exploiting hardware very well, and now we see the real possibility of hitting a scalability wall, at least as far as a single system is concerned. I have no doubt that VMware will address these scalability issues, it's only a matter of time.

Are you concerned about running your servers with 512GB of ram? After all that is a lot of "eggs" in one basket (as one expert VMware consultant I know & respect put it). For me at smaller scales I am really not too concerned. I have been using HP hardware for a long time and on the enterprise end it really is pretty robust. I have the most concerns about memory failure, or memory errors. Fortunately HP has had Advanced ECC for a long time now (I think I remember even seeing it in the DL360 G2 back in '03).

HP's Advanced ECC spreads the error correcting over four different ECC chips, and it really does provide quite robust memory protection. When I was dealing with cheap crap white box servers the #1 problem BY FAR was memory, I can't tell you how many memory sticks I had to replace it was sick. The systems just couldn't handle errors (yes all the memory was ECC!).

By contrast, honestly I can't even think of a time an enterprise HP server failed (e.g. crashed) due to a memory problem. I recall many times the little amber status light coming on, and I log into the iLO and say, oh, memory errors on stick #2, so I go replace it. But no crash! There was a firmware bug in the HP DL585G1s I used to use that would cause them to crash if too many errors were encountered, but that was a bug that was fixed years ago, not a fault with the system design. I'm sure there have been other such bugs here and there, nothing is perfect.

Dell introduced their version of Advanced ECC about a year ago, but it doesn't (or at least didn't, maybe it does now) hold a candle to the HP stuff. The biggest issue with the Dell version of Advanced ECC was if you enabled it, it disabled a bunch of your memory sockets! I could not get an answer out of Dell support, at the time at least, as to why it did that. So I left it disabled because I needed the memory capacity.

So combine Advanced ECC with ultra dense blades with 48 cores and 512GB of memory apiece and you got yourself a serious compute resource pool.

Power/cooling issues aside (maybe if you're lucky you can get in to SuperNap down in Vegas) you can get up to 1,500 CPU cores and 16TB of memory in a single cabinet. That's just nuts! WAY beyond what you expect to be able to support in a single VMware cluster (being that you're limited to 3,000 powered on VMs per cluster - the density would be only 2 VMs/core and 5GB/VM!)
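
A back-of-the-envelope check of that density, assuming four chassis per rack at 384 cores and 4TB each and the 3,000 VM cluster limit quoted above (the variable names are mine):

```python
# Figures from this post: four chassis per rack, eight 48-core blades each.
CHASSIS_PER_RACK = 4
CORES_PER_CHASSIS = 384
MEMORY_GB_PER_CHASSIS = 4096
MAX_VMS_PER_CLUSTER = 3000    # cluster limit quoted in this post

rack_cores = CHASSIS_PER_RACK * CORES_PER_CHASSIS
rack_memory_gb = CHASSIS_PER_RACK * MEMORY_GB_PER_CHASSIS

# What one rack looks like if it forms a single cluster at the VM limit.
vms_per_core = MAX_VMS_PER_CLUSTER / rack_cores
gb_per_vm = rack_memory_gb / MAX_VMS_PER_CLUSTER

print(f"{rack_cores} cores and {rack_memory_gb // 1024}TB per rack")
print(f"{vms_per_core:.2f} VMs/core and {gb_per_vm:.1f}GB/VM at the cluster limit")
```

The unrounded numbers come out to 1,536 cores, just under 2 VMs per core and about 5.5GB per VM.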

And if you manage to get a 47U rack, well you can get one of those c3000 chassis in the rack on top of the four c7000 and get another 2TB of memory and 192 cores. We're talking power kicking up into the 27kW range in a single rack! Like I said you need SuperNap or the like!

Think about that for a minute, 1,500 CPU cores and 16TB of memory in a single rack. Multiply that by say 10 racks. 15,000 CPU cores and 160TB of memory. How many tens of thousands of physical servers could be consolidated into that? A conservative number may be 7 VMs/core, you're talking 105,000 physical servers consolidated into ten racks. Well excluding storage of course. Think about that! Insane! I mean that's consolidating multiple data centers into a high density closet! That's taking tens to hundreds of megawatts of power off the grid and consolidating it into a measly 270 kW or so (ten racks at about 27kW each).
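
Here is the same ten-rack math without the rounding, as a small Python sketch. It assumes four chassis per rack at the 6,750W full-load figure from earlier and the 7 VMs/core ratio; all constant names are my own.

```python
# Per-chassis figures from earlier in this post.
CORES_PER_CHASSIS = 384
MEMORY_TB_PER_CHASSIS = 4
WATTS_PER_CHASSIS = 6750      # at 100% load
CHASSIS_PER_RACK = 4
RACKS = 10
VMS_PER_CORE = 7              # the "conservative" ratio from the text

chassis = CHASSIS_PER_RACK * RACKS
cores = CORES_PER_CHASSIS * chassis
memory_tb = MEMORY_TB_PER_CHASSIS * chassis
power_kw = WATTS_PER_CHASSIS * chassis / 1000
vms = cores * VMS_PER_CORE

print(f"{cores:,} cores, {memory_tb}TB of memory, {power_kw:.0f}kW")
print(f"{vms:,} VMs at {VMS_PER_CORE} VMs/core")
```

Unrounded it is 15,360 cores and a bit over 107,000 VMs, so the 105,000 figure is if anything slightly low.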

I built out, what was to me some pretty beefy server infrastructure back in 2005, around a $7 million project. Part of it included roughly 300 servers in roughly 28 racks. There was 336kW of power provisioned for those servers.

Think about that for a minute. And re-read the previous paragraph.

I have thought for quite a while that because of this trend there won't be as many traditional network guys or server guys around going forward. When you can consolidate that much crap in that small of a space, it's just astonishing.

One reason I really do like the Opteron 6100 is the CPU cores, just raw cores. And they are pretty fast cores too. The more cores you have the more things the hypervisor can do at the same time, and there is no possibility of contention like there is with hyperthreading. CPU processing capacity has gotten to a point, I believe, where raw CPU performance matters much less than getting more cores on the boxes. More cores means more consolidation. After all, industry utilization rates for CPUs are typically sub 30%. Though in my experience it's typically sub 10%, and a lot of times sub 5%. My own server sits at less than 1% CPU usage.

Now raw speed is still important in some applications of course. I'm not one to promote the usage of a 100 core CPU with each core running at 100MHz (10GHz total); there is a balance that has to be achieved, and I really do believe the Opteron 6100 has achieved that balance. I look forward to the 6200 (socket compatible, 16 cores). Ask anyone that has known me this decade: I have not been AMD's strongest supporter for a very long period of time. But I see the light now.

16 Sep 2010

Fusion IO now with VMware support

TechOps Guy: Nate

About damn time! I read earlier in the year on their forums that they were planning on ESX support for their next release of code, originally expected sometime in March/April or something. But that time came and went and I saw no new updates.

I saw that Fusion IO put on a pretty impressive VDI demonstration at VMworld, so I figured they must have VMware support now, and of course they do.

I would be very interested to see how performance could be boosted and VM density increased by leveraging local Fusion IO storage for swap in ESX. I know of a few 3PAR customers that say they get double the VM density per host vs other storage because of the better I/O they get from 3PAR, though of course Fusion IO is quite a bit snappier.

With VMware's ability to set swap file locations on a per-host basis, it's pretty easy to configure. To take advantage of it, though, I think you'd have to disable memory ballooning in the guests in order to force the host to swap. I don't think I would go so far as to try to put individual swap partitions on the local Fusion IO for the guests to swap to directly, at least not when I'm using a shared storage system.

I just checked again, and as far as I can tell, from a blade perspective at least, the only player offering Fusion IO modules for their blades is still the HP c Class in the form of their IO Accelerator. With up to two expansion slots on the half height, and three on the full height blades, there's plenty of room for the 80 or 160GB SLC models or the 320GB MLC model. And if you were really crazy I guess you could use the "standard" Fusion IO cards with the blades by using the PCI Express expansion module, though that seems more geared towards video cards as upcoming VDI technologies leverage hardware GPU acceleration.

HP's Fusion IO-based I/O Accelerator

Fusion IO claims to be able to write 5TB per day for 24 years. Even if you cut that to 2TB per day for 5 years, it's quite an amazing claim.
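
To put those endurance numbers in perspective, here is a tiny sketch that turns "TB per day for N years" into total data written. It assumes 365-day years and decimal units (1,000TB = 1PB); the function name is my own.

```python
DAYS_PER_YEAR = 365

def lifetime_writes_tb(tb_per_day, years):
    """Total data written, in TB, at a steady daily write rate."""
    return tb_per_day * DAYS_PER_YEAR * years

claimed = lifetime_writes_tb(5, 24)     # the vendor claim
discounted = lifetime_writes_tb(2, 5)   # the heavily discounted version

print(f"claimed:    {claimed:,}TB ({claimed / 1000:.1f}PB)")
print(f"discounted: {discounted:,}TB ({discounted / 1000:.2f}PB)")
```

Even the discounted figure is more than 3.6PB written to a card that holds a few hundred GB.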

From what I have seen (can't speak with personal experience just yet), the biggest advantage Fusion IO has over more traditional SSDs is write performance, of course to get optimal write performance on the system you do need to sacrifice space.

Unlike drive form factor devices, the ioDrive can be tuned to achieve a higher steady-state write performance than what it is shipped with from the factory.

23 Aug 2010

HP FlexFabric module launched

TechOps Guy: Nate

While they announced it a while back, it seems the HP VirtualConnect FlexFabric Module is now available for purchase for $18,500 (web price). Pretty impressive technology, sort of a mix between FCoE and combining a Fibre Channel switch and a 10Gbps Flex10 switch into one. The switch has two ports on it that can (apparently) uplink directly to 2/4/8Gbps Fibre Channel. I haven't read too much into it yet but I assume it can uplink directly to a storage array, unlike the previous Fibre Channel Virtual Connect module which had to be connected to a switch first (due to NPIV).

HP Virtual Connect FlexFabric 10Gb/24-port Modules are the simplest, most flexible way to connect virtualized server blades to data or storage networks. VC FlexFabric modules eliminate up to 95% of network sprawl at the server edge with one device that converges traffic inside enclosures and directly connects to external LANs and SANs. Using Flex-10 technology with Fibre Channel over Ethernet and accelerated iSCSI, these modules converge traffic over high speed 10Gb connections to servers with HP FlexFabric Adapters (HP NC551i or HP NC551m Dual Port FlexFabric 10Gb Converged Network Adapters or HP NC553i 10Gb 2-port FlexFabric Converged Network Adapter). Each redundant pair of Virtual Connect FlexFabric modules provide 8 adjustable connections (six Ethernet and two Fibre Channel, or six Ethernet and 2 iSCSI or eight Ethernet) to dual port 10Gb FlexFabric Adapters. VC FlexFabric modules avoid the confusion of traditional and other converged network solutions by eliminating the need for multiple Ethernet and Fibre Channel switches, extension modules, cables and software licenses. Also, Virtual Connect wire-once connection management is built-in enabling server adds, moves and replacement in minutes instead of days or weeks.

[..]

  • 16 x 10Gb Ethernet downlinks to server blade NICs and FlexFabric Adapters
  • Each 10Gb downlink supports up to 3 FlexNICs and 1 FlexHBA or 4 FlexNICs
  • Each FlexHBA can be configured to transport either Fiber Channel over Ethernet/CEE or Accelerated iSCSI protocol.
  • Each FlexNIC and FlexHBA is recognized by the server as a PCI-e physical function device with adjustable speeds from 100Mb to 10Gb in 100Mb increments when connected to a HP NC553i 10Gb 2-port FlexFabric Converged Network Adapter or any Flex-10 NIC and from 1Gb to 10Gb in 100Mb increments when connected to a NC551i Dual Port FlexFabric 10Gb Converged Network Adapter or NC551m Dual Port FlexFabric 10Gb Converged Network Adapter
  • 4 SFP+ external uplink ports configurable as either 10Gb Ethernet or 2/4/8Gb auto-negotiating Fibre Channel connections to external LAN or SAN switches
  • 4 SFP+ external uplink ports configurable as 1/10Gb auto-negotiating Ethernet connected to external LAN switches
  • 8 x 10Gb SR, LR fiber and copper SFP+ uplink ports (4 ports also support 10Gb LRM fiber SFP+)
  • Extended list of direct attach copper cable connections supported
  • 2 x 10Gb shared internal cross connects for redundancy and stacking
  • HBA aggregation on FC configured uplink ports using ANSI T11 standards-based N_Port ID Virtualization (NPIV) technology
  • Allows up to 255 virtual machines running on the same physical server to access separate storage resources
  • Up to 128 VLANs supported per Shared Uplink Set
  • Low latency (1.2 µs Ethernet ports and 1.7 µs Enet/Fibre Channel ports) throughput provides switch-like performance.
  • Line Rate, full-duplex 240Gbps bridging fabric
  • MTU up to 9216 Bytes - Jumbo Frames
  • Configurable up to 8192 MAC addresses and 1000 IGMP groups
  • VLAN Tagging, Pass-Thru and Link Aggregation supported on all uplinks
  • Stack multiple Virtual Connect FlexFabric modules with other VC FlexFabric, VC Flex-10 or VC Ethernet Modules across up to 4 BladeSystem enclosures allowing any server Ethernet port to connect to any Ethernet uplink

Management

  • Pre-configure server I/O configurations prior to server installation for easy deployment
  • Move, add, or change server network connections on the fly without LAN and SAN administrator involvement
  • Supported by Virtual Connect Enterprise Manager (VCEM) v6.2 and higher for centralized connection and workload management for hundreds of Virtual Connect domains. Learn more at: www.hp.com/go/vcem
  • Integrated Virtual Connect Manager included with every module, providing out-of-the-box, secure HTTP and scriptable CLI interfaces for individual Virtual Connect domain configuration and management.
  • Configuration and setup consistent with VC Flex-10 and VC Fibre Channel Modules
  • Monitoring and management via industry standard SNMP v.1 and v.2
  • Role-based security for network and server administration with LDAP compatibility
  • Port error and Rx/Tx data statistics displayed via CLI
  • Port Mirroring on any uplink provides network troubleshooting support with Network Analyzers
  • IGMP Snooping optimizes network traffic and reduces bandwidth for multicast applications such as streaming applications
  • Recognizes and directs Server-Side VLAN tags
  • Transparent device to the LAN Manager and SAN Manager
  • Provisioned storage resource is associated directly to a specific virtual machine - even if the virtual server is re-allocated within the BladeSystem
  • Server-side NPIV removes storage management constraint of a single physical HBA on a server blade
  • Does not add to SAN switch domains or require traditional SAN management
  • Centralized configuration of boot from iSCSI or Fibre Channel network storage via Virtual Connect Manager GUI and CLI
  • Remotely update Virtual Connect firmware on multiple modules using Virtual Connect Support Utility 1.5.0

Options

  • Virtual Connect Enterprise Manager (VCEM), provides a central console to manage network connections and workload mobility for thousands of servers across the datacenter
  • Optional HP 10Gb SFP+ SR, LR, and LRM modules and 10Gb SFP+ Copper cables in 0.5m, 1m, 3m, 5m, and 7m lengths
  • Optional HP 8 Gb SFP+ and 4 Gb SFP optical transceivers
  • Supports all Ethernet NICs and Converged Network adapters for BladeSystem c-Class server blades: HP NC551i 10Gb FlexFabric Converged Network Adapters, HP NC551m 10Gb FlexFabric Converged Network Adapters, 1/10Gb Server NICs including LOM and Mezzanine card options and the latest 10Gb KR NICs
  • Supports use with other VC modules within the same enclosure (VC Flex-10 Ethernet Module, VC 1/10Gb Ethernet Module, VC 4 and 8 Gb Fibre Channel Modules).

So in effect this allows you to cut down on the number of switches per chassis from four to two, which can save quite a bit. HP had a cool graphic showing the amount of cables that are saved even against Cisco UCS but I can't seem to find it at the moment.

The most recently announced G7 blade servers have the new FlexFabric technology built in (which is also backwards compatible with Flex10).

VCEM seems pretty scalable

Built on the Virtual Connect architecture integrated into every BladeSystem c-Class enclosure, VCEM provides a central console to administer network address assignments, perform group-based configuration management and to rapidly deployment, movement and failover of server connections for 250 Virtual Connect domains (up to 1,000 BladeSystem enclosures and 16,000 blade servers).

With each enclosure consuming roughly 5kW with low voltage memory and power capping, 1,000 enclosures should consume roughly 5 megawatts. From what I see "experts" say it costs roughly $18 million per megawatt to build a data center, so one VCEM system can manage a $90 million data center, that's pretty bad ass. I can't think of who would need so many blades..
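
The estimate works out like this, as a quick sketch. The 5kW per enclosure and $18 million per megawatt are the rough numbers from this post, not quotes from anyone's price list, and the constant names are mine.

```python
# Rough inputs from this post.
ENCLOSURES = 1000               # VCEM's stated maximum
BLADES_PER_ENCLOSURE = 16       # half height blades, matches the 16,000 blade limit
KW_PER_ENCLOSURE = 5
COST_PER_MW_USD = 18_000_000    # rough data center build cost per megawatt

blades = ENCLOSURES * BLADES_PER_ENCLOSURE
total_mw = ENCLOSURES * KW_PER_ENCLOSURE / 1000
build_cost_usd = total_mw * COST_PER_MW_USD

print(f"{blades:,} blades drawing about {total_mw:.0f}MW")
print(f"data center build cost: about ${build_cost_usd / 1e6:.0f} million")
```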

If I were building a new system today I would probably get this new module, but I'd have to think hard about sticking with the regular Fibre Channel module to allow the technology to bake a bit more for storage.

The module is based on QLogic technology.

23 Aug 2010

HP to the rescue

TechOps Guy: Nate

Knock knock.. HP is kicking down your back door 3PAR..

Well that's more like it, HP offered $1.6 Billion to acquire 3PAR this morning topping Dell's offer by 33%. Perhaps the 3cV solution can finally be fully backed by HP. More info from The Register here. And more info on what this could mean to HP and 3PAR products from the same source here.

3PAR's website is having serious issues. This has obviously spawned a ton of interest in the company; I get intermittent blank pages and connection refused messages.

I didn't wake my rep up for this one.

The 3cV solution was announced about three years ago -

Elements of the 3cV solution include:

  • 3PAR InServ® Storage Servers—highly virtualized, tiered-storage arrays built for utility computing. Organizations creating virtualized IT infrastructures for workload consolidation use InServ arrays to reduce the cost of allocated storage capacity, storage administration, and SAN infrastructure.
  • HP BladeSystem c-Class Server Blades—the leading blade server infrastructure on the market for datacenters of all sizes. HP BladeSystem c-Class server blades minimize energy and space requirements and increase administrative productivity through advantages in I/O virtualization, powering and cooling, and manageability.
  • VMware vSphere—the leading virtualization platform for industry-standard servers. VMware vSphere helps customers reduce capital and operating expenses, improve agility, ensure business continuity, strengthen security, and go green.

While I could not find the image that depicts the 3cV solution (not sure how long it's been gone for), here is more info on it for posterity.

The Advantages of 3cV
3cV offers combined benefits that enable customers to manage and scale their server and storage environments simply, allowing them to halve server, storage and operational costs while lowering the environmental impact of the datacenter.

  • Reduces storage and server costs by 50%—The inherently modular architectures of the HP BladeSystem c-Class and the 3PAR InServ Storage Server—coupled with the increased utilization provided by VMware Infrastructure and 3PAR Thin Provisioning—allow 3cV customers to do more with less capital expenditure. As a result, customers are able to reduce overall storage and server costs by 50% or more. High levels of availability and disaster recovery can also be affordably extended to more applications through VMware Infrastructure and 3PAR thin copy technologies.
  • Cuts operational costs by 50% and increases business agility—With 3cV, customers are able to provision and change server and storage resources on demand. By using VMware Infrastructure's capabilities for rapid server provisioning and the dynamic optimization provided by VMware VMotion and Distributed Resource Scheduler (DRS), HP Virtual Connect and Insight Control management software, and 3PAR Rapid Provisioning and Dynamic Optimization, customers are able to provision and re-provision physical servers, virtual hosts, and virtual arrays with tailored storage services in a matter of minutes, not days. These same technologies also improve operational simplicity, allowing overall server and storage administrative efficiency to increase by 3x or more.
  • Lowers environmental impact—With 3cV, customers are able to cut floor space and power requirements dramatically. Server floor space is minimized through server consolidation enabled by VMware Infrastructure (up to 70% savings) and HP BladeSystem density (up to 50% savings). Additional server power requirements are cut by 30% or more through the unique virtual power management capabilities of HP Thermal Logic technology. Storage floor space is reduced by the 3PAR InServ Storage Server, which delivers twice the capacity per floor tile as compared to alternatives. In addition, 3PAR thin technologies, Fast RAID 5, and wide striping allow customers to power and cool as much as 75% less disk capacity for a given project without sacrificing performance.
  • Delivers security through virtualization, not dedicated hardware silos—Whereas traditional datacenter architectures force tradeoffs between high resource utilization and the need for secure segregation of application resources for disparate user groups, 3cV resolves these competing needs through advanced virtualization. For instance, just as VMware Infrastructure securely isolates virtual machines on shared severs, 3PAR Virtual Domains provides secure "virtual arrays" for private, autonomous storage provisioning from a single, massively-parallel InServ Storage Server.

Though due to the recent stack wars it's been hard for 3PAR to partner with HP to promote this solution since I'm sure HP would rather push their own full stack. Well hopefully now they can. The best of both worlds technology wise can come together.

More details from 3PAR's VMware products site.

From HP's offer letter -

We propose to increase our offer to acquire all of 3PAR outstanding common stock to $24.00 per share in cash. This offer represents a 33.3% premium to Dell’s offer price and is a “Superior Proposal” as defined in your merger agreement with Dell. HP’s proposal is not subject to any financing contingency. HP’s Board of Directors has approved this proposal, which is not subject to any additional internal approvals. If approved by your Board of Directors, we expect the transaction would close by the end of the calendar year.

In addition to the compelling value offered by our proposal, there are unparalleled strategic benefits to be gained by combining these two organizations. HP is uniquely positioned to capitalize on 3PAR’s next-generation storage technology by utilizing our global reach and superior routes to market to deliver 3PAR’s products to customers around the world. Together, we will accelerate our ability to offer unmatched levels of performance, efficiency and scalability to customers deploying cloud or scale-out environments, helping drive new growth for both companies.
As a Silicon Valley-based company, we share 3PAR’s passion for innovation.
[..]

We understand that you will first need to communicate this proposal and your Board’s determinations to Dell, but we are prepared to execute the merger agreement immediately following your termination of the Dell merger agreement.

Music to my ears.

[tangent -- begin]

My father worked for HP in the early days back when they were even more innovative than they are today, he recalled their first $50M revenue year. He retired from HP in the early 90s after something like 25-30 years.

I attended my freshman year at Palo Alto Senior High school, and one of my classmates/friends (actually I don't think I shared any classes with him now that I think about it) was Ben Hewlett, grandson of one of the founders of HP. Along with a couple of other friends, Ryan and Jon, we played a bunch of RPGs (I think the main one was Twilight 2000, something one of my other friends Brian introduced me to in 8th grade).

I remember asking Ben one day why he took Japanese as his second language course when it was significantly more difficult than Spanish (which was the easy route, probably still is?). I don't think I'll ever forget his answer. He said "because my father says it's the business language of the future.."

How times have changed.. Now it seems everyone is busy teaching their children Chinese. I'm happy knowing English, and a touch of bash and perl.

I never managed to keep in touch with my friends from Palo Alto, after one short year there I moved back to Thailand for two more years of high school there.

[tangent -- end]

HP could do some cool stuff with 3PAR, which has much better technology overall. I have no doubt HP has their eyes on their HDS partnership, and the possibility of replacing their XP line with 3PAR technology in the future has got to be pretty enticing. HDS hasn't done a whole lot recently, and I read not long ago that regardless of what HP says, they don't have much (if any) input into the HDS product line.

The HP USP-V OEM relationship is with Hitachi SSG. The Sun USP-V reseller deal was struck with HDS. Mikkelsen said: "HP became a USP-V OEM in 2004 when the USP-V was already done. HP had no input to the design and, despite what they say, very little input since." HP has been a Hitachi OEM since 1999.

Another interesting tidbit of information from the same article:

It [HDS] cannot explain why it created the USP-V - because it didn't, Hitachi SSG did, in Japan, and its deepest thinking and reasons for doing so are literally lost in translation.

The loss of HP as an OEM customer of HDS, so soon after losing Sun as an OEM customer, would be a really serious blow to HDS (one person I know claimed it accounts for ~50% of their business), which seems to have a difficult time selling stuff in western countries. I've read it's mostly because of their culture. Similarly it seems Fujitsu has issues selling stuff in the U.S. at least; they seem to have some good storage products but not much attention is paid to them outside of Asia (and maybe Europe). Will HDS end up like Fujitsu as a result of HP buying 3PAR? Not right away for sure, but longer term they stand to lose a ton of market share in my opinion.

And with the USP getting a little stale (rumor has it they are near to announcing a technology refresh for it), it would be good timing for HP to get 3PAR, to cash in on the upgrade cycle by getting customers to go with the T class arrays instead of the updated USP whenever possible.

I read on an HP blog earlier in the year an interesting comment -

The 3PAR is drastically less expensive than an XP, but is an active/active concurrent design, can scale up to 8 clustered controllers, highly virtualized, customers can self-install, self-maintain, and requires no professional services. Its on par with the XP in terms of raw performance, but has the ease of use of the EVA. Like the XP, the 3PAR can be carved up into virtual domains so that service providers or multi-tenant arrays can have delegated administration.

I still think 3PAR is worth more, and should stay independent, but given the current situation would much rather have them in the arms of HP than Dell.

Obviously those analysts that said Dell paid too much for 3PAR were wrong, and didn't understand the value of the 3PAR technology. HP does, otherwise they wouldn't be offering 33% more cash.

After the collapse of so many of 3PAR's NAS partners over the past couple of years, the possibility of having Ibrix available again for a longer term solution is pretty good. Dell bought Exanet's IP earlier in the year. LSI owns Onstor, HP bought Polyserve and Ibrix. Really just about no "open" NAS players left. Isilon seems to be among the biggest NAS players left but of course their technology is tightly integrated into their disk drive systems, same with Panasas.

Maybe that recent legal investigation into the board at 3PAR had some merit after all.

Dell should take their $billion and shove it in Pillar's (or was it Compellent? I forgot) face, so the CEO there can make his dream of being a billion dollar storage company come true, if only for a short time.

I'm not a stock holder or anything, I don't buy stocks (or bonds).

21 Jun 2010

HP BL685c G7 Launched – Opteron 6100

TechOps Guy: Nate

I guess my VMware dream machine will remain a dream for now, HP launched their next generation G7 Opteron 6100 blades today, and while still very compelling systems, after the 6100 launched I saw the die size had increased somewhat (not surprising), it was enough to remove the ability to have 4 CPU sockets AND 48 memory slots on one full height blade.

Still a very good comparison illustrating the elimination of the 4P tax, that is, eliminating the premium associated with quad socket servers. If you configure a BL485c G7 with 2x12-core CPUs and 128GB of memory (about $16,000), vs a BL685c G7 with 4x12-core CPUs and 256GB of memory (about $32,000), the cost per core and per GB is about the same, no premium.

By contrast, a BL685c G6 configured with six core CPUs (i.e. half the number of cores of the G7), same memory, same networking, same fiber channel, costs roughly $52,000.
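To put numbers on the "no 4P tax" claim, here is a quick back-of-the-envelope sketch in Python using the approximate prices quoted above. The dictionary layout and the per_unit_cost helper are made up for illustration, and the G6 memory size is assumed to be 256GB based on the "same memory" comment.

```python
# Approximate configured prices from the post. Core and memory counts follow
# from the configurations described (G6 memory is assumed to be 256GB).
CONFIGS = {
    "BL485c G7 (2x12-core)": {"price": 16_000, "cores": 24, "memory_gb": 128},
    "BL685c G7 (4x12-core)": {"price": 32_000, "cores": 48, "memory_gb": 256},
    "BL685c G6 (4x6-core)": {"price": 52_000, "cores": 24, "memory_gb": 256},
}


def per_unit_cost(config):
    """Return (dollars per core, dollars per GB of memory) for one configuration."""
    return config["price"] / config["cores"], config["price"] / config["memory_gb"]


if __name__ == "__main__":
    for name, config in CONFIGS.items():
        per_core, per_gb = per_unit_cost(config)
        print(f"{name}: ${per_core:,.0f}/core, ${per_gb:,.0f}/GB")
```

The two G7 configurations come out identical per core and per GB, while the G6 quad socket works out to more than three times the cost per core.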

These have new Flex Fabric 2 NICs, which from the specs page seem to include iSCSI or FCoE support (I assume some sort of software licensing is needed to unlock the added functionality? though I can't find evidence of it). Here is a white paper on the Flex Fabric stuff, from what I gather it's just an evolutionary step of Virtual Connect. I of course have never had any real interest in FCoE (search the archives for details), but it's nice I suppose that HP is giving the option to those that do want to jump on that wagon.

28Feb/10Off

VMware dream machine

TechOps Guy: Nate

(Originally titled fourty eight all round, I like VMware dream machine more)

UPDATED I was thinking more about the upcoming 12-core Opterons and the next generation of HP c Class blades, and thought of a pretty cool configuration to have, hopefully it becomes available.

Imagine a full height blade that is quad socket, 48 cores (91-115Ghz), 48 DIMMs (192GB with 4GB sticks), 4x10Gbps Ethernet links and 2x4Gbps fiber channel links (total of 48Gbps of full duplex bandwidth). The new Opterons support 12 DIMMs per socket, allowing the 48 DIMM slots.

Why 4x10Gbps links? Well I was thinking why not.. with full height blades you can only fit 8 blades in a c7000 chassis. If you put a pair of 2x10Gbps switches in that gives you 16 ports. It's not much more $$ to double up on 10Gbps ports. Especially if you're talking about spending upwards of say $20k on the blade (guesstimate) and another $9-15k on vSphere software per blade. And 4x10Gbps links gives you up to 16 virtual NICs using VirtualConnect per blade, each of them adjustable in 100Mbps increments.

Also given the fact that it is a full height blade, you have access to two slots worth of I/O, which translates into 320Gbps of full duplex fabric available to a single blade.

That kind of blade ought to handle just about anything you can throw at it. It's practically a super computer in and of itself. Right now HP holds the top spot for VMmark scores, with an 8 socket 6 core system (48 total cores) outpacing even a 16 socket 4 core system (64 total cores).

The 48 CPU cores will give the hypervisor an amazing number of combinations for scheduling vCPUs. Here's a slide from a presentation I was at last year which illustrates the concept behind the hypervisor scheduling single and multi vCPU VMs:

There is a PDF out there from VMware that talks about the math formulas behind it all, it has some interesting commentary on CPU scheduling with hypervisors:

[..]Extending this principle, ESX Server installations with a greater number of physical CPUs offer a greater chance of servicing competing workloads optimally. The chance that the scheduler can find room for a particular workload without much reshuffling of virtual machines will always be better when the scheduler has more CPUs across which it can search for idle time.

This is even cooler though, honestly I can't pretend to understand the math myself! -

Scheduling a two-VCPU machine on a two-way physical ESX Server hosts provides only one possible allocation for scheduling the virtual machine. The number of possible scheduling opportunities for a two-VCPU machine on a four-way or eight-way physical ESX Server host is described by combinatorial mathematics using the formula N! / (R!(N-R)!) where N=the number of physical CPUs on the ESX Server host and R=the number of VCPUs on the machine being scheduled.1 A two-VCPU virtual machine running on a four-way ESX Server host provides (4! / (2! (4-2)!) which is (4*3*2 / (2*2)) or 6 scheduling possibilities. For those unfamiliar with combinatory mathematics, X! is calculated as X(X-1)(X-2)(X-3)…. (X- (X-1)). For example 5! = 5*4*3*2*1.

Using these calculations, a two-VCPU virtual machine on an eight-way ESX Server host has (8! / (2! (8-2)!) which is (40320 / (2*720)) or 28 scheduling possibilities. This is more than four times the possibilities a four-way ESX Server host can provide. Four-vCPU machines demonstrate this principle even more forcefully. A four-vCPU machine scheduled on a four-way physical ESX Server host provides only one possibility to the scheduler whereas a four-VCPU virtual machine on an eight-CPU ESX Server host will yield (8! / (4!(8-4)!) or 70 scheduling possibilities, but running a four-vCPU machine on a sixteen-way ESX Server host will yield (16! / (4!(16-4)!) which is (20922789888000 / ( 24*479001600) or 1820 scheduling possibilities. That means that the scheduler has 1820 unique ways in which it can place the four-vCPU workload on the ESX Server host. Doubling the physical CPU count from eight to sixteen results in 26 times the scheduling flexibility for the four-way virtual machines. Running a four-way virtual machine on a Host with four times the number of physical processors (16-way ESX Server host) provides over six times more flexibility than we saw with running a two-way VM on a Host with four times the number of physical processors (8-way ESX Server host).

Anyone want to try to extrapolate that and extend it to a 48-core system? :)
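Taking a stab at that extrapolation: a minimal sketch in Python that applies the same N! / (R!(N-R)!) formula quoted above. The function name scheduling_possibilities is just a label picked for this example, and like the VMware paper it treats every physical core as interchangeable, which ignores real-world details such as NUMA and cache topology.

```python
from math import comb


def scheduling_possibilities(physical_cpus: int, vcpus: int) -> int:
    """Ways to choose `vcpus` cores out of `physical_cpus`: N! / (R!(N-R)!)."""
    return comb(physical_cpus, vcpus)


if __name__ == "__main__":
    for n in (4, 8, 16, 48):
        for r in (2, 4):
            count = scheduling_possibilities(n, r)
            print(f"{r}-vCPU VM on {n} cores: {count:,} possible placements")
```

On 48 cores a two-vCPU VM gets 1,128 possible placements and a four-vCPU VM gets 194,580, roughly 107 times what the 16-way host in the quote can offer.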

It seems like only yesterday that I was building DL380G5 ESX 3.5 systems with 8 CPU cores and 32GB of ram, with 8x1Gbps links thinking of how powerful they were. This would be six of those in a single blade. And only seems like a couple weeks ago I was building VMware GSX systems with dual socket single core systems and 16GB ram..

So, HP do me a favor and make a G7 blade that can do this, that would make my day! I know fitting all of those components on a single full height blade won't be easy. Looking at the existing BL685c blade, it looks like they could do it: remove the internal disks (who needs em, boot from SAN or something), and put in an extra 16 DIMMs for a total of 48.

I thought about using 8Gbps fiber channel but then it wouldn't be 48 all round :)

UPDATE Again I was thinking about this and wanted to compare the costs vs existing technology. I'm estimating roughly a $32,000 price tag for this kind of blade and vSphere Advanced licensing (note you cannot use Enterprise licensing on a 12-core CPU, hardware pricing extrapolated from existing HP BL685G6 quad socket 6 core blade system with 128GB ram). The approximate price of an 8-way 48-core HP DL785 with 192GB, 4x10GbE and 2x4Gb Fiber with vSphere licensing comes to roughly $70,000 (because VMware charges on a per socket basis the licensing costs go up fast). Not only that but you can only fit 6 of these DL785 servers in a 42U rack, and you can fit 32 of these blades in the same rack with room to spare. So less than half the cost, and 5 times the density (for the same configuration). The DL785 has an edge in memory slot capacity, which isn't surprising given its massive size, it can fit 64 DIMMs vs 48 on my VMware dream machine blade.
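For anyone who wants to check the "less than half the cost, 5 times the density" math, here is a small sketch using only the estimates above. The constant names and the compare helper are invented for this example, and the prices are my rough guesses, not quotes.

```python
BLADE_COST = 32_000     # estimated dream machine blade + vSphere Advanced
DL785_COST = 70_000     # approximate DL785 + vSphere licensing
BLADES_PER_RACK = 32    # 8 full height blades per c7000, four chassis per rack
DL785_PER_RACK = 6      # per 42U rack
CORES_PER_SERVER = 48   # same core count in both configurations


def compare():
    """Return cost and density ratios of the blade vs the DL785."""
    return {
        "cost_ratio": BLADE_COST / DL785_COST,
        "density_ratio": BLADES_PER_RACK / DL785_PER_RACK,
        "blade_cores_per_rack": BLADES_PER_RACK * CORES_PER_SERVER,
        "dl785_cores_per_rack": DL785_PER_RACK * CORES_PER_SERVER,
    }


if __name__ == "__main__":
    result = compare()
    print(f"Blade cost is {result['cost_ratio']:.0%} of the DL785")
    print(f"Density advantage: {result['density_ratio']:.1f}x")
    print(f"Cores per rack: {result['blade_cores_per_rack']} vs {result['dl785_cores_per_rack']}")
```

That works out to about 46% of the cost and a bit over 5x the density, or 1,536 cores in a rack versus 288.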

Compared to a trio of HP BL495c blades each with 12 cores, and 64GB of memory, approximate pricing for that plus advanced vSphere is $31,000 for a total of 36 cores and 192GB of memory. So for $1,000 more you can add an extra 12 cores, cut your server count by 66%, probably cut your power usage by some amount and improve consolidation ratios.

So to summarize, two big reasons for this type of solution are:

  • More efficient consolidation on a per-host basis by having fewer "stranded" resources
  • More efficient consolidation on a per-cluster basis because you can get more capacity in the 32-node limit of a VMware cluster (assuming you want to build a cluster that big..) Again addressing the "stranded capacity" issue. Imagine what a resource pool could do with 3.3 THz of compute capacity and 9.2TB of memory? All with line rate 40Gbps networking throughout? All within a single cabinet?

Pretty amazing stuff to me anyways.

[For reference - Enterprise Plus licensing would add an extra $1250/socket plus more in support fees. VMware support costs not included in above pricing.]

END UPDATE

27Feb/10Off

Cisco UCS Networking falls short

TechOps Guy: Nate

UPDATED Yesterday when I woke up I had an email from Tolly in my inbox, describing a new report comparing the networking performance of the Cisco UCS vs the HP c Class blade systems. Both readers of the blog know I haven't been a fan of Cisco for a long time (about 10 years, since I first started learning about the alternatives), and I'm a big fan of HP c Class (again never used it, but planning on it). So as you could imagine I couldn't resist reading what it said, considering the amount of hype that Cisco has managed to generate for their new systems (the sheer number of blog posts about it makes me feel sick at times).

I learned a couple of things from the report that I did not know about UCS before (I often just write their solutions off since they have a track record of underperformance, overpricing and needless complexity).

The first was that the switching fabric is external to the enclosure, so if two blades want to talk to each other that traffic must leave the chassis in order to do so, an interesting concept which can have significant performance and cost implications.

The second is that the current UCS design is 50% oversubscribed, which is what this report targets as a significant weakness of the UCS vs the HP c Class.

The mid plane design of the c7000 chassis is something that HP is pretty proud of (for good reason), capable of 160Gbps full duplex to every slot, totaling more than 5 Terabits of fabric. When I talked to them last year they couldn't help but take shots at IBM's blade system, commenting on how it is oversubscribed and how you have to be careful in how you configure the system because of that oversubscription.

This c7000 fabric is far faster than most high end chassis Ethernet switches, and should allow fairly transparent migration to 40Gbps ethernet when the standard arrives for those that need it. In fact HP already has 40Gbps Infiniband modules available for c Class.

The test involved six blades from each solution. When testing throughput of four blades both solutions performed similarly (UCS was 0.76Gbit faster). Add two more blades and start jacking up the bandwidth requirements: HP c Class scales linearly as the traffic goes up, UCS seems to scale linearly in the opposite direction. End result is with 60Gbit of traffic being requested (6 blades @ 10Gbps), HP c Class managed to choke out 53.65Gbps, and Cisco UCS managed to cough up a mere 27.37Gbps. On UCS, pushing six blades at max performance actually resulted in less performance than four blades at max performance, significantly less, illustrating serious weaknesses in the QoS on the system (again big surprise!).
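Put as a percentage of what was asked for, the gap looks like this. A tiny sketch using only the throughput figures quoted from the report above; the delivered_fraction helper is just a name I made up for the example.

```python
OFFERED_GBPS = 6 * 10  # six blades, each requesting 10Gbps

# Measured aggregate throughput with six blades, as quoted from the report.
MEASURED_GBPS = {"HP c Class": 53.65, "Cisco UCS": 27.37}


def delivered_fraction(measured_gbps, offered_gbps=OFFERED_GBPS):
    """Fraction of the requested bandwidth that actually got through."""
    return measured_gbps / offered_gbps


if __name__ == "__main__":
    for system, measured in MEASURED_GBPS.items():
        print(f"{system}: {delivered_fraction(measured):.1%} of requested bandwidth")
```

That is roughly 89% delivered on the c Class against about 46% on UCS.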

The report mentions putting Cisco UCS in a special QoS mode for the test because without this mode performance was even worse. There is only 80Gbps of fabric available for use on the UCS(4x10Gbps full duplex). You can get a second fabric module for UCS but it cannot be used for active traffic, only as a backup.

UPDATE - A kind fellow over at Cisco took notice of our little blog here (thanks!!) and wanted to correct what they say is a bad test on the part of Tolly. Apparently Tolly didn't realize that the fabrics could be used in active-active (maybe that complexity thing rearing its head, I don't know). But in the end I believe the test results are still valid, just at an incorrect scale. Each blade requires 20Gbps of full duplex fabric in order to be non blocking throughout. The Cisco UCS chassis provides for 80Gbps of full duplex fabric, allowing 4 blades to be non blocking. HP by contrast allows up to three dual port Flex10 adapters per half height server, which requires 120Gbps of full duplex fabric to support at line rate. Given each slot supports 160Gbps of fabric, you could get another adapter in there but I suspect there isn't enough real estate on the blade to connect the adapter! I'm sure 120Gbps of ethernet on a single half height blade is way overkill, but if it doesn't radically increase the cost of the system, as a techie myself I do like the fact that the capacity is there to grow into.

Things get a little more complicated when you start talking about non blocking internal fabric(between blades) and the rest of the network, since HP designs their switches to support 16 blades, and Cisco designs their fabric modules to support 8. You can see by the picture of the Flex10 switch that there are 8 uplink ports on it, not 16, but it's pretty obvious that is due to space constraints because the switch is half width. END UPDATE
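The arithmetic in the update above can be sketched like this. The function names are invented for illustration, and the numbers are the ones from the update (20Gbps per blade, 80Gbps of UCS fabric, 8 blades per UCS chassis, 120Gbps needed out of 160Gbps per c7000 slot).

```python
def non_blocking_blades(chassis_fabric_gbps, per_blade_gbps):
    """How many blades can run at line rate before the fabric is full."""
    return chassis_fabric_gbps // per_blade_gbps


def oversubscription(demand_gbps, fabric_gbps):
    """Ratio of requested bandwidth to available fabric (above 1.0 means blocking)."""
    return demand_gbps / fabric_gbps


if __name__ == "__main__":
    # Cisco UCS: 80Gbps of fabric shared by blades that each need 20Gbps.
    print("UCS non-blocking blades:", non_blocking_blades(80, 20))
    print("UCS with 8 blades:", oversubscription(8 * 20, 80), ": 1")
    # HP c7000: three dual port Flex10 adapters need 120Gbps of a 160Gbps slot.
    print("HP slot utilization:", oversubscription(120, 160))
```

Four blades fit on UCS before blocking starts and a full chassis of eight is 2:1, while a maxed out HP half height blade only uses three quarters of its slot.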

The point I am trying to make here isn't so much the fact that HP's architecture is superior to that of Cisco's. It's not that HP is faster than Cisco. It's the fact that HP is not oversubscribed and Cisco is. In a world where we have had non blocking switch fabrics for nearly 15 years it is disgraceful that a vendor would have a solution where six servers cannot talk to each other without being blocked. I have operated 48-port gigabit switches which have 256 gigabits of switching fabric, that is more than enough for 48 systems to talk to each other in a non blocking way. There are 10Gbps switches that have 500-800 gigabits of switching fabric allowing 32-48 systems to talk to each other in a non blocking way. These aren't exactly expensive solutions either. That's not even considering the higher end backplane and midplane based systems that run into the multiple terabits of switching fabric connecting hundreds of systems at line rates.

I would expect such a poor design to come from a second tier vendor, not a vendor that has a history of making networking gear for blade switches for several manufacturers for several years.

So say you take the worst case, what if you want completely non blocking fabric from each and every system? For me I am looking to HP c Class and 10Gbps Virtual Connect mainly for intra-chassis communication within the vSphere environment. In this situation with a cheap configuration on HP, you are oversubscribed 2:1 when talking outside of the chassis. For most situations this is probably fine, but say that wasn't good enough for you. Well you can fix it by installing two more 10Gbps switches on the chassis (each switch has 8x10GbE uplinks). That will give you 32x10Gbps uplink ports, enough for 16 blades each having 2x10Gbps connections. All line rate, non blocking throughout the system. That is 320 Gigabits vs 80 Gigabits available on Cisco UCS.

HP doesn't stop there. With 4x10Gbps switches you've only used up half of the available I/O slots on the c7000 enclosure, can we say 640 Gigabits of total non-blocking ethernet throughput vs 80 gigabits on UCS (single chassis for both)? I mean for those fans of running vSphere over NFS, you could install vSphere on a USB stick or SD card and dedicate the rest of the I/O slots to networking if you really need that much throughput.
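Here is the uplink math from the last two paragraphs as a quick sketch. It assumes, as described above, 8x10GbE uplinks per switch and 16 blades with 2x10Gbps connections each; the function names are just for this example.

```python
UPLINKS_PER_SWITCH = 8
PORT_GBPS = 10
BLADES = 16
PORTS_PER_BLADE = 2


def uplink_capacity_gbps(switches):
    """Total uplink bandwidth out of the chassis for a given switch count."""
    return switches * UPLINKS_PER_SWITCH * PORT_GBPS


def uplink_oversubscription(switches):
    """Blade-facing bandwidth divided by uplink bandwidth (1.0 is non blocking)."""
    blade_demand_gbps = BLADES * PORTS_PER_BLADE * PORT_GBPS
    return blade_demand_gbps / uplink_capacity_gbps(switches)


if __name__ == "__main__":
    for switches in (2, 4):
        print(f"{switches} switches: {uplink_capacity_gbps(switches)}Gbps of uplinks, "
              f"{uplink_oversubscription(switches):.0f}:1")
    print(f"All 8 I/O slots: {uplink_capacity_gbps(8)}Gbps of uplinks")
```

Two switches give 160Gbps of uplinks (2:1), four give 320Gbps (1:1), and filling all eight slots gets to the 640 Gigabit figure.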

Of course this costs more than being oversubscribed, the point is the customer can make this decision based on their own requirements, rather than having the limitation be designed into the system.

Now think about this limitation in a larger scale environment. Think about the vBlock again from that new EMC/Cisco/VMware alliance. Set aside the fact that it's horribly overpriced (I think mostly due to EMC's side). But this system is designed to be used in large scale service providers. That means unpredictable loads from unrelated customers running on a shared environment. Toss in vMotion and DRS and you could be asking for trouble when it comes to this oversubscription stuff. DRS (as far as I know) bases its placement decisions entirely on CPU and memory usage. At some point I think it will take storage I/O into account as well. I haven't heard of it taking into account network congestion, though in theory it's possible. But it's much better to just have a non blocking fabric to begin with: you will increase your utilization and efficiency, and sleep better at night.

Makes me wonder how Data Center Ethernet (whatever it's called this week?) holds up under these congestion conditions that the UCS suffers from? Lots of "smart" people spent a lot of time making Ethernet lossless only to design the hardware so that it will incur significant loss in transit. In my experience systems don't behave in a predictable manner when storage is highly constrained.

I find it kind of ironic that a blade solution from the world's largest networking company would be so crippled when it came to the network of the system. Again, not a big surprise to me, but there are a lot of Cisco kids out there I see that drink their koolaid without thinking twice, and of course I couldn't resist ragging on Cisco again.

I won't bother to mention the recent 10Gbps Cisco Nexus test results that show how easily you can cripple its performance as well (while other manufacturers perform properly at non-blocking line rates), maybe I will save that for another blog entry.

Just think, there is more throughput available to a single slot in an HP c7000 chassis than there is available to the entire chassis on a UCS. If you give Cisco the benefit of the second fabric module, setting aside the fact you can't use it in active-active, the HP c7000 enclosure has 32 times the throughput capacity of the Cisco UCS. That kind of performance gap even makes Cisco's switches look bad by comparison.
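One way to arrive at that 32x figure, sketched with the per-slot and per-chassis numbers used earlier in this post (160Gbps per c7000 slot, 16 slots, 80Gbps on UCS). The assumption here is that both sides are counted the same way; the helper name is made up for the example.

```python
C7000_SLOTS = 16
C7000_SLOT_FABRIC_GBPS = 160
UCS_FABRIC_GBPS = 80


def chassis_fabric_gbps(slots, per_slot_gbps):
    """Aggregate fabric for a chassis given its per-slot bandwidth."""
    return slots * per_slot_gbps


if __name__ == "__main__":
    c7000_total = chassis_fabric_gbps(C7000_SLOTS, C7000_SLOT_FABRIC_GBPS)
    print(f"c7000 fabric: {c7000_total}Gbps ({2 * c7000_total}Gbps counting both directions)")
    print(f"c7000 vs UCS: {c7000_total / UCS_FABRIC_GBPS:.0f}x")
```

Counting both directions is where the "more than 5 Terabits" number for the c7000 mid plane comes from.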

17Nov/09Off

HP VirtualConnect for Dummies

TechOps Guy: Nate

Don't know what VirtualConnect is? Check this e-book out. Available to the first 2,500 people that register. I just browsed over it myself and it seems pretty good.

I am looking forward to using the technology sometime next year (trying to wait for the 12-core Opterons before getting another blade system). Certainly looks really nice on paper, and the price is quite good as well compared to the competition. It was first introduced I believe in 2006 so it's fairly mature technology.

3Nov/09Off

The new Cisco/EMC/Vmware alliance – the vBlock

TechOps Guy: Nate

Details were released a short time ago thanks to The Register on the vBlock systems coming from the new alliance of Cisco and EMC, who dragged along VMware (kicking and screaming I'm sure). The basic gist of it is to be able to order a vBlock and have it be a completely integrated set of infrastructure ready to go, servers and networking from Cisco, storage from EMC, and hypervisor from VMware.

vBlock0 consists of rack mount servers from Cisco, and unknown EMC storage, price not determined yet

vBlock1 consists of 16-32 blade servers from Cisco and an EMC CX4-480 storage system. Price ranges from $1M to $2.8M

vBlock2 consists of 32-64 blade servers from Cisco and an EMC V-MAX. Starting price $6M.

Sort of like FCoE, sounds nice in concept but the details fall flat on their face.

First off is the lack of choice. That is, Cisco's blades are based entirely on the Xeon 5500s, which are, you guessed it, limited to two sockets. And at least at the moment limited to four cores. I haven't seen word yet on whether the upcoming 8-core CPUs will be socket/chip set compatible with existing systems or not (if so, wonderful for them..). Myself I prefer more raw cores, and AMD is the one that has them today (Istanbul with 6 cores, Q1 2010 with 12 cores). But maybe not everyone wants that so it's nice to have choice. In my view HP blades win out here for having the broadest selection of offerings from both Intel and AMD. Combine that with their dense memory capacity (16 or 18 DIMM slots on a half height blade), and you get up to 1TB of memory in a blade chassis in an affordable configuration using 4GB DIMMs. Yes Cisco has their memory extender technology but again IMO, at least with the dual socket Xeon 5500 that it is tied to, the CPU core:memory ratio is way outta whack. It may make more sense when we have 16, 24, or even 32 cores on a system using this technology. I'm sure there are niche applications that can take advantage of it on a dual socket/quad core configuration, but the current Xeon 5500 is really holding them back with this technology.
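The 1TB per chassis figure is easy to sanity check. A small sketch, assuming a c7000 full of 16 half height blades with 4GB DIMMs in every slot; the function name is only for illustration.

```python
HALF_HEIGHT_BLADES_PER_CHASSIS = 16


def chassis_memory_gb(dimm_slots_per_blade, dimm_size_gb,
                      blades=HALF_HEIGHT_BLADES_PER_CHASSIS):
    """Total memory in one chassis with every DIMM slot populated."""
    return blades * dimm_slots_per_blade * dimm_size_gb


if __name__ == "__main__":
    for slots in (16, 18):
        total = chassis_memory_gb(slots, 4)
        print(f"{slots} DIMM slots per blade with 4GB DIMMs: {total}GB per chassis")
```

With 16 slots per blade that is 1,024GB, and the 18 slot blades push it to 1,152GB.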

Networking, it's all FCoE based, I've already written a blog entry on that, you can read about my thoughts on FCoE here.

Storage, you can see how even with the V-MAX EMC hasn't been able to come up with a storage system that can start on the smaller end of the scale, something that is not insanely unaffordable to 90%+ of the organizations out there. So on the more affordable end they offer you a CX4. If you are an organization that is growing you may find yourself outliving this array pretty quickly. You can add another vBlock, or you can rip and replace it with a V-MAX which will scale much better, but of course the entry level pricing for such a system makes it unsuitable for almost everyone to try to start out with even on the low end.

I am biased towards 3PAR of course as both of the readers of the blog know, so do yourself a favor and check out their F and T series systems. If you really think you want to scale high go for a 2-node T800, the price isn't that huge, the only difference between a T400 and a T800 is the backplane. They use "blocks" to some extent, blocks being controllers (in pairs, up to four pairs) and disk chassis (40 disks per chassis, up to 8 per controller pair I think). Certainly you can't go on forever, or can you? If you don't imagine you will scale to really massive levels go for a T400 or even an F400. In all cases you can start out with only two controllers; the additional cost to give you the option of an online upgrade to four controllers is really trivial, and offers nice peace of mind. You can even go from a T400 to a T800 if you wanted, you just need to switch out the backplane (downtime involved). The parts are the same! The OS is the same! How much does it cost? Not as much as you would expect. When 3PAR announced their first generation 8-node system 7 years ago, entry level price started at $100k. You also get nice things like their thin built-in technology which will allow you to run those eager zeroed VMs for fault tolerance and not consume any disk space or I/O for the zeros. You can also get multi level synchronous/asynchronous replication for a fraction of the cost of others. I could go on all day but you get the idea. There are so many fiber ports on the 3PAR arrays that you don't need a big SAN infrastructure, just hook your blade enclosures directly to the array.

And as for networking, hook your 10GbE Virtual Connect switches on your c Class enclosures to your existing infrastructure. I am hoping/expecting HP to support 10GbaseT soon, and drop the CX4 passive copper cabling. The Extreme Networks Summit X650 stands alone as the best 1U 10GbE (10GbaseT or SFP+) switch on the market. Whether it is line rate, or full layer 3, or high speed stacking, or lower power consuming 10GbaseT vs fiber optics, or advanced layer 3 networking protocols to simplify management, price and ease of use -- nobody else comes close. If you want bigger check out the Black Diamond 8900 series.

Second you can see with their designs that after the first block or two the whole idea of a vBlock sort of falls apart. That is, pretty quickly you're likely to just be adding more blades (especially if you have a V-MAX), rather than adding more storage and more blades.

Third you get the sense that these aren't really blocks at all. The first tier is composed of rack mount systems, the second tier is blade systems with CX4, the third tier is blade systems with V-MAX. Each tier has something unique which hardly makes it a solution you can build as a "block" as you might expect from something called a vBlock. Given the prices here I am honestly shocked that the first tier is using rack mount systems. Blade chassis do not cost much, I would have expected them to simply use a blade chassis with just one or two blades in it. Really shows that they didn't spend much time thinking about this.

I suppose if you treated these as blocks in their strictest sense and said yes we won't add more than 64 blades to a V-MAX, and add it like that you could get true blocks, but I can imagine the amount of waste doing something like that is astronomical.

I didn't touch on VMware at all, I think their solution is solid, and they have quite a few choices. I'm certain with this vBlock they will pimp the Enterprise Plus version of software, but I really don't see a big advantage of that version with such a small number of physical systems (a good chunk of the reason to go to that is improved management with things like host profiles and distributed switches). As another blogger recently noted, VMware has everything to lose out of this alliance. I'm sure they have been fighting hard to maintain their independence and openness, and this reeks of the opposite. They will have to stay on their toes for a while when dealing with their other partners like HP, IBM, NetApp, and others..