TechOpsGuys.com Diggin' technology every day

October 7, 2010

Testing the limits of virtualization

Filed under: Datacenter, Virtualization — Nate @ 11:24 pm

You know I’m a big fan of the AMD Opteron 6100 series processors, and also a fan of the HP c-Class blade system, specifically the BL685c G7, which was released on June 21st. I was and am very excited about it.

It is interesting to think that it really wasn’t that long ago that blade systems still weren’t all that viable for virtualization, primarily because they lacked memory density; so many of them offered a paltry 2 or maybe 4 DIMM sockets. That was my biggest complaint with them for the longest time. About a year or a year and a half ago that really started shifting. We all know Cisco bought some small startup a few years ago for its memory extender ASIC, but you know I’m not a Cisco fan, so I won’t give them any more real estate in this blog entry; I have better places to spend my mad typing skills.

A little over a year ago HP released their Opteron G6 blades; at the time I was looking at the half-height BL485c G6 (guessing at the model here, too lazy to check). It had 16 DIMM sockets, which was just outstanding. The company I was with at the time really liked Dell (you know I hate Dell by now I’m sure), and when I poked around their site they had no answer to that (they have since introduced answers); the highest capacity half-height blade they had at the time was 8 DIMM sockets.

I had always assumed that the more advanced design of the HP blades meant you paid a huge premium, but wow, I was surprised at the real-world pricing, more so at the time because you of course needed significantly higher density memory modules in the Dell model to compete with the HP model.

Anyways, fast forward to the BL685c G7, powered by the Opteron 6174, a 12-core 2.2GHz 80W processor.

Load a chassis up with eight of those:

  • 384 CPU cores (roughly 845GHz of compute)
  • 4 TB of memory (512GB/server w/32x16GB each)
  • 6,750 Watts @ 100% load (feel free to use HP dynamic power capping if you need it)

I’ve thought long and hard over the past 6 months on whether to go with 8GB or 16GB DIMMs, and all of my virtualization experience has taught me that in every case I’m memory (capacity) bound, not CPU bound. I mean, it wasn’t long ago we were building servers with only 32GB of memory in them!!!

There is indeed a massive premium associated with going with 16GB DIMMs, but if your capacity utilization is anywhere near the industry average then it is well worth investing in those DIMMs for this system. Going from 2TB to 4TB of memory using 8GB DIMMs in this configuration means buying a second chassis and the associated rack/power/cooling plus hypervisor licensing. You can easily halve your costs by just taking the jump to 16GB DIMMs and keeping it in one chassis (or at least 8 blades – maybe you want to split them between two chassis, I’m not going to get into that level of detail here).
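Here is a rough Python sketch of that trade-off. The 8 blades and 32 DIMM sockets per blade come from the configuration above; every dollar figure is a made-up placeholder (not a quote), so plug in your own pricing before drawing conclusions.

# Back-of-the-envelope sketch of the 8GB vs 16GB DIMM decision.
# All prices below are hypothetical placeholders, not real quotes.

DIMM_8GB_PRICE = 300               # hypothetical price per 8GB DIMM
DIMM_16GB_PRICE = 900              # hypothetical price per 16GB DIMM (the premium)
SECOND_CHASSIS_OVERHEAD = 60_000   # hypothetical: 2nd enclosure, extra blades, rack/power/cooling
HYPERVISOR_PER_BLADE = 7_000       # hypothetical per-blade licensing cost

BLADES = 8                         # blades per c7000 (from the post)
DIMM_SOCKETS = 32                  # DIMM sockets per BL685c G7 (from the post)

# Option A: 4TB in one chassis, 32 x 16GB = 512GB per blade
option_a = BLADES * DIMM_SOCKETS * DIMM_16GB_PRICE

# Option B: 4TB with 8GB DIMMs means a second chassis at 256GB per blade,
# plus licensing the extra 8 blades
option_b = (2 * BLADES * DIMM_SOCKETS * DIMM_8GB_PRICE
            + SECOND_CHASSIS_OVERHEAD
            + BLADES * HYPERVISOR_PER_BLADE)

print(f"16GB DIMMs, one chassis:  ${option_a:,}")
print(f"8GB DIMMs, two chassis:   ${option_b:,}")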

Low power memory isn’t available in 16GB DIMMs, so the power usage jumps by about 1.2kW per enclosure for 512GB/server vs 256GB/server. A small price to pay, really.

So onto the point of my post – testing the limits of virtualization. When you’re running 32, 64, 128 or even 256GB of memory on a VM server, that’s great, you really don’t have much to worry about. But step it up to 512GB of memory and you might just find yourself maxing out the capabilities of the hypervisor. In vSphere 4.1, for example, you are limited to only 512 vCPUs per host and only 320 powered-on virtual machines. So it really depends on your memory requirements. If you’re able to achieve massive amounts of memory deduplication (myself, I have not had much luck here with Linux, it doesn’t dedupe well; Windows seems to dedupe a lot though), you may find yourself unable to fully use the memory on the system, because you run out of the ability to fire up more VMs! I’m not going to cover other hypervisor technologies, they aren’t worth my time at this point, but like I mentioned I do have my eye on KVM for future use.

Keep in mind 320 VMs is only about 6.6 VMs per CPU core on a 48-core server. That to me is not a whole lot for the workloads I have personally deployed in the past. Now of course everybody’s workloads are different.
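Here is a minimal Python sketch of which ceiling you hit first on one of these hosts. The 320 powered-on VM and 512 vCPU per-host limits are the vSphere 4.1 numbers mentioned above; the per-VM sizes and dedupe ratios in the examples are purely illustrative assumptions.

# Which ceiling do you hit first on one 48-core, 512GB host under vSphere 4.1?
# The 320 powered-on VM and 512 vCPU host limits are from the post; the per-VM
# sizings below are just illustrative assumptions.

HOST_MEM_GB = 512
HOST_CORES = 48
MAX_VMS = 320        # vSphere 4.1 powered-on VMs per host
MAX_VCPUS = 512      # vSphere 4.1 vCPUs per host

def bottleneck(vm_mem_gb, vm_vcpus, dedupe_ratio=1.0):
    """Return the binding limit for a host full of identical VMs.

    dedupe_ratio > 1.0 models page sharing stretching the physical memory."""
    by_memory = int(HOST_MEM_GB * dedupe_ratio // vm_mem_gb)
    by_vm_count = MAX_VMS
    by_vcpu = MAX_VCPUS // vm_vcpus
    limits = {"memory": by_memory, "VM count": by_vm_count, "vCPUs": by_vcpu}
    name = min(limits, key=limits.get)
    return name, limits[name], limits

for mem, vcpu, dedupe in [(2, 1, 1.0), (1, 1, 1.5), (4, 2, 1.0)]:
    which, count, limits = bottleneck(mem, vcpu, dedupe)
    print(f"{mem}GB/{vcpu} vCPU VMs, dedupe x{dedupe}: "
          f"{count} VMs, bound by {which} {limits}")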

But it got me thinking. The Register has been touting, off and on for the past several months, every time a new Xeon 7500-based system launches, ooh, you can get 1TB of RAM in the box. Or in the case of the big new bad-ass HP 8-way system, 2TB of RAM. Setting aside the fact that vSphere doesn’t go above 1TB per host, even if you go to 1TB I bet in most cases you will run out of virtual CPUs before you run out of memory.

It was interesting to see the hypervisor technology really exploiting the hardware very well in the “early” years, and now we see the real possibility of hitting a scalability wall, at least as far as a single system is concerned. I have no doubt that VMware will address these scalability issues; it’s only a matter of time.

Are you concerned about running your servers with 512GB of RAM? After all, that is a lot of “eggs” in one basket (as one expert VMware consultant I know & respect put it). For me, at smaller scales, I am really not too concerned. I have been using HP hardware for a long time and on the enterprise end it really is pretty robust. My biggest concern is memory failure, or memory errors. Fortunately HP has had Advanced ECC for a long time now (I think I remember even seeing it in the DL360 G2 back in ’03).

HP’s Advanced ECC spreads the error correction over four different ECC chips, and it really does provide quite robust memory protection. When I was dealing with cheap crap white box servers the #1 problem BY FAR was memory; I can’t tell you how many memory sticks I had to replace, it was sick. The systems just couldn’t handle the errors (yes, all the memory was ECC!).

By contrast, I honestly can’t even think of a time an enterprise HP server failed (e.g. crashed) due to a memory problem. I recall many times seeing the little amber status light come on, logging into the iLO and saying, oh, memory errors on stick #2, so I go replace it. But no crash! There was a firmware bug in the HP DL585 G1s I used to use that would cause them to crash if too many errors were encountered, but that was a bug that was fixed years ago, not a fault with the system design. I’m sure there have been other such bugs here and there, nothing is perfect.

Dell introduced their version of Advanced ECC about a year ago, but it doesn’t (or at least didn’t, maybe it does now) hold a candle to the HP stuff. The biggest issue with the Dell version of Advanced ECC was that if you enabled it, it disabled a bunch of your memory sockets! I could not get an answer out of Dell support at the time as to why it did that. So I left it disabled because I needed the memory capacity.

So combine Advanced ECC with ultra-dense blades with 48 cores and 512GB of memory apiece and you’ve got yourself a serious compute resource pool.

Power/cooling issues aside (maybe if you’re lucky you can get into SuperNAP down in Vegas), you can get up to roughly 1,500 CPU cores and 16TB of memory in a single cabinet. That’s just nuts! WAY beyond what you’d expect to be able to support in a single VMware cluster (being that you’re limited to 3,000 powered-on VMs per cluster, the density would be only about 2 VMs/core and 5GB/VM!)

And if you manage to get a 47U rack, well, you can get one of those c3000 chassis in the rack on top of the four c7000s and pick up another 2TB of memory and 192 cores. We’re talking power kicking up into the 27kW range in a single rack! Like I said, you need SuperNAP or the like!

Think about that for a minute: 1,500 CPU cores and 16TB of memory in a single rack. Multiply that by, say, 10 racks: 15,000 CPU cores and 160TB of memory. How many tens of thousands of physical servers could be consolidated into that? A conservative number may be 7 VMs/core; you’re talking 105,000 physical servers consolidated into ten racks. Well, excluding storage of course. Think about that! Insane! I mean, that’s consolidating multiple data centers into a high density closet! That’s taking tens to hundreds of megawatts of power off the grid and consolidating it into a measly 250 kW.
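For the curious, here is the arithmetic behind those figures as a small Python sketch. The per-blade and per-chassis numbers come from earlier in the post; the 7 VMs/core consolidation ratio is the same rough assumption as above.

# Rough rack/cluster math behind the figures above. Chassis specs come from
# earlier in the post; the VMs-per-core ratio is an assumption for illustration.

CORES_PER_BLADE = 48          # 4 x Opteron 6174
MEM_PER_BLADE_GB = 512
BLADES_PER_C7000 = 8
C7000_PER_RACK = 4            # four 10U enclosures in a 42U rack

cores_per_rack = C7000_PER_RACK * BLADES_PER_C7000 * CORES_PER_BLADE
mem_per_rack_tb = C7000_PER_RACK * BLADES_PER_C7000 * MEM_PER_BLADE_GB / 1024
print(f"Per rack: {cores_per_rack} cores, {mem_per_rack_tb:.0f}TB memory")

# vSphere 4.1 cluster ceiling of 3,000 powered-on VMs against one such rack
CLUSTER_VM_LIMIT = 3000
print(f"Cluster limit density: {CLUSTER_VM_LIMIT / cores_per_rack:.1f} VMs/core, "
      f"{mem_per_rack_tb * 1024 / CLUSTER_VM_LIMIT:.1f}GB/VM")

# Ten racks consolidating legacy physical servers at an assumed 7 VMs/core
RACKS = 10
VMS_PER_CORE = 7              # assumption
consolidated = RACKS * cores_per_rack * VMS_PER_CORE
print(f"{RACKS} racks: {RACKS * cores_per_rack} cores, "
      f"~{consolidated:,} physical servers consolidated at {VMS_PER_CORE} VMs/core")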

I built out what was to me some pretty beefy server infrastructure back in 2005, a roughly $7 million project. Part of it included roughly 300 servers in roughly 28 racks, with 336kW of power provisioned for those servers.

Think about that for a minute. And re-read the previous paragraph.

I have thought for quite a while that because of this trend, well, there just won’t be as many traditional network guys or server guys around going forward. When you can consolidate that much crap into that small a space, it’s just astonishing.

One reason I really do like the Opteron 6100 is the CPU cores, just raw cores. And they are pretty fast cores too. The more cores you have, the more things the hypervisor can do at the same time, and there is no possibility of contention like there is with hyperthreading. CPU processing capacity has gotten to a point, I believe, where raw per-core performance matters much less than getting more cores in the boxes. More cores means more consolidation. After all, industry utilization rates for CPUs are typically sub 30%. Though in my experience it’s typically sub 10%, and a lot of the time sub 5%. My own server sits at less than 1% CPU usage.

Now, fast raw speed is still important in some applications of course. I’m not one to promote the usage of a 100-core CPU with each core running at 100MHz (10GHz in aggregate); there is a balance that has to be achieved, and I really do believe the Opteron 6100 has achieved that balance. I look forward to the 6200 (the socket-compatible 16-core part). Ask anyone who has known me this decade: I have not been AMD’s strongest supporter for a very long period of time. But I see the light now.

October 6, 2010

Who’s next

Filed under: Networking, Random Thought — Nate @ 9:42 pm

I was thinking about this earlier this week, or maybe late last week, I forget.

It wasn’t long ago that IBM acquired Blade Network Technologies, a long-time partner of IBM; Blade made a lot of switches for the BladeCenter, and for the HP blade system as well I believe.

I don’t think Blade Networks was really well known outside of their niche of being a back-end supplier to HP and IBM (and maybe others, I don’t recall and haven’t checked recently). I certainly never heard of them until the past year or two, and I do keep my eyes out for such companies.

Anyways, that is what started my train of thought. The next step in the process was watching several reports on CNBC about companies pulling their IPOs due to market conditions, which to me is confusing considering how high the “market” has climbed recently. It apparently just boils down to investors and IPO companies not being able to agree on a “market price” or whatever. I don’t really care what the reason is, but the point is this — earlier this year Force10 Networks filed for an IPO, and we haven’t heard much of a peep since.

Given the recent fight over 3PAR between Dell and HP, and the continuing saga of stack wars, it got me speculating.

What I think should happen is Dell should go buy Force10 before they IPO. Dell obviously has no networking talent in house; last I recall their PowerConnect crap was OEM’d from someone like SMC or one of those really low-tier providers. I remember someone else making the decision to use that product last year, and then when we tried to send 5% of our network traffic to the site that was running those switches they flat out died; we had to get remote hands to reboot them. Then shortly afterwards one of them bricked itself while upgrading the firmware and had to be RMA’d. I just pointed and laughed, since I knew it was a mistake to go with them to begin with; the people making the decisions just didn’t know any better. Several outages later they ended up replacing them, and I taught them the benefits of a true layer 3 network, no more static routes.

Then HP should go buy Extreme Networks, which is my favorite network switching company; I think HP could do well with them. Yes, we all know HP bought 3COM last year, but we also know HP didn’t buy 3COM for the technology (no matter what the official company line is), they bought them for their presence in China. 3COM was practically a Chinese company by the time HP bought them, really! And yes, I did read the news that HP finished kicking Cisco out of their data centers, replacing their stuff with a combination of ProCurve and 3COM. Juniper tried & failed to buy Extreme a few years ago, shortly after they bought NetScreen.

That would make my day though, a c-Class blade system with an Extreme XOS-powered VirtualConnect Ethernet fabric combined with 3PAR storage on the back end. Hell, that’d make my year 🙂

And after that, given that HP bought Palm earlier in the year (yes, I own a Palm Pre, mainly so I can run older Palm apps, otherwise I’d still be on a feature phone), and given that HP likes the consumer space, they should go buy TiVo and break into the set-top box market. Did I mention I use TiVo too? I have 3 of them.

Amazon EC2: Not your father’s enterprise cloud

Filed under: Datacenter — Nate @ 9:00 am

OK, so obviously I am old enough that my father did not have clouds back in his day, well, not the infrastructure clouds that are offered today; I was just trying to think of a somewhat zingy title. And I understand enterprise can have many meanings depending on the situation, it could mean a bank that needs high uptime for example. In this case I use the term enterprise to signify the need for 24×7 operation.

Here I am, once again working on stuff related to “the cloud”, and it seems like everything “cloud” revolves around EC2.

Even after all the work I have done recently, and over the past year or two, with regard to cloud proposals, I don’t know why it didn’t hit me until probably the past week or so, but it did (sorry if I’m late to the party).

There are a lot of problems with running traditional infrastructure in the Amazon cloud, as I’m sure many have experienced first hand. That wasn’t the realization that occurred to me, of course.

The realization was that there isn’t a problem with the Amazon cloud itself, but there is a problem with how it is:

  • Marketed
  • Targeted

Which leads to people using the cloud for things it was never intended to be used for. In regard to Amazon, one need look no further than their SLA on EC2 to immediately rule it out for any sort of “traditional” application, which includes:

  • Web servers
  • Database servers
  • Any sort of multi-tier application
  • Anything that is latency sensitive
  • Anything that is sensitive to security
  • Really, anything that needs to be available 24×7

Did you know that if they lose power to a rack, or even a row of racks, that is not considered an outage? It’s not as if they provide you with the knowledge of where your infrastructure is in their facilities; they’d rather you just pay them more and put things in different zones and regions.

Their SLA says in part that they can in fact lose an entire data center (what Amazon calls an “availability zone”) and that’s not considered an outage. Amazon describes availability zones like this:

Additionally, they are physically separate, such that even extremely uncommon disasters such as fires, tornados or flooding would only affect a single Availability Zone.

And while I can’t find it on their site at the moment, I swear not too long ago their SLA included a provision that said even if they lost TWO data centers it’s still not an outage unless you can’t spin up new systems in a THIRD. Think of how many hundreds to thousands of servers are knocked offline when an Amazon data center becomes unavailable. I think they may have removed the two-availability-zone clause because not all of their regions have more than two zones (last I checked only us-east did, but maybe more have them now).

I was talking to someone who worked at Amazon not too long ago and had in fact visited the us-east facilities, and they said all of the availability zones were in the same office park, really quite close to each other. They may have had different power generators and such, but quite likely if a tornado or flooding hit, more than one zone would be impacted; likely the entire region would go out (that is Amazon’s code word for saying all availability zones are down). While I haven’t experienced it first hand, I know of several incidents that impacted more than one availability zone, indicating that there are more things shared between them than customers are led to believe.

Then there is the extremely variable performance & availability of the services as a whole. On more than one occasion I have seen Amazon reboot the underlying hardware without any notification (note they can’t migrate the workloads off the machine! anything on the machine at the time is killed!). I also love how unapologetic they are when it comes to things like data loss. Basically they say you didn’t replicate the data enough times, so it’s your fault. Now I can certainly understand that bad things happen from time to time, that is expected; what is not expected though is how they handle it. I keep thinking back to this article I read on The Register a couple years ago, good read:

Once you’re past that, there’s the matter of reliability. In my experience with it, EC2 is fairly reliable, but you really need to be on your shit with data replication, because when it fails, it fails hard. My pager once went off in the middle of the night, bringing me out of an awesome dream about motorcycles, machine guns, and general ass-kickery, to tell me that one of the production machines stopped responding to ping. Seven or so hours later, I got an e-mail from Amazon that said something to the effect of:

There was a bad hardware failure. Hope you backed up your shit.

Look at it this way: at least you don’t have a tapeworm.

-The Amazon EC2 Team

I’m sure I have quoted it before in some posting somewhere, but it’s such an awesome and accurate description.

So go beyond the SLAs, go beyond the performance and availability issues.

Their infrastructure is “built to fail”, which is a good concept at very large scale; I’m sure every big web-type company does something similar. The concept really falls apart at small scale though.

Everyone wants to get to the point where they have application level high availability and abstract the underlying hardware from both a performance and reliability standpoint. I know that, you know that. But what a lot of the less technical people don’t understand is that this is HARD TO DO. It takes significant investments in time & money to pull off. And at large scale these investments do pay back big. But at small scale they can really hurt you. You spend more time building your applications and tools to handle unreliable infrastructure when you could be spending time adding the features that will actually make your customers happy.

There is a balance there, as with anything. My point is that with the Amazon cloud those concepts are really forced upon you, if you want to use their service as a more “traditional” hosting model. And the overhead associated with that is ENORMOUS.

So back to my point: the problem isn’t with Amazon itself, it’s with whom it is targeted at and the expectations around it. They provide a fine service, if you use it for what it was intended. EC2 stands for “Elastic Compute Cloud”, and the first thing that comes to my mind when I hear that kind of term is HPC-type applications, data processing, back-end type stuff that isn’t latency sensitive and is built to tolerate infrastructure failure.

But even then, that concept falls apart if you have a need for 24×7 operations. The cost model, even of Amazon, the low-cost “leader” in cloud computing, doesn’t hold water vs doing it yourself.

Case in point: earlier in the year, at another company, I was directed to go on another pointless expedition comparing the Amazon cloud to doing it in house for a data-intensive 24×7 application. That was not even taking into account the latency introduced by S3, the operational overhead of EC2, or the performance and availability problems. Assuming everything worked PERFECTLY, or at least as well as physical hardware, the ROI for keeping the project in house was less than 7 months (I re-checked the numbers and revised the ROI from the original 10 months to 7 months; I was in a hurry writing this morning before work). And this was for good quality hardware with 3 years of NBD on-site support, not scraping the bottom of the barrel. To give you an idea of the savings: after those 7 months, for each and every month after that, it could more than pay for my yearly salary, benefits and the other expenses a company has for an employee.
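To show the shape of that comparison (and only the shape; every dollar figure below is a hypothetical placeholder, not a number from that project), here is a minimal Python sketch of the break-even math.

# Sketch of a cloud vs in-house break-even calculation. Every dollar figure
# here is a made-up placeholder; the point is the shape of the math.

capex_in_house = 400_000        # hypothetical: servers, storage, 3yr NBD support
monthly_in_house = 8_000        # hypothetical: colo space, power, bandwidth
monthly_cloud = 70_000          # hypothetical: EC2 + S3 + transfer for a 24x7 load

def breakeven_months(capex, monthly_own, monthly_cloud):
    """First month where cumulative in-house cost drops below cloud cost."""
    assert monthly_cloud > monthly_own, "no break-even if cloud is cheaper per month"
    month = 0
    while True:
        month += 1
        own = capex + monthly_own * month
        cloud = monthly_cloud * month
        if own <= cloud:
            return month, cloud - own

months, ahead = breakeven_months(capex_in_house, monthly_in_house, monthly_cloud)
print(f"Break-even at month {months}; "
      f"saving ${monthly_cloud - monthly_in_house:,} every month after that "
      f"(already ${ahead:,} ahead at the break-even point)")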

OK, so we’re past that point now. Onto a couple of really cool slides I came up with for a pending presentation, which I really think illustrate the Amazon cloud quite well, another one of those “a picture is worth fifty words” type of things. The key point here is capacity utilization.

What has using virtualization over the past half decade (give or take..) taught us? What have the massive increases in server and storage capacity taught us? Well, they taught me that applications no longer have the ability to exploit the capacity of the underlying hardware. There are very rare exceptions to this, but in general, over at least the past 15 years of my experience, applications really have never had the ability to exploit the underlying capacity of the hardware. How many systems do you see averaging under 5% CPU? Under 3%? Under 2%? How many systems do you see with disk drives that are 75% empty? 80%?

What else has virtualization given us? It’s given us the opportunities to logically isolate workloads into different virtual machines, which can ease operational overhead associated with managing such workloads, both from a configuration standpoint as well as a capacity planning standpoint.

That’s my point. Virtualization has given us the ability to consolidate these workloads onto fewer resources. I know this is a point everyone understands, I’m not trying to make people look stupid, but my point here with regard to Amazon is that their model doesn’t take us forward — it takes us backward. Here are those two slides that illustrate this:

[Two presentation slides illustrating the capacity utilization point; full-size images not reproduced here.]

Not all cloud providers are created equal of course. The Terremark Enterprise Cloud (not vCloud Express, mind you), for example, is resource-pool based. I have no personal experience with their enterprise cloud (I am a vCloud Express user for my personal stuff [2x 1 vCPU servers, including the server powering this blog!]), though I did interact with them pretty heavily earlier in the year on a big proposal I was working on at the time. I’m not trying to tell you that Terremark is more or less cost effective, just that they don’t reverse several years of innovation and progress in the infrastructure area.

I’m sure Terremark is not the only provider that can provide resources based on resource pools instead of hard per-VM allocations. I just keep bringing them up because I’m more familiar with their stuff due to several engagements with them at my last company (none of which ever resulted in that company becoming a customer). I originally became interested in Terremark because I was referred to them by 3PAR, and I’m sure by now you know I’m a fan of 3PAR; Terremark is a very heavy 3PAR user. And they are a big VMware user, and you know I like VMware by now, right?

If Amazon would be more, what is the right word, honest? Up front? Better at setting expectations? I think their customers would be better off; mainly they would have fewer of them, because such customers would realize what that cloud is made for, rather than trying to fit a square peg into a round hole. If you whack it hard enough you can usually get it in, but, well, you know what I mean.

As this blog entry now exceeds 1,900 words I feel I should close it off. If you read this far, hopefully I made some sense to you. I’d love to share more of my presentation as I feel it’s quite good, but I don’t want to give all of my secrets away 🙂

Thanks for reading.

October 5, 2010

HP Launches new denser SL series

Filed under: Datacenter, News — Nate @ 11:49 am

[domain name transfer still in progress but at least for now I managed to update the name servers to point to mine so the blog is being directed to the right server now]

Getting closer! Not quite there yet though.

Earlier in the year I was looking at the HP SL6000 series of systems for a project that needed high efficiency and density.

The biggest drawback to the system, in my opinion, was that it wasn’t dense enough; it was no denser than 1U servers for the configuration I was looking at (needing 4x 3.5″ drives per system). It was more power efficient though, and hardware serviceability was better.

The limitation was in the chassis, and HP acknowledged this at the time, saying they were working on a new and improved version but it wasn’t available yet. Well, it looks like they launched it today, in the form of the SL6500. It seems to deliver (on the statements HP gave me earlier in the year); I don’t see much info on the chassis itself on their site, but it looks significantly denser, the key being that the chassis is a lot deeper than the original 2U design.

But they still have a ways to go; as far as I know the SGI Cloudrack C2 is the density leader in this space, at least going by the material that is publicly available. Who knows what the likes of IBM/Dell/HP come up with behind the scenes for special customers.

Earlier this year I did what was to me a pretty neat comparison of the power efficiency of the Cloudrack against the 3PAR T-class storage enclosures (granted, the density technology behind the 3PAR is 8 years old at this point and they haven’t felt the need to go denser, though HP may encourage them to, since they waste up to 10U of space in each of their racks; then again, weight and power can become issues in many facilities even at the density 3PAR can already achieve).

Anyways, onto the comparison. This is one place where the picture tells the story, pretty crazy huh? Yeah, I know the products are aimed at very different markets, I just thought it was a pretty crazy comparison.

You can think of the Cloudrack as one giant chassis. The rack is the chassis (literally). So while HP has gone from a 2U chassis to a 4U chassis, SGI is waiting for them with a 38U chassis. Another nice advantage of the Cloudrack is you can get true N+1 power (3 diverse power sources); most systems can only support two power sources, while the Cloudrack can go much, MUCH higher. And with the power supplies built into the chassis, the servers can benefit from that extra fault tolerance and high efficiency (no fans or power supplies in the servers! Same as the HP SL series).

Keeping TechOpsGuys around a bit longer

Filed under: General — Nate @ 7:52 am

Well, before the domain transferred, Robin from StorageMojo sent a good comment my way and it made sense. He’s a much more experienced blogger than I am, so I decided to take his advice and do a couple of things:

  • Keep the TechOpsGuys name for now – even though it’s just me – until I manage to find something better
  • Keep the original layout – it annoys me but I can live with it using the Firefox zoom feature (zoomed in to 150%)

Thanks, Robin, for the good suggestions (I don’t know enough about MySQL to recover the data since I did the original migration).

Maybe someone else will join my blogging in the future..

The old TechOpsGuys is officially dead.. Well, you may be able to hit it if you have the IP (not that you care!); my former partners in crime are welcome to contribute to the site still if they want.

I’ll bring up www.techopsguys.com again, probably this weekend, to rave about non-technical topics, so I can keep this site technical.. since I run the server I can run as many blogs as I want! Well, as many as I have time for..

October 3, 2010

Welcome to the new site

Filed under: General — Nate @ 1:43 pm

Hey there, new blog site.. I migrated the data from http://www.techopsguys.com/ (well, my posts at least). Let me know if you see anything that’s really broken. I had to edit a bunch of SQL to change the names, paths, etc., and put in symlinks to fix other things.. but I think it’s working… new theme too! Myself, I like to read things that are easier on the eyes in low(er) light levels; white is very.. bright! Hurts my eyes.

September 27, 2010

Bye Bye 3PAR, Hello HP!

Filed under: News, Storage — Nate @ 2:14 pm

Wow, that was fast! HP completed its purchase of 3PAR this morning.

HP today announced that it has completed the acquisition of 3PAR Inc., a leading global provider of utility storage, for a price of $33 per share in cash, or an enterprise value of $2.35 billion.

3PAR technologies expand HP’s storage portfolio into enterprise-class public and private cloud computing environments, which are key growth markets for HP. Complementary with HP’s current storage portfolio, 3PAR brings market-differentiating technology to HP that will enable clients to maximize storage utilization, balance workloads and automate storage tiering. This allows clients to improve productivity and more efficiently operate their storage networks.

With a worldwide sales and channel network, coupled with extensive service operations, HP is uniquely positioned to rapidly expand 3PAR’s market opportunity. As part of the HP Converged Infrastructure portfolio, which integrates servers, storage, networking and management technologies, 3PAR solutions will further strengthen HP’s ability to simplify data center environments for clients.

Further details on product integration will be announced at a later date.

Certainly not messing around!

September 26, 2010

Still waiting for Xiotech..

Filed under: Random Thought, Storage — Nate @ 2:55 pm

So I was browsing the SPC-1 pages again to see if there was anything new and lo and behold, Xiotech posted some new numbers.

But once again, they appear too timid to release numbers for their 7000 series, or the 9000 series that came out somewhat recently. Instead they prefer to extrapolate performance from their individual boxes and aggregate the results. That doesn’t count, of course; performance can be radically different at higher scale.

Why do I mention this? Well, nearly a year ago their CEO blogged in response to one of my posts, and that was one of the first times I made news in The Register (yay! I really was excited), and in part the CEO said:

Responding to the Techopsguy blog view that 3PAR’s T800 outperforms an Emprise 7000, the Xiotech writer claims that Xiotech has tested “a large Emprise 7000 configuration” on what seems to be the SPC-1 benchmark; “Those results are not published yet, but we can say with certainty that the results are superior to the array mentioned in the blog (3PAR T800) in several terms: $/IOP, IOPS/disk and IOPS/controller node, amongst others.”

So here we are almost a year later, and more than one SPC-1 result later, and still no sign of Xiotech’s SPC-1 numbers for their higher end units. I’m sorry but I can’t help but feel they are hiding something.

If I were them I would put my customers more at ease by publishing said numbers, and be prepared to justify the results if they don’t match up to Xiotech’s extrapolated numbers from the 5000 series.

Maybe they are worried they might end up like Pillar, whose CEO was pretty happy with their SPC-1 results. Shortly afterwards the 3PAR F400 launched and absolutely destroyed the Pillar numbers from every angle. You can see more info on those results here.

At the end of the day I don’t care of course, it just was a thought in my head and gave me something to write about 🙂

I just noticed that these past two posts put me over the top for the most posts I have done in a month since this TechOpsGuys thing started. I’m glad I have my friends Dave, Jake and Tycen generating tons of content too; after all, this site was their idea!

Overhead associated with scale out designs

Filed under: Random Thought — Nate @ 2:33 pm

Was reading a neat article over at The Register again about the new Google indexing system. This caught my eye:

“The TPC-E results suggest a promising direction for future investigation. We chose an architecture that scales linearly over many orders of magnitude on commodity machines, but we’ve seen that this costs a significant 30-fold overhead compared to traditional database architectures.”

Kind of makes you think… I guess if you’re operating at the scale they are, the overhead is not a big deal; they’ll probably find a way to reduce (ha ha, map reduce, get it? sorry) it over time.

September 23, 2010

Using open source: how do you give back?

Filed under: General, linux, Random Thought — Nate @ 10:11 pm

After reading an article on The Register (yeah, you probably realize by now I spend more time on that site than pretty much any other site online), I got thinking about a topic that bugs me.

The article is from last week and is written by the CEO of the organization behind Ubuntu. It basically talks about how using open source software is a good way to save costs in a down (or up) economy, and tries to give a bunch of examples of companies basing their stuff on open source.

That’s great, I like open source myself; I fired up my first Slackware Linux box in 1996 I think it was (Slackware 3.0). I remember picking Slackware over Red Hat at the time specifically because Slackware was known to be more difficult to use and it would force me to learn Linux the hard way, and believe me I learned a lot. To this day people ask me what they should study or do to learn Linux and I don’t have a good answer; I don’t have a quick and easy way to learn Linux the way I learned it. It takes time, months, even years of just playing around with it. With so many “easy” distributions these days I’m not sure how practical my approach is now, but I’m getting off topic here.

So back to what bugs me. What bugs me is the people out there, or more specifically the organizations out there, that do nothing but leech off of the open source community. Companies that may make millions (or billions!) in revenue in large part because they are leveraging free stuff. But it’s not the usage of the free stuff that I have a problem with, more power to them. I get annoyed when those same organizations feel absolutely no moral obligation to contribute back to those that have given them so much.

You don’t have to do much. Over the years the most that I have contributed back has been participating in mailing lists, whether it is the Debian users list (been many years since I was active there), or the Red Hat mailing list (a few years), or the CentOS mailing list (several months). I try to help where I can. I have a good deal of Linux experience, which often means the questions I have nobody else on the list has answers to. But I do (well, did) answer a ton of questions. I’m happy to help. I’m sure at some point I will re-join one of those lists (or maybe another one) and help out again, but I’ve been really busy these past few months. I remember even buying a bunch of Loki games to try to do my part in helping them (despite it not being open source, they were supporting Linux indirectly), several of which I never ended up playing (not much of a gamer). VMware of course was also a really early Linux supporter (I still have my VMware 1.0.2 Linux CD; I believe that was the first version they released on CD, previous versions were download only), though I have gotten tired of waiting for vCenter for Linux.

The easiest way for a corporation to contribute back is to use and pay for Red Hat Enterprise, or SuSE, or whatever. Pay the companies that hire the developers to make the open source software go. I’m partial to Red Hat myself, at least in a business environment, though I use Debian-based distributions in my personal life.

There are a lot of big companies that do contribute code back, and that is great too, if you have the expertise in house. Opscode is one such company; I have been working with them recently on their Chef product. They leverage all sorts of open source stuff in their product (which is itself open source). I asked them what their policy is for getting things fixed in the open source code they depend on, whether they just file bugs and wait or contribute code, and they said they contribute a bunch of code, constantly. That’s great, I have enormous respect for organizations like that.

Then there are the companies that leech off open source and not only don’t officially contribute in any way whatsoever, but actively prevent their own employees from doing so. That’s really frustrating & stupid.

Imagine where Linux, and everything else, would be if more companies contributed back. It’s not hard: go get a subscription to Red Hat, or Ubuntu, or whatever for your servers (or desktops!). You don’t have to contribute code, and if you can’t contribute back in the form of supporting the community on mailing lists, or helping out with documentation, or the wikis or whatever, write a check; you actually get something in return, it’s not like it’s a donation. But donations are certainly accepted by the vast number of open source non-profits.

HP has been a pretty big backer of open source for a long time, they’ve donated a lot of hardware to support kernel.org and have been long time Debian supporters.

Another way to give back is to leverage your infrastructure: if you have a lot of bandwidth or excess server capacity or disk space or whatever, set up a mirror, sponsor a project. Looking at the Debian page as an example, it seems AboveNet is one such company.

I don’t use open source everywhere, I’m not one of those folks who has to make sure everything is GPL or whatever.

So all I ask is that the next time you build or deploy some project that is made possible by who knows how many layers of open source products, you ask yourself how you can contribute back to support the greater good. If you have already, then I thank you 🙂

Speaking of Debian, did you know that Debian powers 3PAR storage systems? Well, it did at one point, I haven’t checked recently; I do recall telnetting to my arrays on port 22 and seeing a Debian SSH banner. The underlying Linux OS was never exposed to the user. And it seems 3PAR reports bugs, which is another important way to contribute back. And, as of 3PAR’s 2.3.1 release (I believe), they finally officially started supporting Debian as a platform to connect to their storage systems. By contrast they do not support CentOS.
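If you want to check that sort of thing yourself, here is a tiny Python equivalent of poking port 22 and reading the SSH version banner; the hostname and the sample output are placeholders, not from a real array.

import socket

def ssh_banner(host, port=22, timeout=5):
    """Return the first line the SSH server sends (its version banner)."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        return sock.recv(256).decode("ascii", errors="replace").strip()

if __name__ == "__main__":
    # A Debian-built OpenSSH typically advertises "Debian" in this string,
    # e.g. something along the lines of: SSH-2.0-OpenSSH_5.1p1 Debian-5
    print(ssh_banner("my-array.example.com"))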

Extreme Networks’ ExtremeWare XOS is also based on Linux, though I think it’s a special embedded version. I remember in the early days they didn’t want to admit it was Linux; they said “Unix based”. I just dug this up from a backup from back in 2005; once I saw this on my core switch booting up I was pretty sure it was Linux!

Extreme Networks Inc. BD 10808 MSM-R3 Boot Monitor
Version 1.0.1.5 Branch mariner_101b5 by release-manager on Mon 06/14/04
Copyright 2003, Extreme Networks, Inc.
Watchdog disabled.
Press and hold the <spacebar> to enter the bootrom.

Boot path is /dev/fat/wd0a/vmlinux
(elf)
0x85000000/18368 + 0x85006000/6377472 + 0x8561b000/12752(z) + 91 syms/
Running image boot…

Starting Extremeware XOS 11.1.2b3
Copyright (C) 1996-2004 Extreme Networks.  All rights reserved.
Protected by U.S. Patents 6,678,248; 6,104,700; 6,766,482; 6,618,388; 6,034,957

Then there’s my TiVo that runs Linux, my TV runs Linux (Philips TV), my QLogic FC switches run Linux, I know F5 equipment runs on Linux, and my phone runs Linux (Palm Pre). It really is pretty crazy how far Linux has come in the past 10 years. And I’m pretty convinced the GPL played a big part, making it more difficult to fork it off and keep the changes for yourself. A lot of momentum built up in Linux and companies and everyone just flocked to it. I do recall early F5 load balancers used BSDI, but they switched over to Linux (didn’t the company behind BSDI go out of business earlier this decade? or maybe they got bought, I forget). It seems Linux is everywhere and in most cases you never notice it. The only way I knew it was in my TV is because the instructions came with all sorts of GPL disclosures.

In theory the BSD licensing scheme should make the *BSDs much more attractive, but for the most part *BSD has not been able to keep pace with Linux (outside some specific niches; I do love OpenBSD‘s pf), so it never really got anywhere close to the critical mass Linux has.

Of course now someone will tell me about some big fancy device that runs BSD that is in every data center and every household and I don’t know it’s there! If I recall right, Juniper’s JunOS is based on FreeBSD, and I think Force10 uses NetBSD.

I also recall being told by some EMC consultants back in 2004/2005 that the EMC Symmetrix ran Linux too, and I do remember the Clariions of the time (at least, maybe still) ran Windows (probably because EMC bought the company that made that product rather than creating it themselves).

