TechOpsGuys.com Diggin' technology every day

September 4, 2015

Containers – my experiences good and bad

Filed under: linux — Tags: — Nate @ 6:46 pm

This is a follow-up post to an earlier post of mine responding to container hype (more specifically, perhaps, Docker hype).

I want to share some of my (albeit limited) real-world experience with containers (which play a part in generating well north of two hundred million dollars a year in revenue): the good and the bad, how I decided to make use of them, and where I see using them in the future.

I wrote a lot of this in a comment on el reg not too long ago, so I thought I would formalize it here so I can refer people to it if needed. Obviously I have much better control over formatting on a blog than in a comment box.

The case for containers

The initial use case for containers at my organization was very targeted at one specific web application. From a server perspective up until this point we were 100% virtualized under VMware ESXi Enterprise Plus.

This web application drives the core e-commerce engine of the business. It is a commercial product (though an open source version exists), and the license cost is north of $10,000/year per installation. So, for example, if you have a VMware server with 5 VMs on it, each running this application in production, you will pay north of $50,000/year in license fees/support for those 5 VMs. There is no license model where they license per CPU, per CPU core, or per physical host (at this time anyway).

The application can be very CPU hungry, and in the earliest days we ran the application stack in the Amazon cloud. In early 2012 we moved out and ran it in house on top of VMware. We allocated something like 4 vCPUs per web server running this application. We had 4 web servers active at any given time, though we had the ability to double that capacity very quickly if required.

Farm based software deployment

It was decided early on, before I joined the company, that the deployment model for the applications would be “farm” based. That is, we would have two “banks” of servers, “A” and “B”. Generally one bank would be active at any given time, and to deploy code we would deploy to the inactive servers and then “flip farms”, basically changing the load balancing routing to point users at the servers with the new code. While in Amazon the line of thinking was that we would “spin up” (on demand) new servers, deploy to them, make them live, then terminate the original servers (to save $$). Rinse and repeat. Reality set in and this never happened; the farms stayed up all the time (short of Amazon failures, which were very frequent relative to current failures anyway).

This model of farm deployments is the same model used at my previous company (with the original Ops director being the same person, so not a big surprise). Obviously it’s not the only way to deploy (these are the only two places I’ve worked at that deploy in this manner), but it works fine. My focus really is not on application deployment, so I have not had an interest in pushing another model.

When we moved to the data center, the cost of managing both farms was not much: inactive farms used very little CPU, and disk space was a non-issue (I have perfected log rotation and retention over the years, combined with LVM disk management, to maximize the efficiency of thin provisioning on 3PAR, and it runs really well). Memory was a factor to some degree, but at the end of the day it wasn’t a big deal.

Having the 2nd farm always running had another benefit. We could, on very short notice, activate the 2nd farm and essentially double our production server capacity. We did this (and continue to) for high load events. Obviously it does impact the ability to deploy code when in this situation, but we adapted to that a long, long time ago.

One big benefit of the farm approach is it makes rollbacks of application code very quick(10-30 seconds). The applications involved generally aren’t expected to operate with mixed versions of the application running simultaneously(obviously depends on the extent of the changes).

The process today which manages activating both “farms” simultaneously does perform a check of the application code on both farms, and will not allow them both to go active if the versions do not match.
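For illustration, a minimal sketch of that kind of safety check is below; the hostnames and the version-file path are hypothetical, and the real process is more involved.

#!/bin/bash
# Hypothetical sketch: refuse to activate both farms unless every server
# reports the same application build. Hostnames and paths are made up.
HOSTS="web-a-01 web-a-02 web-b-01 web-b-02"
VERSION_FILE=/opt/app/current/VERSION

versions=$(for host in $HOSTS; do
    ssh "$host" cat "$VERSION_FILE"
done | sort -u)

if [ "$(echo "$versions" | wc -l)" -ne 1 ]; then
    echo "Farms are running different code, refusing to activate both:" >&2
    echo "$versions" >&2
    exit 1
fi
echo "All servers report $(echo "$versions" | head -1); safe to activate both farms"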

Scaling the application

As a year or two passed, the CPU requirements of the application grew (in part due to traffic growth, in part due to bad code, etc.). We found ourselves during our high traffic period two years ago keeping both “farms” active for months at a time (making short exceptions for code deployment) to try to ensure we had sufficient capacity. This worked, but it wasn’t the most cost effective model to grow into. As traffic continued to rise, I wanted something (much) faster without breaking the bank.

Moving to physical hardware

Although we were 100% virtualized I did think a good strategy for this application was to move to physical hardware, for two main reasons:

  • Eliminate any overhead from hypervisor
  • I wanted to dedicate entire physical servers to this application, paying VMware license fees for basically a single application on one server seemed like a waste of $

I did not entertain the option of using one of the free hypervisors for four reasons:

  • Didn’t want overhead from the hypervisor
  • Nobody in the organization had solid experience with any other hypervisor
  • Didn’t want another technology stack to manage separately, just needless complexity
  • Xen and KVM aren’t nearly as solid as VMware; not solid enough for me to consider using them for this use case anyway.

So my line of thinking early on wasn’t containers. It was more likely going to be a single OS image with custom application configurations and directory structures, two apache instances (one for each “farm”) on each server, and the load balancer would just switch between the apache instances when “flipping farms”. I have done this before to some extent, as mentioned in the previous article on containers. It didn’t take long for me to kinda-sorta rule this out as a good idea for a couple of reasons:

  • The application configuration was going to be somewhat unique relative to all other environments (unless we changed all of them, which was possible, though quite a bit more work)
  • I was not entirely sure how easy it was going to be to get the application to run from two different paths and ensure that it operates correctly (maybe it would have been easy, I don’t know)

So at some point the idea of containers hit me and I decided to explore that as an option.

Benefit of containers for this use case

  • LXC being built into our existing Ubuntu 12.04 LTS operating system
  • Easily runs on physical hardware
  • “Partitions” the operating system into multiple instances so that they have their own directory structures, eliminating the need to reconfigure applications to work from a funky layout.
  • Allows me to scale a single container to the entire physical CPU horsepower of the server automatically, while limiting memory usage so the physical host does not run out of memory (see the sketch after this list).
  • Allows me to maintain two containers on each host (one for each “farm”), and eliminates the need to “activate both farms” for capacity since all of the capacity is already available.
  • Eliminates the $10,000+ VMware licensing fee.
  • Slashes the $10,000+/year per-installation application fee by reducing the number of systems required to run it in production, now and in the future.
  • Eliminates overhead of hypervisor
  • Eliminates dependency on SAN storage
  • Massive increase in available capacity, roughly 8 X the capacity of the previously virtualized “farm” of servers (or 4X the capacity of both farms combined). Means years of room to grow into without having to think about it.
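As a rough illustration of the memory-limit point above, here is a minimal sketch of what that looks like with classic LXC on Ubuntu 12.04. The container name, the limit value and the paths are hypothetical rather than our actual configuration, and on 12.04 the memory cgroup controller has to be enabled at boot (cgroup_enable=memory, and usually swapaccount=1) before these limits do anything.

# Hypothetical example: cap a container's memory while leaving CPU unrestricted,
# so one container can use every core but cannot exhaust the host's RAM.
cat >> /var/lib/lxc/web-farm-a/config <<'EOF'
# hard memory cap; no lxc.cgroup.cpu* entries, so all CPU cores remain available
lxc.cgroup.memory.limit_in_bytes = 32G
lxc.cgroup.memory.memsw.limit_in_bytes = 32G
EOF

lxc-start -n web-farm-a -d                       # start the container in the background
lxc-cgroup -n web-farm-a memory.limit_in_bytes   # read the effective limit back at runtime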

Limited use case

This is a very targeted deployment for containers. This is a highly available production web application where each server is basically an exact copy of the others. Obviously this means that if one physical host or container fails, the others continue processing without skipping a beat. There are three physical hosts in this case (HP DL380 Gen8 with dual Xeon E5-2695 v2 CPUs, 24 cores / 48 threads per host; I find it amusing to run top, tell it to show me all CPUs, and have it say “Sorry, terminal is not big enough“), and only one is required for current production loads (even on a high traffic day).

These systems are dedicated to this application. You might think that launching these on day one and seeing the CPU usage of the application go from ~45% to under 5% would make me say “oh, what a waste of hardware resources, let’s pile more containers on this.” No way. We saved an enormous amount in licensing costs for this application by doing this, easily enough to pay for the servers quite quickly. We also have capacity for a long time to come, and can handle any bursts in traffic without a worry. It was a concept that turned into a great success story for containers at my organization.

I listed eliminating the dependency on SAN storage as a bonus: these are the first physical servers that this organization has deployed with internal storage. Everything else boots from SAN (as if I am going to trust a $5 piece-of-crap USB flash memory stick for a hypervisor when I have multipath fibre channel available; the same goes for putting internal disks in the servers just for a tiny hypervisor). Obviously the big benefit of shared storage is being able to vMotion between hosts. You can’t do that with containers (as far as I am aware anyway), so we put 5 disks in each server: 4 of them in RAID 10, one hot spare, and 1GB of battery backed write cache.

So while I love my SAN storage, in this case it wasn’t needed, so we aren’t using it. Saved some costs and complexity on fibre channel cards and connectivity etc(not really an iSCSI fan for production systems).

I did somewhat dread the driver situation going to physical hardware, my last experiences with physical hardware with Linux several years ago were kind of frustrating with the drivers, I remember many times having to build custom kickstart disks for NIC drivers or storage drivers etc.. Fortunately this time around the stock drivers worked fine.

We also saved costs on networking. All of our VMware hosts have two dual-port 10GbE cards each, along with 2x1Gbps ports for management (a total of 11 cables coming out of each server). The container hosts, since they really only have one container active at a time, rely on just the 2x1Gbps ports, more than enough for a single container (a total of 5 cables coming out of each server).

No rapid build up or tear down

The original containers have been running continuously (short of a couple of reboots, and some OS patches) for well over a year at this point. They do not have a short life span.

Downsides to containers

No technology is perfect of course, and I did fairly quickly come across some very annoying limitations of container technology inside the Linux kernel which prevent me from making containers a more general purpose replacement for VMs. Maybe some of these issues are resolved by now, I am not sure; I don’t run bleeding edge kernels.

  • autofs does not function inside containers. We use autofs for lots of NFS mount points, and not having it operate is very annoying. It was a documented kernel limitation when we deployed containers last year; since we are on the same general kernel version today I don’t believe that has changed, for us anyway.
  • Memory capacity is not correctly reported by the container. If the host has 64GB of memory and the container is limited to 32GB, the general Linux tools inside the container all report 64GB of memory available. Again, annoying, and I imagine this means the container doesn’t handle out-of-memory situations too gracefully, as it has no idea it is about to run out before it hits the wall (see the sketch after this list).
  • Likewise, querying per-container CPU usage using standard Linux tools is impossible. Everything reports the same CPU usage whether it is the host, the active container on the host, or the idle container on the host.
  • Running containers that span multiple subnets simultaneously is extremely difficult and complicated. I have probably a dozen different VLANs on VMware hosts each on different subnets, each with different default gateways etc. The routing exists in the Linux kernel and having more than one default gateway is a real pain. I read last year it seemed to be technically possible, but the solution was not at all a practical one. So in the meantime, a host has to be dedicated to a single subnet.
  • Process listings on the container host are quite confusing, as they list the processes for all of the containers as well; identifying which process is from where is confusing and annoying. Having to configure custom monitors to say “on these hosts having 6 postfix processes is OK but everywhere else 1 is required” is annoying too. I’m sure there are probably LXC-specific tools that can do it, but the point is the standard Linux tools don’t handle this well at all.
  • Lack of the ability to do things like move containers between hosts. Some applications and some environments can be made fully redundant so you can lose a VM/container and be OK, but many others cannot. I literally have several hundred VMs, each of which is a single point of failure, because most are development VMs and it is a waste to build redundancy into every development environment; the resource requirements would explode. So having things like vMotion and VMware High Availability, and even DRS for host affinity rules, is very nice to have.
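To illustrate the memory/CPU reporting gaps, here is a minimal sketch of the kind of host-side workaround involved, assuming the classic cgroup v1 layout where LXC containers live under /sys/fs/cgroup/&lt;controller&gt;/lxc/&lt;name&gt;; the container names are hypothetical and the exact cgroup mount points vary by distribution.

#!/bin/bash
# Hypothetical sketch: free(1) and top(1) inside a container show the host's
# numbers, so per-container memory and CPU have to be read from the cgroup
# files on the host instead.
for c in web-farm-a web-farm-b; do
    mem_used=$(cat /sys/fs/cgroup/memory/lxc/$c/memory.usage_in_bytes)
    mem_limit=$(cat /sys/fs/cgroup/memory/lxc/$c/memory.limit_in_bytes)
    cpu_ns=$(cat /sys/fs/cgroup/cpuacct/lxc/$c/cpuacct.usage)   # cumulative CPU time, nanoseconds
    printf '%s: memory %d/%d MB, cpu %d seconds total\n' "$c" \
        $((mem_used / 1048576)) $((mem_limit / 1048576)) $((cpu_ns / 1000000000))
done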

Any one of the above I would consider a deal breaker for large(r) scale deployments of containers at organizations I have worked for. Combine them all? What a mess.

There are other limitations as well, those are just the most severe I see.

Future uses of containers at my organization

I can see future uses of containers at my organization expanding in the production environment, targeting CPU hungry applications and putting them on physical hardware. Maybe I will even feel brave enough to host multiple applications on the same hardware, knowing that I have no good insight into how much CPU each application is using (since all current monitoring is performed at the OS level, not the application level). Time will tell.

I said earlier that we continue to activate “both farms” even though we use containers. In the case of the container hosted application we do not ever activate both farms anymore, but we do have other production web applications that are farm based and still living in VMware, so for those we do activate both farms in anticipation of (hopefully) or in response to sudden increases in traffic.

Containers inside a hypervisor are a waste of time

In case it isn’t obvious, it is my belief that the main point of using containers is to leverage the underlying hardware of the server platform you are on, removing the overhead and costs associated with the hypervisor where possible. Running containers within a hypervisor is, to me, a misguided effort. Of course I am sure there are people doing this in public clouds because they want to use containers but are limited by what the “cloud” will give them (hence the original pro-Docker article talking about this specific point).

I do not believe that containers themselves have any bearing on deployment of applications in any scenario. They are completely independent things. A container, from a high level (think CxO level) is functionally equivalent to a virtual machine, a concept we have had in the server world for over a decade at this point.

Deep down, technically, they are pretty different, but the concept of segmenting a physical piece of hardware into multiple containers/VMs so that things don’t run over each other is nothing new (and it’s really, really old if you get outside of the x86 world; I believe IBM has been doing this kind of thing for 30+ years on big iron).

Good use cases for containers at hyper scale

At hyperscale (I have never worked at such a scale, but I get the gist of how some things operate), all the math changes. Every decision is magnified 10,000x.

  • Suddenly saving 5 watts of power on a server is a big deal because you have 150,000 servers deployed.
  • Likewise the few percent of CPU and memory overhead provided by hypervisors can literally cost an organization millions of $ at high scale.
  • Let alone licensing costs from the likes of VMware etc., even with volume/enterprise deals.
  • The time required to launch a VM really is slow compared to launching a container; again, at scale that time really adds up.

There was an article I read last year that said Google launches 2,000,000,000 containers per week. Maybe I have launched 4,000 VMs in the past decade, an average of 7.7 VMs per week (and that is aiming really high). So some perspective is in order here (yes, I wanted to write out the 2 billion number that way, it gives nicer perspective). 2 billion per week vs 8 per week: yeah, just slightly different scale here.

At scale you can obviously overcome the limitation of requiring multiple subnets on a server because you have fleets of systems, each fleet probably on various subnets, you’re so big you don’t need to be that consolidated. You probably have a good handle on application-level CPU and memory monitoring(not relying on monitoring of the VM/container as a whole), you probably don’t rely too much on NFS, but instead applications probably use a lot of object storage. You probably never login to the servers so you don’t care what the process list looks like. Your application is probably so fault tolerant that you don’t care about losing a host.

All of these are perfectly valid scenarios to have at a really big scale. But again most organizations will never, ever get to that scale. I’ll say again I believe firmly that trying to build for that level of scale from the outset is a mistake because you will very likely do it wrong, even if you think you know what you are doing.

I’ll use another example here, again taken from one of my comments on el reg recently. I had a job interview back in 2011 at a mid sized company in Seattle; they probably had a few hundred servers, and a half dozen to a dozen or so people in the operations group(s). They had recently hired some random guy (random to me anyway) out of Amazon who proclaimed he was a core part of building the Amazon cloud (yet his own LinkedIn profile said he was just some random engineer there). He talked the talk; I obviously didn’t know him, so it was hard to judge his knowledge based on a 1-2 hour interview with him. Our approaches were polar opposites. I understood his approach (the Amazon way), and I understood my approach (the opposite). Each has value in certain circumstances. It was the only interview I’ve ever had where I was really close to just standing up and walking out. My ears were hot, I could tell I would not get along with this person. I kept my BS going though, because I was looking for a new job.

The next day or the day after they offered me the job(apparently this guy liked me a lot), I declined politely and accepted the position I am at now and relocated to the bay area a couple of months later.

I had friends who knew this company and kept me up to date on what was going on over there. This guy wanted to build an Amazon cloud at this company. An ambitious goal to be sure, I believed firmly they weren’t going to be able to do it, but this guy believed they could. So they went down the procurement route, and it was rough going. At one point their entire network team quit en-masse because they did not agree with what this guy was doing. He was basically trying to find the cheapest hardware money could buy and wanted to make it “cloud”. He was clueless but their management bought into his BS for some time. He wrecked the group, and within a year I want to say I was informed that not only was he fired but he was escorted out of the building. The company paid through the nose to hire a new team because word got around, nobody wanted to work there. Last I heard they were doing well, had long abandoned the work this person had tried to do.

He had an idea, he had some experience, he knew what he wanted to do. He didn’t realize the organization lacked the ability to execute on that vision. I realized this during my one day interview there but he had no idea, or didn’t care (maybe he thought if they just work hard enough they can make it work).

Anyway perhaps an extreme example, but one that remains fresh in my mind.

Conclusion

Simply trying to do something just because Amazon, or Google (hello, hipster Hadoop users from the past decade), or even Microsoft is doing it doesn’t automatically make it a good idea for your organization. You’ve got to have the ability to execute on it, and in many cases execution turns out to be much harder than it appears. (I once had a VP tell me he wanted to use HDFS for VMware storage. Are you kidding me? At the same company the CTO wanted to entertain the idea of using FreeNAS for their high volume data processing: TBs of data per day, hundreds of megabytes of throughput per second, for their mission critical data. The question was so absurd I didn’t know how to respond at the time.)

I re-read what I wrote in the original container hype article many times (as I always re-read many times and make corrections). I realized pretty quickly that the person who wrote the original pro-Docker container article I was quoting really seemed to me like a young developer who lacked experience working on anything other than toy applications. One of the system administrators I know outright said at one point he just stopped reading that (pro-Docker) article because the arguments were just absurd. But those points did seem to be along the lines of what I have been hearing for the past year, so I believed it was a well formed post that I could leverage to respond to.

September 18, 2013

RIP Blackberry – Android is the Windows of the mobile world

Filed under: General,linux,Random Thought — Tags: , , , — Nate @ 4:32 pm

You can certainly count me in the camp of folks that believed RIM/Blackberry had a chance to come back. However, more recently I no longer feel this is possible.

While the news today of Blackberry possibly cutting upwards of 40% of their staff before the end of the year is not the reason I don’t think it is possible, it did give me an excuse to write about something...

The problem stems mainly from the incredibly fast paced maturation (can’t believe I just used that word) of the smart phone industry especially in the past three years. There was an opportunity for the likes of Blackberry, WebOS, and even Windows Phone to participate but they were not in the right place at the right time.

I can speak most accurately about WebOS so I’ll cover a bit of that. WebOS had tons of cool concepts and ideas, but they lacked the resources to put together a fully solid product; it was always a work in progress (fix coming next version). I felt, even before HP bought them (and the feeling never went away, even in the days of HP’s big product announcements), that every day that went by WebOS fell further and further behind (obviously some of WebOS’ key technologies took years for the competition to copy, but go outside that narrow niche of cool stuff and it’s pretty deserted). As much as I wanted to believe they had a chance in hell of catching up again (throw enough money at anything and you can do it), there just wasn’t (and isn’t) anyone willing to commit to that level. It makes sense too; really the last major player left willing to commit to that level is Microsoft, whose business is software and operating systems.

Though even before WebOS was released, Palm was obviously a mess as they went through their various spin offs, splitting the company divisions up, licensing things around, etc. They floundered without a workable (new) operating system for many years. I did not become a customer of Palm myself until I purchased a Pre back in 2009, so don’t look at me as some Palm die hard, because I was not. I did own a few Handspring Visors a long time ago, and the PalmOS compatibility layer that was available as an app on the Pre is what drove me to the Pre to begin with.

So, on to a bit of RIM. I briefly used a Blackberry back in 2006-2008. I forget the model; it was a strange sort of color device, I want to say monochrome-like color (I think this was it). It was great for email. I used it for a bit of basic web browsing but that was it; I never used it as a phone. I don’t have personal experience supporting BIS/BES or whatever it’s called, but I have read/heard almost universal hatred for those systems over the years. RIM obviously sat on their hands too long and the market got away from them. They tried to come up with something great with QNX and BB10, but the market has spoken: it’s not great enough to stem the tide of switchers, or to bring (enough) customers back to make a difference.

Windows Phone... or is it Windows Mobile... Pocket PC, anyone? Microsoft has been in the mobile game for a really long time obviously (it annoys me that press reporters often don’t realize exactly how long Microsoft has been doing mobile, and tablets for that matter; not that they were good products, but they have been in the market). They kept re-inventing themselves and breaking backwards compatibility every time. Even after all that effort, what do they have to show for themselves? ~3.5% global market share? Isn’t that about what the Apple Mac has? (Maybe the Mac is a bit higher.)

The mobile problem is compounded further though. At least with PCs there are (and have been for a long time) standards. Things were open & compatible. You can take a computer from HP or from Dell or from some local whitebox company and they’ll all be able to run pretty much the same stuff, and even have a lot of similar components.

Mobile is different though: ARM SoCs, while having a common ancestor in the ARM instruction set, really seem to be different enough that compatibility is a real issue between platforms. Add on top of that the disaster of the lack of a stable Linux driver ABI, which complicates things for developers even more (this is in large part why, I believe I read, FirefoxOS and/or Ubuntu phone run on top of Android’s kernel/drivers).

All of that just means the barrier to entry is really high even at the most basic level of a handset. This obviously wasn’t the case with the standardized form factor components(and software) of the PC era.

So with regards to the maturation of the market, the signs are clear now: with Apple and Samsung having absolutely dominated the revenues and profits in the mobile handset space for years, both players have shown for probably the past year to 18 months that growth is really levelling out.

With no other players showing even the slightest hint of competition against these behemoths, that levelling of growth tells me, sadly enough, that the opportunity for the most part is gone now. The market is becoming a commodity certainly faster than I thought would happen, and I think many others feel the same way.

I don’t believe Blackberry, or Nokia for that matter, would have been very successful as Android OEMs. Certainly not at the scale that they were at; perhaps with drastically reduced workforces they could have gotten by with a very small market share, but they would have been a shadow of their former selves regardless. Both companies made big bets going it alone and I admire them for trying, though neither worked out in the end.

Samsung may even go out as well, with the likes of Xiaomi (never heard of them till last week), or perhaps Huawei or Lenovo, coming in and butchering margins to below where anyone can make money on the hardware front.

What really prompted this line of thinking, though, was re-watching the movie Pirates of Silicon Valley a couple of weeks ago, following the release of that movie about Steve Jobs. I watched Pirates a long time ago but hadn’t seen it since; this quote from the end of the movie really sticks with me when it comes to the whole mobile space:

Jobs, fresh from the launch of the Macintosh, is pitching a fit after realizing that Microsoft’s new Windows software utilizes his stolen interface and ideas. As Gates retreats from Jobs’ tantrum, Jobs screeches, “We have better stuff!”

Gates, turning, simply responds, “You don’t get it. That doesn’t matter.”

(the whole concept really gives me the chills to think about, really)

Android is the Windows of the mobile generation (just look at the rash of security-related news events reported about Android..). Ironically enough the more successful Android is the more licensing revenue Microsoft gets from it.

I suppose in part I should feel happy being that it is based on top of Linux – but for some reason I am not.

I suppose I should feel happy that Microsoft is stuck at 3-4% market share despite all of the efforts of the world’s largest software company. But for some reason I am not.

I don’t know if it’s because of Google and their data gathering stuff, or if it’s because I didn’t want to see any one platform dominate as much as Android (and previously iOS) has.

I suppose there is a glimmer of hope in the incorporation of the Cyanogen folks to become a more formalized alternative to the Android that comes out of Google.

All that said I do plan to buy a Samsung Galaxy Note 3 soon as mentioned before. I’ve severed the attachment I had to WebOS and am ready to move on.

August 17, 2013

Happy Birthday Debian: 20 years old

Filed under: linux — Tags: — Nate @ 4:10 pm
Debian Powered

Techopsguys is Debian Powered

The big 2-0. Debian was the 2nd Linux I cut my teeth on, the first being Slackware 3.x. I switched to Debian 2.0 (hamm) in 1998 when it first came out. This was before apt existed (I think that arrived around Debian 2.1 or 2.2, but I’m not sure). I still remember the torture that was dselect, and much to my own horror dselect apparently still lives, though I had to apt-get install it. It was torture because I literally spent 4-6 hours going through the packages, selecting them one at a time. There may have been an easier way to do it back then, I’m not sure; I was still new to the system.

I have been with Debian ever since; hard to believe it’s been about 15 years since I first installed it. I have, with only one exception, stuck to stable the entire time. The exception, I think, was in between 2.2 and 3.0; that delay was quite large, so I spent some time on the testing distribution. Unlike my early days running Linux I no longer care about the bleeding edge, perhaps because the bleeding edge isn’t as important as it once was (to get basic functionality out of the system, for example).

Debian has never failed me during a software update, or even major software upgrade. Some of the upgrades were painful (not Debian’s fault – for example going from Cyrus IMAP 1.x to 2.x was really painful). I do not have any systems that have lasted long enough to traverse more than one or two major system upgrades, hardware always gets retired. But unlike some other distributions major upgrades were fully supported and worked quite well.

I intentionally avoided Red Hat in my early days specifically because it was deemed easier to use. I started with Slackware, and then Debian. I spent hours compiling things, whether it was X11, KDE 0.x, QT, GTK, Gnome, or GIMP. I built my own kernels from source, even with some custom patches (I haven’t seriously done this since Linux 2.2). I learned a lot, I guess you could say the hard way. Which is in part why I struggle with advising people who want to learn Linux on the best way to do it (books, training, etc.). I don’t know, since I did it another way, a way that takes many years. Most people don’t have that kind of patience. At the time of course I really didn’t realize those skills would become so valuable later in life; it was more of a personal challenge for myself I suppose.

I have used a few variants/forks of Debian over the years, most recently of course being Ubuntu. I have used Ubuntu exclusively on my laptops going back several years (perhaps even to 2006, I don’t remember). I have supported Ubuntu in server environments for roughly the past three years. I mainly chose Ubuntu for the laptops and desktops for the obvious reason: hardware compatibility. Debian (stable) of course tends to lag behind on hardware support. Though these days I’m still happy running the Ubuntu 10.04 LTS desktop... which is EOL now. I haven’t decided what my next move is, and I’m not really thinking about it since what I have still works fine. I’ll probably think about it more whenever I get my next hardware refresh.

I also briefly used Corel Linux, from which I still have the inflatable Corel penguin sitting on my desk at work. It has followed me to every job for the past 13 years and still keeps its air; I don’t know why I have kept it for so long. Corel Linux was interesting in that they ported some of their own Windows apps over to Linux with Wine, their office suite and some graphics programs. They made a custom KDE file manager (with built-in CIFS/SMB support, if I recall right). Other than that it wasn’t much to write home about. Like most things on Linux the desktop apps were very fragile, and being closed source they did not last long after Corel Linux folded (compatibility wise you could not run them on other systems). My early Debian systems that I used as desktops got butchered by me installing custom stuff on top of them. Linux works best when you stick with the OS packages, and that’s something I did not do in the early days. These days I go to semi-extreme lengths to make sure everything (within my abilities) is packaged as a Debian package before installation (a rough example of what that looks like is below).
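For anyone curious what “packaging everything” looks like at its simplest, here is a hypothetical, bare-bones example using dpkg-deb; the tool name, paths and maintainer are made up, and a proper package would of course use debhelper and real metadata.

# Hypothetical example: wrap a locally built tool in a minimal .deb so dpkg
# tracks its files and installs/removals stay clean.
mkdir -p mytool_1.0-1/DEBIAN mytool_1.0-1/usr/local/bin
cp mytool mytool_1.0-1/usr/local/bin/

cat > mytool_1.0-1/DEBIAN/control <<'EOF'
Package: mytool
Version: 1.0-1
Architecture: amd64
Maintainer: Nobody <nobody@example.com>
Description: locally built tool, packaged so it stays under dpkg's control
EOF

dpkg-deb --build mytool_1.0-1     # produces mytool_1.0-1.deb
sudo dpkg -i mytool_1.0-1.deb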

I used to participate a lot in the debian-user mailing list eons ago, though I haven’t since due to lack of time. At the time, at least, that list had massive volume; it was just insane the amount of email I got from it. Looking now: August 2013 had roughly 1,300 messages, vs August 2001 with almost 6,000! Even worse was the spam I got long after I unsubscribed; it persisted for years until I terminated the email address associated with that list. I credit one job offer a bit over ten years ago to my participation on that (and other) mailing lists at the time, as I specifically called them out in my references.

That being said, despite my devotion to Debian on my home systems (servers at least, this blog runs on Debian 7), I still prefer Red Hat for commercial/larger scale stuff. Even though the past three years supporting Ubuntu have been OK, I still like RH more. At the same time I do not like RH for my own personal use. It basically comes down to how the system is managed. I was going to go into reasons why I like RH more for this or that, but decided not to since it is off topic for this post.

I’ve never seen Toy Story, the movie whose characters Debian has used to name its releases after since at least 2.0, perhaps longer. It’s not really my kind of flick, and I have no intention of ever seeing it really.

Here’s a really old screen shot from my system back in the day. I don’t remember if this is Slackware or Debian; the kernel being compiled, 2.1.121, came out in September 1998, so right about the time I made the switch. Looks like I am compiling GIMP 1.0.1 and some version of XFree86, and downloading a KDE snapshot (I think all of that was pre-1.0 KDE). And look, xfishtank in the background! I miss that. These days Gnome and KDE take over the root window, making things like xfishtank not visible when using them (last I tried at least). xpenguins is another cool one that does still work with GNOME.

REALLY Old Screenshot

So, happy 20th birthday Debian. It has been interesting to watch you grow up, and it’s nice to see you’re still going strong.

October 23, 2012

Should System admins know how to code?

Filed under: linux — Tags: — Nate @ 11:57 am

I just read the source article, and the discussion on Slashdot was far more interesting.

It’s been somewhat of a delicate topic for myself, having been a system admin of sorts for about sixteen years now, primarily on the Linux platform.

For me, more than anything else, you have to define what code is. Long ago I drew a line in the sand: I have no interest in being a software developer. I do plenty of scripting in Perl and Bash, primarily for monitoring purposes and to aid in some of the more basic areas of running systems.

Since this blog covers 3PAR I suppose I should start there. I’ve written scripts to do snapshots and integrate them with MySQL (still in use today) and Oracle (haven’t used this side of things since 2008). This is a couple thousand lines of script (I don’t like to use the word code because to me it implies some sort of formal application). I’d wager 99% of that is to support the Linux end of things and 1% to support 3PAR. One company I was at I left, and turned these scripts over to the people who were going to try to take on my responsibility. These folks had minimal scripting experience and their eyes glazed over pretty quickly while I walked them through the process. They feared the 1,000 line script, even though for the most part the system was very reliable and not difficult to recover from failures, even if you had no scripting experience. In this case, to manage snapshots with MySQL (integrated with a storage platform), I’m not aware of any out-of-the-box tool that can handle it. So you sort of have no choice but to glue your own together. With Oracle and MSSQL such tools are common, maybe even DB2, but MySQL is left out in the cold.
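To give a flavor of the general idea (a minimal sketch, not the actual multi-thousand-line script): hold a MySQL read lock in one session, take an array-level snapshot, then release the lock. The array hostname, volume and snapshot names, credentials, and the use of the 3PAR createsv command over ssh are all assumptions for illustration; the real thing also has to handle errors, verification and retention.

#!/bin/bash
# Hypothetical sketch: quiesce MySQL, snapshot the underlying 3PAR volume,
# then release the lock. Names, hosts and the InForm CLI invocation are made up.
PIPE=$(mktemp -u)
mkfifo "$PIPE"
mysql -u root < "$PIPE" &        # long-lived session that will hold the global read lock
exec 3> "$PIPE"

echo "FLUSH TABLES WITH READ LOCK;" >&3
sleep 5                          # crude: the real script verifies the lock is held
sync                             # flush filesystem buffers before the snapshot

ssh 3parcli@array01 createsv -ro "mysql-snap-$(date +%Y%m%d)" mysql-data-vv

echo "UNLOCK TABLES;" >&3        # release the lock; closing the pipe ends the session
exec 3>&-
wait
rm -f "$PIPE"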

I wrote my own Perl-based tool to log in to 3PAR arrays, get their metrics and populate RRD files (I use Cacti to present that data since it has a nice UI, but Cacti could not collect the data the way I can, so that part runs outside of Cacti). Another thousand lines of script here.
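The rough shape of that kind of collector is below. This is a hypothetical sketch only: the array hostname, RRD layout, the statcpu invocation and especially the parsing are stand-ins, and the real tool is in Perl and collects far more than one metric.

#!/bin/bash
# Hypothetical sketch: poll one metric from a 3PAR array over ssh and feed it
# into an RRD file that a grapher (Cacti, in this case) can render.
ARRAY=3par-array01              # hypothetical array hostname
RRD=/var/lib/rrd/3par_cpu.rrd   # hypothetical RRD path

# one-time creation: 5 minute step, roughly 30 days of 5-minute averages
[ -f "$RRD" ] || rrdtool create "$RRD" --step 300 \
    DS:cpu_total:GAUGE:600:0:100 \
    RRA:AVERAGE:0.5:1:8640

# grab a single CPU sample from the array; the parsing here is a placeholder
VALUE=$(ssh monitor@"$ARRAY" statcpu -iter 1 | awk '/total/ {print $2; exit}')

rrdtool update "$RRD" "N:${VALUE}"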

Perhaps one of the coolest things I think I wrote was a file distribution system a few years ago, to replace a product we used in house called R1 Repliweb (though it looks like they got acquired by somebody else). Repliweb is a fancy file distribution system that primarily ran on Windows, but the company I was at was using the Linux agents to pass files around. I suppose I could write a full ~1200 word post about that project alone (if you’re interested in hearing that let me know), but basically I replaced it with an architecture of load balancers, VMs, a custom version of SSH, and rsync, with some help from CFengine and about 200 lines of script, which not only dramatically improved scalability but also took reliability to literally 100%. I never had a single failure (the system was self healing, though I did have to turn off rsync’s auto resume feature because it didn’t work for this project) while I was there (the system had been in place about 12-16 months when I left).
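The heart of something like that can be surprisingly small. Below is a hypothetical sketch of the core fan-out loop only: hostnames, the key path and directories are made up, and it leaves out everything the real system layered on top (the load balancer front end, the restricted SSH setup and the CFengine glue).

#!/bin/bash
# Hypothetical sketch: push a release directory to each web server over
# ssh+rsync, reporting any host that fails.
SRC=/data/releases/current/
DEST=/data/releases/current/
KEY=/etc/filedist/id_rsa
HOSTS="web01 web02 web03 web04"

for host in $HOSTS; do
    rsync -a --delete --timeout=300 \
        -e "ssh -i $KEY -o BatchMode=yes" \
        "$SRC" "filedist@${host}:${DEST}" \
        || echo "distribution to $host failed" >&2
done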

So back to the point: to code or not to code. I say not to code, for the most part at least (again, back to what code means: in my context it means programming; if you're directly using APIs then you're programming, and if you're using tools to talk to APIs then you're scripting). Don’t make things too complicated. I’ve worked with a lot of system admins over the years and the number that can script well, or code, is very small. I don’t see that number increasing. Network engineers are even worse; I’ve never seen a network engineer do anything other than work completely manually. I think storage is similar.

If you start coding your infrastructure you start making it even more difficult to bring new people on board, to maintain the stuff, and to run it moving forward. If you happen to be in an environment that is experiencing explosive growth and you're adding dozens or hundreds of servers constantly, then yes, this can make a lot of sense. But most companies aren’t like that and never will be.

It’s hard enough to hire people these days; if you go about raising the bar to even higher levels you're never going to find anyone. Look at the Hadoop end of the market: those folks are always struggling to hire because the skill is so specialized, and there are so few people out there who can do it. Most companies can’t compete with the likes of Microsoft, Yahoo and other big orgs with their compensation and benefits packages.

You will, no doubt, spend more on things like software and hardware for things that some fancy DevOps god could do in 10 lines of Ruby in their sleep. Good luck finding and retaining such a person though, and if you feel you need redundancy so someone can take a real vacation, yeah, that’s gonna be tough. There is a lot more risk, in my opinion, in having a lot of code running things if you lack the resources to properly maintain it. This is a problem even at scale. I’ve heard on several occasions that the mighty Amazon themselves customized CFengine v1 way back when with so much extra stuff that when v2 (and since then v3) came around with all sorts of new things, Amazon couldn’t upgrade because they had customized it too much. I’ve heard similar things about other technologies Amazon has adopted. They are stuck because they customized it too much and can’t upgrade.

I’ve talked to a ton of system admin candidates over the past year, and the number that I would feel comfortable having take over the “code” on our end is, I think it is fair to say, zero. Granted, not even I can handle the excellent code written by my co-worker. I like to tell people I can do simple stuff in 10 minutes on CFengine, while it will take me four hours to do things the Chef way on Chef, and my eyes will bleed and my blood will boil in the process.

The method I’d use on CFengine you could say “sucks” compared to Chef, but it works, and it is far easier to manage. I can bring almost anyone up to speed on the system in a matter of hours, whereas Chef takes a strong Ruby background to use (I am going on nearly two and a half years with Chef myself and I haven’t made much progress, other than feeling I can speak with authority on how complex it is).

Sure, it can be nice to have APIs for everything and fancy automation everywhere, but you need to pick your battles. When you're dealing with a cloud organization like Amazon you almost have to code, to deal with all of their faults and failures and just overall stupid broken designs and everything that goes along with it. Learning to code most likely takes the experience from absolutely infuriating (where I stand) to almost manageable (costs and architecture aside here).

When you're dealing with your own stuff, where you don’t have to worry about IPs changing at random because some host has died, or where you can change your CPU or memory configuration with a few mouse clicks and not have to re-build your system from scratch, the amount of code you need shrinks dramatically, lowering the barriers to entry.

After having worked in the Amazon cloud for more than two years both myself and my co-workers(who have much more experience in it than me) believe that it actually takes more effort and expertise to properly operate something in there vs doing it on your own. It’s the total opposite of how cloud is viewed by management.

Obviously it is easier said than done; just look at the sheer number of companies that go down every time Amazon has an outage or their service is degraded. The most recent one was yesterday. It’s easy for some to blame the customer for not doing the right thing, but at the end of the day most companies would rather work on the next feature to attract customers and let something else handle fault tolerance. Only the most massive companies have the resources to devote to true “web scale” operation. Shoehorning such concepts onto small and medium businesses is just stupid, and the wrong set of priorities.

Someone made a comment recently that made me laugh (not at them, but more at the situation). They said they performed some task to make my life easier in the event we need to rebuild a server (a common occurrence in EC2). I couldn’t help but laugh because we hadn’t rebuilt a single server since we left EC2 (coming up on one year in a few months here).

I think it’s great that equipment manufacturers are making their devices more open, more programmatic. Adding APIs, and other things to make automation easier. I think it’s primarily great because then someone else can come up with the glue that can tie it all together.

I don’t believe system admins should have to interact with such interfaces directly.

At the same time I don’t expect developers to understand operations in depth. Hopefully they have enough experience to be able to handle basic concepts like load balancing (e.g. store session data in some central place, preferably not a traditional SQL database). The whole world often changes going from running an application in a development environment to running it in production. The developers take their experience to write the best code that they can, and the systems folks manage the infrastructure (whether it is cloud based or home grown) and operate it in the best way possible, whether that means separating out configuration files so people can’t easily see passwords, inserting load balancers in between tiers, splitting out how application code is deployed, or something as simple as log rotation scripts.

If you were to look at my scripts you might laugh (depending on your skill level). I try to keep them clean but they are certainly not up to programmer standards; no, I’ve never used “use strict” in Perl, for example. My scripting is simple, so doing things sometimes takes me many more lines than it would take someone more experienced in the trade. This has its benefits though: it makes it easier for more people to be able to follow the logic should they need to, and it still gets the job done.

The original article seemed to focus more on scripts, while the discussion on Slashdot at some points really got into programming, with one person saying they wrote Apache modules?!

As one person in the discussion thread on Slashdot pointed out, heavy automation can hurt just as much as help. One mistake in the wrong place and you can take systems down far faster than you can recover them. This has happened to me on more than one occasion, of course. One time in particular I was looking at a CFengine configuration file, saw some logic that appeared to be obsolete, and removed a single character (a ! which told CFengine not to apply that configuration to that class); CFengine then went and wiped out my apache configurations. When I made the change I was very sure that what I was doing was right, but in the end it wasn’t. That happened seven years ago but I still remember it like it was yesterday.

System administrators should not have to program. Scripting certainly is handy and I believe it is important (not critical; it’s not at the top of my list of skills when hiring), just keep an eye out for complexity and supportability when you're doing that stuff.

October 15, 2012

Ubuntu 10.04 LTS upgrade bug causes issues

Filed under: linux — Tags: , — Nate @ 11:41 am

[UPDATE] – after further testing it seems it is machine specific; I guess my RAM is going bad. Dag nabbit.

 

I’ve been using Ubuntu for about five years, and Debian I have been using since 1998.

This is a first for me. I came into the office and Ubuntu was prompting to upgrade some packages; I run 10.04 LTS, which is the stable build. I said go for it, and it tried, and failed.

I tried again, and failed, and again and failed.

I went to the CLI and it failed there too – dpkg/apt was seg faulting –

[1639992.836460] dpkg[31986]: segfault at 500006865496 ip 000000000040b7bf sp 00007fff71efdee0 error 4 in dpkg[400000+65000]
[1640092.698567] dpkg[32069] general protection ip:40b7bf sp:7fff73b2f750 error:0 in dpkg[400000+65000]
[1640115.056520] dpkg[32168]: segfault at 500008599cb2 ip 000000000040b7bf sp 00007fff20fc2da0 error 4 in dpkg[400000+65000]
[1640129.103487] dpkg[32191] general protection ip:40b7bf sp:7fffd940d700 error:0 in dpkg[400000+65000]
[1640172.356934] dpkg[32230] general protection ip:40b7bf sp:7fffbb361e80 error:0 in dpkg[400000+65000]
[1640466.594296] dpkg-preconfigu[32356]: segfault at d012 ip 00000000080693e4 sp 00000000ff9d1930 error 4 in perl[8048000+12c000]
[1640474.724925] apt-get[32374] general protection ip:406a67 sp:7fffea1e6c68 error:0 in apt-get[400000+1d000]
[1640920.178714] frontend[720]: segfault at 4110 ip 00000000080c50b0 sp 00000000ffa52ab0 error 4 in perl[8048000+12c000]

I have a 32-bit chroot to run things like 32-bit Firefox, and I had the same problem there. For a moment I thought maybe I had bad RAM or something, but it turns out that was not the case. There is some sort of bug in the latest apt, 0.7.25.3ubuntu9.14 (I did not see a report on it, though the UI for Ubuntu bugs seems more complicated than Debian’s bug system), which causes this. I was able to get around it with the steps below (also shown as commands after the list):

  • Manually downloading the older apt package (0.7.25.3ubuntu9.13)
  • Installing the package via dpkg (dpkg -i <package>)
  • Exporting the list of packages (dpkg --get-selections > selections)
  • Editing the list, changing apt from install to hold
  • Importing the list of packages (dpkg --set-selections < selections)
  • apt-get works fine now on 64-bit
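For convenience, here is the same workaround expressed as shell commands. The .deb filename is a placeholder for whatever the Ubuntu archive provides for 0.7.25.3ubuntu9.13, and the sed line is just one way to do the edit described above.

# after manually downloading the older apt .deb (filename is a placeholder)
sudo dpkg -i apt_0.7.25.3ubuntu9.13_amd64.deb

# pin apt at the old version so it is not immediately upgraded again
dpkg --get-selections > selections
sed -i 's/^apt[[:space:]]\+install$/apt\t\t\thold/' selections   # or edit the file by hand
sudo dpkg --set-selections < selections

sudo apt-get update    # apt-get works fine again on 64-bit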

However, my 32-bit chroot has a hosed package status file, something else that has never happened to me in the past 14 years on Debian. So I will have to figure out how to correct that, or worst case, I suppose, wipe out the chroot and reinstall it; since it is a chroot, it’s not a huge deal. Fortunately the corruption didn’t hit the 64-bit status file. There is a backed up status file but it was corrupt too (I think because I tried to run apt-get twice).

64-bit status file:

/var/lib/dpkg/status: UTF-8 Unicode English text, with very long lines

32-bit status file:

/var/lib/dpkg/status: data

I’m pretty surprised that this bug (or bugs) got through. Not the quality I’ve come to know and love...

September 2, 2012

Some reasons why Linux on the desktop failed

Filed under: linux — Tags: — Nate @ 10:40 pm

The most recent incarnation of this debate seemed to start with a somewhat interesting article over at Wired, which talked to Miguel de Icaza, a pretty famous Linux desktop developer, mostly famous for taking what seemed to be controversial stances on implementing Microsoft .NET on Linux in the form of Mono.

And he thinks the real reason Linux lost is that developers started defecting to OS X because the developers behind the toolkits used to build graphical Linux applications didn’t do a good enough job ensuring backward compatibility between different versions of their APIs. “For many years, we broke people’s code,” he says. “OS X did a much better job of ensuring backward compatibility.”

It has since blown up a bit more with lots more people giving their two cents. As a Linux desktop user (who is not a developer) for the past roughly 14 years I think I can speak with some authority based on my own experience. As I think back, I really can’t think of anyone I know personally who has run Linux on the desktop for as long as I have, or more to the point hasn’t tried it and given up on it after not much time had passed – for the most part I can understand why.

For the longest time Linux advocates hoped (myself included) that Linux could establish a foothold as something that was good enough for basic computing tasks, whether it’s web browsing, checking email, basic document writing, etc. There are a lot of tools and toys on Linux desktops, but most seem to have less function than form(?), at least compared to their commercial counterparts. The iPad took this market opportunity away from Linux, though even without the iPad there were no signs that Linux was on the verge of being able to capitalize on that market.

Miguel’s main argument seems to be around backwards compatibility, an argument I raised somewhat recently. Backwards compatibility has really been the bane of Linux on the desktop, and for me at least it has had just as much to do with the kernel and other user space stuff as with the various desktop environments.

Linux on the desktop can work fine if:

  • Your hardware is well supported by your distribution – if it isn’t, this will stop you before you get very far at all
  • You can live within the confines of the distribution – if you have any needs that aren’t provided as part of the stock system you are probably in for a world of hurt.

Distributions like Ubuntu, and SuSE before it (honestly I’m not sure what, if anything, has replaced Ubuntu today), have made tremendous strides in improving Linux usability from a desktop perspective. Live CDs have helped a lot too; being able to give the system a test run without ever installing it to your HD is nice.

I suspect most people today don’t remember the days when the installer was entirely text based and you had to fight with XFree86 to figure out the right mode lines for your monitor to get X11 to work. Fortunately I don’t think anyone really uses dial up modems anymore, so the problems we had back when modems went almost entirely to software in the form of winmodems are no longer an issue. For a while I forked out the cash for Accelerated X, a commercial X11 server that had nice tools and was easy to configure.

The creation of the Common Unix Printing System, or CUPS, was also a great innovation. Printing on Linux before that was honestly almost futile with basic printers; I can’t imagine what it would have been like with more complex printers.

Start at the beginning though: the kernel. The kernel has never really maintained a stable binary interface for drivers over the years, to the point where I cannot take a generic driver for, say, a 2.6.x series kernel and use that same (singular) driver on Ubuntu, Red Hat, Gentoo or whatever. I mean, you don’t have to look further than how many binary kernel drivers VMware includes with their VMware Tools package to see how bad this is. In the version of VMware Tools I have on the server that runs this blog there are 197 (yes, 197) different kernels supported in there:

  • 47 for Ubuntu
  • 55 For Red Hat Enterprise
  • 57 for SuSE Linux Enterprise
  • 39 for various other kernels

In an ideal world I would expect maybe 10 kernels for everything, including kernels that are 64 vs 32 bit.

If none of those kernels work, then yes, VMware does include the source for the drivers and you can build them yourself (provided you have the right development packages installed, the process is very easy and fast). But watch out: the next time you upgrade your kernel you may have to repeat the process.

I’ve read in the most recent Slashdot discussion that the likes of Alan Cox (haven’t heard his name in years!) said the Linux kernel does have a stable interface, as he can run the same code from 1992 on his current system. My response to that is: then why do we have all these issues with drivers?

One of the things that has improved the state of Linux drivers is virtualization. It slashes the amount of driver code needed by probably 99%, since you run the same virtual hardware regardless of the underlying physical hardware. It’s really been nice not having to fight hardware compatibility recently as a result of this.

There have been times when device makers have released driver disks for Linux, usually for Red Hat based systems; however these often become obsolete fairly quickly. For some things, perhaps like video drivers, it’s not the end of the world: the experienced user at least still has the ability to install a system, get online and get new drivers.

But if the driver that’s missing is for the storage controller, or perhaps the network card things get more painful.

I’m not trying to complain, I have dealt with these issues for many years and it hasn’t driven me away — but I can totally see how it would drive others away very quickly, and it’s too bad that the folks making the software haven’t put more of an effort into solving this problem.

The answer is usually “make it open source”, in the case of drivers at least. If the piece of software is widely used then making it open source may be a solution, but I’ve seen time and time again source get released and just rot on the vine because nobody has an interest in messing with it (can’t blame them if they don’t need it). If the interface were really stable, a driver could probably go unmaintained for several years without needing anyone to look at it (at least through the life of, say, the 2.6.x kernel).

When it comes to drivers and stuff – for the most part they won’t be released as open source, so don’t get your hopes up. I saw one person say that their company didn’t want to release open source drivers because they feared that they might be in violation of someone else’s patents and releasing the source would make it easier for their competition to be able to determine this.

The kernel driver situation is so bad, in my opinion, that distributions for the most part don't back port drivers into their previous releases. Take Ubuntu for example: I run 10.04 LTS on the laptop I am using now as well as on my desktop at work. I can totally understand if the original released version doesn't have the latest e1000e driver (which is totally open source!) for my network card at work. But I do not understand why, more than a year after its release, it still doesn't have this driver. Instead you either have to manage the driver yourself (which I do – nothing new for me), or run a newer version of the distribution (all that for one simple network driver?!). This version of the distribution is supported until April 2013. Please note I am not complaining, I deal with the situation – I'm just stating a fact. This isn't limited to Ubuntu either; it has applied to just about every Linux distribution I've ever used. I saw Ubuntu recently update Skype to the latest Linux version on 10.04 LTS, but they still haven't budged on that driver (no, I haven't filed a bug/support request, I don't care enough to do it – I'm simply illustrating a problem that is caused by the lack of a good driver interface in the kernel, and I'm sure this applies to FAR more than just my little e1000e).
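
For what it's worth, the least painful way I know of to "manage the driver yourself" and have it survive kernel updates is DKMS, which rebuilds the module for each new kernel automatically. A sketch, assuming Intel's e1000e source has been unpacked to /usr/src/e1000e-1.9.5 with a dkms.conf in it (the version number and the dkms.conf are assumptions on my part – the upstream tarball may not ship one):

sudo apt-get install dkms build-essential linux-headers-$(uname -r)

# register the source tree with DKMS, then build and install it for the running kernel
sudo dkms add -m e1000e -v 1.9.5
sudo dkms build -m e1000e -v 1.9.5
sudo dkms install -m e1000e -v 1.9.5

# DKMS hooks into the kernel package scripts, so new kernels get the module rebuilt for them
dkms status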

People rail on the manufacturers for not releasing source, or not releasing specs. That apparently was pretty common back in the 70s and early 80s; it hasn't been common in my experience since I have been using computers (going back to about 1990). As more and more things move from hardware into software I'm not surprised that companies want to protect that by not releasing source/specs. Many manufacturers have shown they want to support Linux, but if you force them to build a hundred different kernel modules for the various systems they aren't going to put in the effort. The barrier to entry needs to be lowered to get more support.

I can understand where the developers are coming from though: they don't have much incentive to make the interfaces backwards and forwards compatible, since that involves quite a bit more work (much of it boring), so they prefer to just break things as the software evolves. I had been hoping that as the systems matured this would become less commonplace, but it seems that hasn't been the case.

So I don’t blame the developers…

But I also don’t blame people for not using Linux on the desktop.

Linux would have come quite a bit further if there were a common way to install drivers for everything from network cards to storage controllers, printers, video cards, whatever, and have those drivers keep working across kernel versions, even across minor distribution upgrades. That has never been the case though (I also don't see anything on the horizon; I don't see this changing in the next 5 years, if it changes ever).

The other issue with Linux is working within the confines of the distribution. This is similar to the kernel driver problem – different distros are almost always more than just a different collection of the same software; the underlying libraries are often incompatible between distributions, so a binary built on one, especially a complex one such as a KDE or Gnome application, won't work on another. There are exceptions like Firefox, Chrome etc – though other than perhaps static linking in some cases I'm not sure what they do that other folks can't do. So the amount of work to support Linux from a desktop perspective is really high. I've never minded static linking; to me it's a small price to pay to improve compatibility over the current situation. Sure, you may end up loading multiple copies of the libraries into memory (maybe you haven't heard, but it's not uncommon to get 4-8GB in a computer these days), and sure, if there is a security update you have to update the applications that bundle those older libraries as well. It sucks I suppose, but from my perspective it sucks a lot less than what we have now. Servers are an entirely different beast, run by (hopefully) experienced people who can handle this situation better.
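
A quick way to see why a dynamically linked desktop binary doesn't travel well between distributions is to look at what it expects to find at runtime. A small sketch (the binary paths here are just examples):

# list the shared libraries a binary was linked against and where the loader resolves them
ldd /usr/bin/gedit | head

# a statically linked binary, by contrast, drags everything along with it
ldd /path/to/some-static-binary
# prints: not a dynamic executable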

BSD folks like to tout their compatibility – though I don't think that is a fair comparison. Comparing two different versions of FreeBSD against Red Hat vs Debian is not fair; comparing two different versions of Red Hat against each other alongside two different versions of FreeBSD (or NetBSD or OpenBSD or DragonFly BSD... etc.) is more fair. I haven't tried BSD on the desktop since FreeBSD 4.x; for various reasons it did not give me any reason to continue using it as a desktop and I haven't had a reason to consider it since.

I do like Linux on my desktop. I ran early versions of KDE (pre 1.0) up until around KDE 2, then switched to AfterStep for a while, eventually switching over to Gnome with Ubuntu 7 or 8, I forget which. With the addition of an application called Brightside, GNOME 2.x works really well for me. Though for whatever reason I have to launch Brightside manually each time I log in; setting it to run automatically on login results in it not working.

I also do like Linux on my servers. I haven't compiled a kernel from scratch since the 2.2 days, but I have been quite comfortable working through the issues of operating Linux on the server end. The biggest headaches were always drivers with new hardware, though thankfully with virtualization things are much better now.

The most recent issue I've had with Linux on servers has been some combination of Ubuntu 10.04 with LVM and ext4 on enterprise storage. Under heavy I/O I have seen ext4 come to a grinding halt many times. I have read that Red Hat explicitly recommends disabling barriers with ext4 on enterprise storage, though that hasn't helped me. My only working solution has been to switch back to ext3 (which for me is not an issue). The symptoms are very high system CPU usage, little to no I/O (really, any attempt to do I/O freezes up), and when I turn on kernel debugging the system is flooded with ext4 messages. Nothing short of a complete power cycle can recover the system in that state. Fortunately all of my root volumes are ext3, so it doesn't prevent someone from logging in and poking around. I've looked high and low and have not found any answers. I had never seen this issue on ext3, and the past 9 months has been the first time I have run ext4 on enterprise storage. Maybe it's a bug specific to Ubuntu, I am not sure. LVM is vital when maximizing utilization using thin provisioning in my experience, so I'm not about to stop using LVM – as much as 3PAR's marketing material may say you can get rid of your volume managers, don't.
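
For anyone who wants to try the barrier route before giving up on ext4, the mount option form looks like this – a sketch, assuming the array really does have a protected write cache (the device and mount point are made up):

# /etc/fstab – disable write barriers on an ext4 filesystem backed by protected cache
/dev/vgdata/lv_mysql  /var/lib/mysql  ext4  defaults,barrier=0  0  2

# or flip it on an already-mounted filesystem without a reboot
mount -o remount,barrier=0 /var/lib/mysql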

 

June 30, 2012

Synchronized Reboot of the Internet

Filed under: linux — Tags: — Nate @ 7:37 pm

[UPDATE – I've been noticing some people claim that kernels newer than 2.6.29 are not affected. Well, I've got news for you: I have 200+ VMs running 2.6.32 that say otherwise (one person in the comments mentions kernel 3.2 is impacted too!) 🙂 ]

[ UPDATE 2 – this is a less invasive fix that my co-worker has tested on our systems:

date -s "`date -u`"

]
Been fighting a little fire that I'm sure hundreds if not thousands of others are fighting as well. It happened just before midnight UTC, when a leap second was inserted into our systems, and that seemed to trip a race condition in Linux – one I assume most people thought was fixed, but I guess it didn't get tested.

[3613992.610268] Clock: inserting leap second 23:59:60 UTC

 

The behavior, as I'm sure you're all aware of by now, is a spike in CPU usage. Normally our systems run on average under 8% CPU usage, and this pegged them up roughly tenfold. Fortunately vSphere held up and we had the capacity to eat it; the resource pools helped make sure production got its share of CPU power. Only minimal impact to the customers – our external alerting never even went off, which was a good sign.

CPU spike on a couple hundred VMs all at the same time (the cluster in question has 441Ghz of CPU resources)

We were pretty lost at first. Fortunately my co-worker had a thought that maybe it was leap second related; we dug into things more and eventually came across this page (thanks Google for being up to date), which confirmed the theory and confirmed we weren't the only ones impacted by it. Fortunately our systems were virtualized on a platform that was not impacted by the issue, so we did not experience any problems on the bare metal, only in the VMs. From the page:

Just today, Sat June 30th – starting soon after the start of the day GMT. We’ve had a handful of blades in different datacentres as managed by different teams all go dark – not responding to pings, screen blank.

They’re all running Debian Squeeze – with everything from stock kernel to custom 3.2.21 builds. Most are Dell M610 blades, but I’ve also just lost a Dell R510 and other departments have lost machines from other vendors too. There was also an older IBM x3550 which crashed and which I thought might be unrelated, but now I’m wondering.

It wasn't long after that we started getting more confirmations of the issue from pretty much everyone out there. We haven't dug into a root cause at this point; we've been busy rebooting Linux VMs, which seems to be a good workaround (we didn't need the steps indicated on the page). Even our systems that were up to date with kernel patches as recently as a month ago were impacted. Red Hat apparently is issuing a new advisory for their systems since they were impacted as well.

Some systems behaved well under the high load, others were so unresponsive they had to be power cycled. There was usually one process chewing through an abnormal amount of CPU; on the systems I saw it was mostly Splunk and autofs. I think that was just coincidence though – probably whatever processes happened to be using CPU at the instant the leap second was inserted into the system.

The internet is in the midst of a massive reboot. I pity the foo who has a huge number of systems and has to coordinate some complex mass reboot (unless there is another way – for me rebooting was simplest and fastest).
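
If rebooting everything really isn't an option, pushing the workaround from UPDATE 2 out over ssh is probably the next simplest thing. A minimal sketch, assuming key-based ssh access as root and a flat hosts.txt file listing the machines (both assumptions on my part):

# re-set the clock from UTC on each host – the same fix as in UPDATE 2 above
while read host; do
  echo "fixing $host"
  ssh -o ConnectTimeout=5 root@"$host" 'date -s "`date -u`"'
done < hosts.txt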

I for one was not aware that a leap second was coming, or of the potential implications, and it's obvious I'm not alone. I do recall leap seconds in the past not causing issues for any of the systems I managed. I logged into my personal systems, including the one that powers this blog, and there are no issues on them. My laptop runs Ubuntu 10.04 as well (the same OS rev as the servers I've been rebooting for the past 2 hours) and no issues there either (I've been using it all afternoon).

Maybe someday someone will explain to me, in a way that makes sense, why we give a crap about adding a second. I really don't care if the world is out of sync by a few seconds with the rest of the known universe; if it's that important we should have a separate scientific time scale and let the rest of the normal folks go about their way. Same goes for daylight savings time. Imagine the power bill from this fiasco, with thousands to hundreds of thousands of servers spiking to 100% CPU usage all at the same time.

Microsoft will have a field day with this one I’m sure 🙂

 

April 5, 2012

Built a new computer – first time in 10 years

Filed under: linux,Random Thought — Tags: , — Nate @ 8:51 am

I was thinking this morning about the last time I built a computer from scratch, and I think it was about ten years ago, maybe longer – I remember the processor was a Pentium 3 800Mhz, so it very well may have been almost 12 years ago. Up until around the 2004 time frame I had built and re-built computers re-using older parts and some newer components, but as far as going out and buying everything from scratch, it was 10-12 years ago.

I had two of them back then: one was a socket-based system, the other was a "Slot 2"-based system. I also built a couple of systems around dual-slot (Slot 1) Pentium 2s with the Intel L440GX+ motherboard (probably my favorite motherboard of all time). For those of you who think I use nothing but AMD, I'll remind you that aside from the AMD K6-3 series I was an Intel fanboy up until the Opteron 6100 was released. I especially liked the K6-3 for its on-chip L2 cache, which combined with 2 Megabytes of L3 cache on the motherboard made it quite zippy. I still have the K6-3 CPU itself in a drawer around here somewhere.

So I decided to build a new computer to move my file serving functions out of the HP xw9400 workstation I bought about a year and a half ago, into something smaller, so I could turn the HP box into something less serious to play games on my TV with (like WC: Saga!). Maybe get a better video card for it, I'm not sure.

I have a 3Ware RAID card and 4x2TB disks in my HP box so I needed something that could take that. This is what I ended up going with, from Newegg –

Seemed like an OK combination. The case is pretty nice, with a 5-port hot swap SATA backplane and room for up to 7 HDDs. PC Power & Cooling (I used to swear by them, so I figured I might as well go with them again) had a calculator that said with as many HDDs as I have I should get a 500W unit, so I got that.

There is a lot here that is new to me anyway, and it's been interesting to see how technology has changed since I last did this in the Pentium 3 era.

Mini ITX. Holy crap is that small. I knew it was small based on the dimensions, but it really didn't sink in until I held the motherboard box in my hand and it seemed about the same size as a retail box for a CPU 10 years ago. It's no wonder the board uses laptop memory. The amount of integrated features is just insane as well: ethernet, USB 2 and USB 3, eSATA, HDMI, DVI, PS/2, optical audio output, analog audio out, and even wireless, all crammed onto that tiny thing. Oh, and Bluetooth is thrown in as well. During my quest to find a motherboard I even came across one that had a parallel port on it – I thought those died a while ago. The thing is just so tiny and packed.

On the subject of motherboards – the very advanced overclocking functions are just amazing. I will not overclock since I value stability over performance, and I really don't need the performance in this box, but I took the overclocking friendliness of this board to hopefully mean higher quality components and the ability to run more stably at stock speeds. Included tweaking features –

  • 64-step DRAM voltage control
  • Adjustable CPU voltage at 0.00625V increments (?!)
  • 64-step chipset voltage control
  • PCI Express frequency tuning from 100Mhz up to 150Mhz in 1Mhz increments
  • HT Tuning from 100Mhz to 550Mhz in 1Mhz increments
  • ASUS C.P.R. (CPU Parameter Recall) – no idea what that is
  • Option to unlock the 4th CPU core on my CPU
  • Options to run on only 1, 2 or all 3 cores.

The last time I recall overclocking stuff there were maybe 2-3 settings for voltage, and the difference was typically at least 5-15% between them. I remember the only CPU I ever overclocked was a Pentium 200 MMX (o/c to 225Mhz – no voltage changes needed, it ran just fine).

I seem to recall that from a PCI perspective, back in my day there were two settings for the PCI frequency: whatever the normal was, and one setting higher (which was something like 25-33% faster).

ASUS M4A88T-I Motherboard

Memory – wow it's so cheap now, I mean 8GB for $45?! The last time I bought memory was for my HP workstation, which requires registered ECC – and it was not so cheap! This system doesn't use ECC of course. Though given how dense memory has been getting, and the possibility of memory errors only increasing, I would think at some point soon we would want some form of ECC across the board? It was certainly a concern 10 years ago when building servers with even 1-2GB of memory, and now we have many desktops and laptops coming standard with 4GB+. Yet we don't see ECC on desktops and laptops – I know it's because of cost, but my point is more that there doesn't appear to be a significant (or perhaps in some cases even noticeable) hit to the reliability of these systems with larger amounts of non-ECC memory, which is interesting.

Another thing I noticed was how massive some video cards have become, consuming as many as 3 PCI slots in some cases for their cooling systems. Back in my day the high end video cards didn't even have heat sinks on them! I was a big fan of Number Nine back then and had both their Imagine 128 and Imagine 128 Series 2 cards, with a whole 4MB of memory (512kB chips if I remember right on the Series 2 – they used double the number of smaller chips to get more bandwidth). Those cards retailed for $699 at the time, a fair bit higher than today's high end 3D cards (CAD/CAM workstation cards excluded in both cases).

Modular power supplies – the PSU I got was only partially modular but it was still neat to see.

I really dreaded the assembly of the system since it is so small. I knew the power supply was going to be an issue, as someone on Newegg said you really don't want a PSU longer than 6″ because of how small the case is; I think PC Power & Cooling said mine was about 6.3″ (with wiring harness). It was gonna be tight – and it was tight. I tried finding a shorter power supply in that class but could not. It took a while to get the cables all wrapped up. My number one fear, of course, after doing all that work: hitting the power button and finding out there's a critical problem (bought the wrong RAM, bad CPU, bad board, plugged in the power button the wrong way, whatever).

I was very happy to see, when I turned it on for the first time, that it lit up and the POST screen came right up on the TV. There was a bad noise coming from one of the fans because a cable was touching it, so I opened it up again and tweaked the cables so they weren't touching the fan, and off I went.

First I ran it without any HDs, just to make sure it turned on, the keyboard worked, I could get into the BIOS screen etc. All of that worked fine, so I opened up the case again and installed an old 750GB HD in one of the hot swap slots, hooked up a USB CD-ROM with a CD of Ubuntu 10.04 64-bit, and installed it on the HD.

Since this board has built in wireless I was looking forward to trying it out – didn't have much luck. It could see the 50 access points in the area but it was not able to log in to mine for some reason. I later found that it was not getting a DHCP response, so I hard wired an IP and it worked – but then other issues came up, like DNS not working and very, very slow transfer speeds (as in sub 100 BYTES per second). After troubleshooting for about 20 minutes I gave up and went wired, and it was fast. I upgraded the system to the latest kernel and such but that didn't help the wireless. Whatever, not a big deal, I didn't need it anyway.

I installed SSH, logged into it from my laptop, shut off X-Windows, and installed the Cerberus Test Suite (something else I used to swear by back in the mid 00s). Fortunately there is a packaged version of it for Ubuntu, since last I checked it hasn't been maintained in about seven years. I do remember having problems compiling it on a 64-bit RHEL system a few years ago (though 32-bit compiled fine and the resulting binaries worked fine on 64-bit too).

Cerberus Test Suite (or ctcs as I call it) is basically a computer torture test – a very effective one, the most effective I've ever used myself. I found that if a computer can survive my custom test (which is pretty basic) for 6 hours then it's good. I've run the tests as long as 72 hours and never saw a system fail after more than 6 hours in; normally it would be a few minutes to a couple of hours. It would find problems with memory that memtest wouldn't find after 24 hours of testing.

What Cerberus doesn't do is tell you what failed or why; if your system just freezes up you still have to figure that out. On one project I worked on that had a lot of "white box" servers in it, we deployed them about a rack at a time and I would run this test. Maybe 85% of them would pass, and the others had some problem, so I told the vendor to go fix it – I don't know what it is, but these are not behaving like the others so I know there is an issue. Let them figure out which component is bad (90% of the time it was memory).

So I fired up ctcs last night and watched it for a few minutes, wondering if there is enough cooling in the box to keep it from bursting into flames. To my delight it ran great, with the CPU topping out at around 54C (I honestly have no idea if that is good or not, I think it is OK though). I ran it for 6 hours overnight and there were no issues when I got up this morning. I fired it up again for another 8 hours (the test automatically terminates after a pre-defined interval).

I'm not testing the HD, because it's just a temporary disk until I move my 3ware stuff over. I'm mainly concerned about the memory, CPU/MB and cooling. The box is still running silent (I have other stuff in my room so I'm sure it makes some noise, but I can't hear it). It has 4 fans in it including the CPU fan: a 140mm, a 120mm and the PSU fan, which I am not sure how big it is.

My last memory of ASUS was running an Athlon on an A7A266 motherboard (I think in 2000); that combination didn't last long. The IDE controller on the thing corrupted data like nobody's business. I would install an OS and everything would seem fine, then the initial reboot kicked in and everything was corrupt. I still felt that ASUS was a pretty decent brand – maybe that was just specific to that board or something. I'm so out of touch with PC hardware at this level, the different CPU sockets, the CPU types; I remember knowing everything backwards and forwards in the Socket 7 days, back when things were quite interchangeable.

Then there was my horrible year or two of experience with the ABIT BP6, a somewhat experimental dual socket Celeron system. What a mistake that was, oh what headaches that thing gave me. I think I remember getting it based on a recommendation at Tom's Hardware Guide, a site I used to think had good information (maybe it does now, I don't know). But that experience with the BP6 fed back into my opinion of Tom's Hardware and I really didn't go back to that site ever again (these days I sometimes stumble upon it by accident). I noticed a few minutes ago that Abit as a company is out of business now; they seemed to be quite the innovator back in the late 90s.

Maybe this weekend I will move my 3ware stuff over and install Debian (not Ubuntu) on the new system and set it up. While I like Red Hat/CentOS for work stuff, I like Debian for home. It basically comes down to this: if I am managing it by hand I want Debian, if I'm using tools like CFEngine to manage it I want RH. If it's a laptop or desktop then it gets Ubuntu 10.04 (I haven't had to face the nastiness in the newer Ubuntu release(s) yet, so I'm not sure what I will do after 10.04).

I really didn’t think I’d ever build a computer again, until this little side project came up.

Another reason I hate SELinux

Filed under: linux,Random Thought — Tags: , , — Nate @ 7:43 am

I don’t write too much about Linux either but this is sort of technical I guess.

I've never been a fan of SELinux. I'm sure it's great if you're in the NSA, or the FBI, or some other three letter agency, but for most of the rest of us it's a needless pain to deal with and provides little benefit.

I remember many moons ago, back when I dealt with NT4, encountering situations where I, as an administrator, could not access a file on the NTFS file system. It made no sense – I am the administrator, give me access to that file – but no, I could not get access. HOWEVER, I could change the security settings and take ownership of the file, and NOW I could get access. Since I have that right to begin with it should just give me access and not make me jump through those hoops; that's what I think at least. I recall someone telling me back in the 90s that Netware was similar and even went to further extremes, where you could lock the admin out of files entirely, and in order to back data up you had a separate backup user which the backup program used, and that was somehow protected too. I can certainly understand the use case, but it sure makes things frustrating. I've never been at a company that needed anywhere remotely that level of control (I go out of my way to avoid them actually, since I'm sure that's only a small part of the frustrations of working there).

By the same token, I have never used (for more than a few minutes anyway) file system ACLs on Linux/Unix platforms either. I really like the basic permissions system; it has worked for 99.9% of my own use cases over the years, and it is very simple to manage.

I had a more recent experience that was similar, but even more frustrating on Windows 7. I wanted to copy a couple files into the system32 directory, but no matter what I did (including take ownership, change permissions etc) it would not let me do it. It’s my #$#@ computer you piece of #@$#@.

Such frustration is not limited to Windows however; Linux has its own similar functionality called SELinux, which by default is turned on in many situations. I turn it off everywhere, so when I do encounter it I am not expecting it to be on, and the resulting frustration is annoying to say the least.

A couple weeks ago I installed a test MySQL server and exposed a LUN to it which had a snapshot of a MySQL database from another system. My standard practice is to turn /var/lib/mysql into a link which points to this SAN mount point. So I did that and started MySQL... failed. MySQL complained about not having write access to the directory. So I spent the next probably 25 minutes fighting this thing, only to discover it was SELinux that was blocking access to the directory. Disable SELinux, reboot, and MySQL came up fine without issue. #@$#@!$
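
For anyone who wants to keep SELinux enforcing in that scenario, the fix is generally to relabel the non-standard data directory so it carries the MySQL context. A sketch, assuming the SAN LUN is mounted at /san/mysql (a made-up path) on a Red Hat style box with policycoreutils-python installed:

# tell the policy that files under /san/mysql should carry the MySQL datadir type
semanage fcontext -a -t mysqld_db_t "/san/mysql(/.*)?"

# apply the labels to the existing files
restorecon -Rv /san/mysql

# sanity check the context before starting mysqld
ls -ldZ /san/mysql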

Yesterday I had another, more interesting encounter with SELinux. I installed a few CentOS 6.2 systems to put an evaluation of Vertica on. These were all built by hand, since we have no automation to deal with CentOS/RH – everything we have is for Ubuntu. So I did a bunch of basic things, including installing some SSH keys so I could log in as root with my key. Only to find out that didn't work. No errors in the logs, nothing, it just rejected my key. I fired up another SSH daemon on another port and my key was accepted no problem. I put the original SSH daemon in debug mode and it gave nothing either, it just said it rejected my key. W T F.

After fighting for probably another 10 minutes I thought, HEY, maybe SELinux is blocking this, and I checked – SELinux was in enforcing mode. So I disabled it and rebooted, and now SSH works again. I didn't happen to notice any logs anywhere related to SELinux and how/why it was blocking this, and it was only blocking it on port 22, not on any other ports (I tried two other ports), but there you have it, another reason to hate SELinux.
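
In hindsight the denials were probably sitting in the audit log rather than the usual syslog files, and the usual culprit with copied-in keys is the label on the .ssh directory. A sketch of what I would check next time (assuming the audit daemon is running):

# look for recent AVC denials involving sshd
ausearch -m avc -c sshd
# or grep the raw log directly
grep -i avc /var/log/audit/audit.log

# reset the labels on the key files sshd refuses to read
restorecon -Rv /root/.ssh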

You can protect your system against the vast majority of threats fairly easily. I mean, the last system I dealt with that got compromised was a system that sat out on the internet (with tons of services running) and hadn't had an OS upgrade in at least 3 years. The one before that, I recall, was another Linux host (internet-connected as well – it was a firewall), this time back in 2001, and it probably hadn't had upgrades in a long time either. The third was a FreeBSD system that was hacked because of me, really – I told my friend who ran it to install SSH, as he was using telnet to manage it. So he installed SSH, and SSH got exploited (back in 2000-2001). I've managed probably 900-1000 different hosts over that time frame without an issue. I know there is value in SELinux, just not in the environments I work in.

Oh, and while I'm here: I came across a new feature in CentOS 6.2 yesterday which I'm sure also applies to RHEL. When formatting an ext4 file system, by default it discards unused blocks. The man page says this is good for thin provisioned file systems and SSDs. Well, I'll tell you it's not good for thin provisioned file systems – the damn thing sent 300 Megabytes a second of data (450-500,000+ sectors per second according to iostat) to my little storage array, with a block size of 2MB (never seen a block size that big before), which had absolutely no benefit other than to flood the storage interfaces and possibly fill up the cache. I ran this on three different VMs at the same time. After a few seconds the front end latency on my storage went from 1.5-3ms to 15-20ms. And the result on the volumes themselves? Nothing, there was no data being written to them. So what's the point? My point is: disable this stupid function with the -K option when running mke2fs on CentOS 6.2. On Ubuntu 10.04 (what we use primarily) ext4 is used too, but it does not perform this step when a file system is created.
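
Concretely, the difference is one flag at filesystem creation time. A quick sketch (the device name is just an example):

# CentOS 6.2 default: issues discards for the whole device before formatting
mkfs.ext4 /dev/sdb1

# -K ("keep") skips the discard pass entirely
mkfs.ext4 -K /dev/sdb1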

Something that was strange when this operation ran, and I have a question for my 3PAR friends on it: the performance statistics for the virtual LUN showed absolutely no data flowing through the system, but the performance stats for the volume itself were there (a situation I have never seen before in 6 years on 3PAR), and the performance stats of the fibre channel ports were there as well. There was no noticeable hit on back end I/O that I could see, so the controllers were eating it. My only speculation is that because RHEL/CentOS 6 has built in support for SCSI UNMAP, these commands were actually UNMAP commands rather than actual data. I'm not sure though.

February 3, 2012

Making the easy stuff hard, the hard stuff possible

Filed under: linux — Tags: — Nate @ 5:16 am

First off, sorry for being away for so long. I've been really, really busy preparing a new data center deployment to migrate my company out of the cloud. The last time I did anything remotely resembling this was in 2007, though this time there are some extra layers involved that I didn't have back then. It is certainly an interesting experience configuring the software and infrastructure from the absolute ground up, having nothing to base it on (other than past experience, obviously!). I mean, we have our stuff in a public cloud now, but there are so many things that are different from an infrastructure perspective that little of it transfers over.

I wanted to write about a sort of topic that I haven't really written about before: a systems management tool named Chef, from a Seattle-based company named Opscode. It's pitched as a next generation tool that is supposed to make your life easier, more advanced than older tools like Puppet and Cfengine.

I'll start off by saying I have a very strong background in Cfengine, having used it since late 2004 at three different companies. My techniques and approaches evolved significantly over the years, and my last deployment was quite good in my opinion, considering I had to adapt an existing Cfengine deployment made by folks who didn't know what they were doing into something that worked well, and do so in a four nines environment. That was not easy – as you know, one wrong command or config in one of these tools can wreak havoc, as I know first hand. I grew to like Cfengine a lot, and there was really nothing I needed it to do that it couldn't do for me. I knew its limitations well and it was simple to use.

I was introduced to Chef in the summer of 2010 when I went to the headquarters of Opscode and met their senior staff including one of the co-founders I believe. They gave us their powerpoint presentation on what Chef was, how it worked, what it could do, why it exists.

It certainly came across as a very impressive tool, able to do tons of things that Cfengine could not do, with a lot of concepts that sounded like they could be useful. At the same time, however, it looked incredibly complicated.

I raised my concerns with their senior staff on that very first day and we had about a 15 minute discussion about it. I'm not a programmer, nor do I ever intend to be. There is a very firm line I refuse to cross, from scripting tools in perl and bash that make my life easier, to full-on code. A developer at my company constantly jokes that I say I am not a programmer, yet I come up with complicated regexes and scripts to do things they don't understand how to come up with on their own.

They tried to reassure me that learning Chef is no different than learning the syntax of an Apache configuration file, or DNS, or something like that. I didn't really buy it, but I was still willing to give the tool a shot since it sounded like a nice level of systems management could be achieved with it. I still joke with my co-workers and current boss (who was my boss at the time too) on this very topic; they all remember that conversation to this day.

Chef is written in Ruby, and is very Ruby-centric. I guess you could say I am very biased against Ruby given my past experience supporting Ruby (on Rails) applications.

So here I am, almost 18 months later and things haven’t changed much. My dislike of Ruby continues, and is perhaps even stronger now having used Chef.

My first Chef implementation about a year ago was fraught with frustration at almost every turn. I could (and still can) see the promise in the tools it provides, but it's just so difficult to work with, especially coming from a Cfengine background (and a lack of programming experience), that for my first iteration I dumbed it down a whole bunch, making the logic very Cfengine-like, at least as much as I could. I didn't use any data bags, any attributes, no templates, nothing like that. I had (and still have) a very hard time finding usable examples for many things in Chef. They have a big repository of sample cookbooks, but to me for the most part those are not usable as examples, because they don't go into detail as to specifically, literally, what each line of code does. Chef apparently uses this for its template language; I looked at it a couple of times and really could not make heads or tails of it.

I like to tell people that Chef makes the easy things hard and the hard things possible. It seems very clear to me that they attacked the hard things in systems management first, before addressing the easy things. I remember seeing something in their documentation about the concept of the holy grail of the single instance copy, which fits along those lines well. The idea is you have one small bit of code that can be adapted to (m)any environments and situations, using templates to pull attributes and values from data bags or other sources to build something on the fly.

The concept is novel for sure. Coming from a Cfengine background I am very used to duplicating config stanzas for different environments and making static config files, one for each environment, or something like that. I've been doing it so long it's second nature.

Where the Opscode folks and I seem to part ways is our priorities. Their priority is to turn systems management into code and automate it to the point where it scales to a million systems. Mine is less ambitious: I want it to be easy to manage, and it only needs to scale to a few thousand systems at the most, since going beyond that gets so cookie cutter that it's not fun anymore. I can certainly see the value of their approach when dealing with massive environments that are changing all the time. At most companies, though, that situation doesn't exist – at most companies things are fairly static: you get a new system here and there, you get a new environment maybe once a quarter at the most. Maybe some big project comes along that increases your system count by a large amount for some special purpose.

I have absolutely no problem maintaining separate config files for each environment and having different config stanzas in the config management tool to push those files out. Not only is this approach simpler (in my view), it gives much more insight – perhaps that is the right word – into what is actually happening. I mean, if you have a template filled with things that pull values dynamically from a half dozen or more different sources, you really have no idea what that file looks like until it lands on the server in question. I like to be able to open the file and look at the settings, rather than hunt down the various flags and values that can come from the various sources Chef provides.

I'm not building new environments every day, and the level of change in general is quite small (as it has been over the past decade at companies I have worked at), so I don't need the level of dynamic ability that Chef provides – it doesn't help me that much.

I came up with a new saying a few months ago after dealing with Chef: if it's not friends with sed, awk and grep, then it's not friends with me. Chef, being very developer-centric, uses a lot of JSON to store and manage its various configurations. JSON is very much not friendly to sed, awk and grep, and so it frustrates me greatly whenever I have to deal with it.

Because we are moving into a self managed data center environment we needed a way to provision systems. My background is Red Hat/CentOS, Kickstart and Cfengine. We have Ubuntu, <nothing>, and Chef. I came up with a system that for now uses VMware templates (my first ever use of VMware templates) and some custom scripting to integrate with Chef and do other provisioning tasks. It works; it's not as nice as Kickstart, but it works.

So, speaking of this and of JSON, there is a bootstrap process Chef needs to go through in order to get itself registered with the Chef service, and it involves creating a bit of JSON that Chef can read. The standard Chef bootstrap is a sort of push approach, where a management agent waits for a system to be provisioned, then ssh's to the system and runs a bunch of stuff. I wanted a pull approach, where the system is provisioned, boots up and configures itself. So I came up with this little bash snippet to construct the JSON file:

# announce progress, then open the JSON object and run_list array
echo -n "Making first-boot.json ..."
echo -n "{ \"run_list\": [ ">/etc/chef/first-boot.json;
# pull the ROLE= line out of the per-host config file and turn the comma separated list into words
export ROLES=`grep ROLE /root/00-50-* |head -n 1 | sed s'/.*=//'g | sed s'/,/ /'g` &&
# emit each role as "role[name]", then strip the trailing comma so the JSON stays valid
for ROLE in $ROLES; do echo -n \"role[${ROLE}]\",;done | sed s'/\,$//'g >>/etc/chef/first-boot.json;
# close the array and the object
echo -n " ] }" >>/etc/chef/first-boot.json

That /root/00-50-* file is a configuration file named after the MAC address of the VM. This is based on my older Kickstart stuff, which has been extended to support Chef. It stores things like the IP address, host name and default gateway for the network, then the Chef environment, Chef role(s), and Chef organization. It's a simple text file format that looks like VARIABLE=value, one VARIABLE per line.
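
To make that concrete, here is a hypothetical example of what one of those per-host files might contain – every name and value below is made up for illustration, the real files obviously use our own addresses and role names:

# /root/00-50-56-aa-bb-cc – named after the VM's MAC address
IPADDR=10.10.4.21
HOSTNAME=web01
GATEWAY=10.10.4.1
CHEF_ENVIRONMENT=production
ROLE=base,webserver
CHEF_ORGANIZATION=example-org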

My point in pasting that code is to show the ugly lengths I have to go to in order to simulate valid JSON output using my own regular tool set. Remember, I am NOT a programmer!

The scripting works fine (at least so far – I've built a dozen or so different roles and systems with it), but it shouldn't be that complicated.

For those of you more experienced with VMware templates: I noticed there is the ability to customize a template so that VMware can set the IP address, host name etc. of the guest OS. When I saw this I spent a good two hours trying to get it to work, but no matter what I tried VMware said my configuration was not supported and it would not let me customize. I have read conflicting reports as to whether or not it is possible on Ubuntu. I am running ESX 4.1 with vCenter 5.0. I think if I were running vCenter 4.x it would work fine, but Ubuntu and other "non tier 1" operating systems are no longer supported for template customization in the 5.0 products. Often when I see "not supported", especially for something that used to work, it means it might work, but don't ask for help if it blows up. Maybe it's coincidence or not, but as I said, no matter what I did the customization boxes were greyed out and I could not get vCenter 5.0 to work with Ubuntu.

At the end of the day it doesn't matter though. I had what was, to me at least, a good provisioning process I could adapt from my Kickstart days – a process that works well on both physical and virtual machines, something that leverages the MAC address (or the serial number in the case of physical machines) for unique identification.

With regards to how I used to do things with Cfengine, it was simpler than Chef. Cfengine operates more on trust than Chef does. Chef uses public/private keys to authenticate systems, and those keys have to be in the right place in order for a system to get registered. This is good for untrusted networks, like public clouds (ugh). Cfengine works more on trust, where you can (or at least I did) assign network ranges whose IPs are trusted, and a new system could just register itself without any special configuration; the keys would be generated automatically and exchanged between Cfengine client and server. My Cfengine configuration was, for the most part, dynamic based on the host name of the server: most of my major Cfengine classes ran a simple grep on the file that had the host name in it, and if the host name matched a particular pattern the system was automatically included in the right classes. With Chef life is different, I can't do that. I have to specifically define which role(s) or recipes a system has up front, because the system will only download the cookbooks it is specifically configured to use. This isn't a big deal, but it is an extra step that I'm not used to having to do.

Sample CFengine class definition:

ENV_CORPDMZ     = ( ReturnsZero(/bin/egrep -q "^HOSTNAME=corpdmz" /etc/sysconfig/network) )
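
On the Chef side, by contrast, that assignment ends up being an explicit per-node step, roughly something like the following (the node and role names are made up), assuming knife is configured on a workstation:

# attach a role to a node so its next chef-client run pulls the matching cookbooks
knife node run_list add web01.example.com 'role[webserver]'

# confirm what the node ended up with
knife node show web01.example.com -r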

With Cfengine, prior to implementing the hostname-based approach, adding a new server involved manually editing the master Cfengine configuration so that it was aware of the new system that was about to come online. I still had to edit that file on occasion, if there were special configs needed for a server, but for the most part, for like systems such as web servers, I did not.

Which sort of brings me to the next topic – recruiting talent that can use Chef. I've been managing server systems for about 17 years now (wow, has it really been that long?). It's clear to me after 18 months of Chef that I lack the knowledge to use the tool effectively (though that hasn't stopped me from using it at this point). Knowing that, and having worked with people at my previous company who found Chef a similar level of frustration (if not more), I can see Chef being a real sticking point in finding talent that is capable of managing it. My company is actively recruiting senior systems people (well, one person), and among the candidates I have spoken with so far, along with candidates I have spoken to in the past, I can honestly think of perhaps one or two people over the years who could handle Chef – and one of them is a full time programmer now (when I met him he was hired to be on my operations team back in 2003).

Well, aside from the co-worker I have now, who does quite a wonderful job deploying and managing Chef and who wrote the vast majority of the Chef stuff at my current company. It's really well done, but even with a lot of the hard work done by him, in a very Chef-like way, I constantly struggle to add new stuff in or to change existing things because it's so dynamic. I see a value for something – where is it coming from? Is it from the node? The environment? A data bag? An attribute? Something else?

So I see Chef somewhat like I see Hadoop as far as what skill sets are needed and who can provide them. One of my previous companies was working on migrating towards Hadoop, and a big complaint I heard from them about Hadoop (and I have heard it from others since) is finding talent that knows the product. With the likes of Yahoo, Google, and other big companies with very deep pockets and big data aspirations able to pay out the wazoo for Hadoop talent, small companies just can't compete. The number of people qualified to do Hadoop right vs the number of people who can do SQL – well, it's obvious, right?

I see the same with Chef. It's a powerful tool, but it's just not there yet with regards to usability. I can see it being very useful for the kinds of companies that manage very large fleets of systems and have a very dynamic environment. One such place is HP – someone I know is going to work for HP Cloud because he knows Chef. I assume he is probably pretty good at Chef by now, though the caveat with him is that he has a strong Ruby programming background, so it's no real surprise that he could pick Chef up.

I filed several feature requests and bug reports on the Chef support site about a year ago when I was first interacting with it, though I don't think much made it through. One thing I'd really like is a good way to do in-line editing of text files. At least at the time, the Chef mantra was "find another way to do it", which a friend of mine says is the same thing Puppet people say. So how do I go about adding an entry to /etc/hosts?
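
For the record, the kind of thing I mean is trivial in shell – an idempotent one-liner like this (the address and names are made up), which is exactly what I wish I could express just as simply in a recipe:

# append the entry only if it is not already present (run as root)
grep -q '^10\.0\.5\.20[[:space:]]' /etc/hosts || echo '10.0.5.20  db01.example.com db01' >> /etc/hosts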

Another thing I'd like to be able to do is a bulk file copy from the cookbook that preserves the ownership and permissions of the source files (e.g. taking a directory tree with various owners/groups/permissions and copying it all at once); I don't think that is possible still. At the time, the Opscode people suggested I use rsync for that.

Another thing I'd like is to be able to host cookbooks internally while using the external service for everything else. This is mainly for security purposes; I feel more at ease when my core data stays within the confines of my network, on systems under my direct control.

Another thing I'd like to see, which I have mentioned to Opscode in one way or another as well, is a more abstracted configuration language – I think I called it idiot mode or something. The Ruby syntax they use, while I'm sure it's great for Ruby people, really sucks for people like me. I'm fine with the reduced subset of functionality such an idiot mode might provide, because it's likely I wouldn't use the full functionality to begin with (at least not initially). Make the learning curve to actually using the tool less steep.

At one point Opscode was interested in talking to me about a full time position as an advocate for their platform. I just couldn't go through with it; I can't get excited about the platform after all the frustration it has given me. I certainly see the promise and will continue trying, but I think some fundamental things need to be done to the system in order to make it more usable.

So, in the end, I see Chef as a very powerful tool – a very useful tool for those with the skills to handle the power it gives you. If I were deploying a new environment today I would certainly NOT use Chef; I would use Cfengine. I don't want to discourage people from using Chef, it is a good tool – just realize the much higher level of investment you need in order to properly leverage it, and weigh that against the benefits. For me, the hard things that are made possible by Chef really involve a trivial amount of time. I dare say I have spent FAR more time trying to work with Chef on these hard things (understanding the concepts, the code, etc.) than I would have spent just flat out doing them by hand the old fashioned way.

You might want to ask: why haven't I tried Puppet? My answer would be that to date I haven't had a reason to. I've had a few brief discussions with people who use Puppet over the years (including some who have used Cfengine as well) and asked them why I should use Puppet over Cfengine. For the most part the response was that there's nothing really revolutionary in Puppet, so if you're happy with Cfengine then stick with it. There are a few things Puppet apparently does better (what they are I don't remember), but in my talks with people there wasn't anything – anything – that made me want to jump to Puppet. There were things that sounded nice (like Chef has), but not enough return to justify the investment in time to make a migration when, as I mentioned earlier, Cfengine does pretty much everything I need it to do.

With Cfengine I could probably train a systems person on the basics in literally an afternoon; my Cfengine configurations were not complicated. With Chef, well, here I am at 18 months and still lost.

3,400 words, I think that’s a record for me for a published blog post. Should get back to sleep now, started writing this at about 3:30AM.

