I don’t write too much about Linux either, but this is sort of technical I guess.
I’ve never been a fan of SELinux. I’m sure it’s great if you’re in the NSA, or the FBI, or some other three-letter agency, but for most everyone else it’s a needless pain to deal with and provides little benefit.
I remember many moons ago, back when I dealt with NT4, encountering situations where I, as an administrator, could not access a file on the NTFS file system. It made no sense. I am administrator, give me access to that file. But no, I could not get access. HOWEVER, I could change the security settings and take ownership of the file, and NOW I could get access. Since I had that right to begin with, it should just give me access and not make me jump through those hoops. That’s what I think at least. I recall someone telling me back in the 90s that Netware was similar and went to even further extremes, where you could lock the admin out of files entirely, and in order to back data up you had a separate backup user which the backup program used, and that was somehow protected too. I can certainly understand the use case, but it makes things frustrating. I’ve never been at a company that needed anywhere remotely near that level of control (I go out of my way to avoid them actually, since I’m sure that’s only a small part of the frustrations of being there).
By the same token, I have never used file system ACLs on Linux/Unix platforms (for more than a few minutes anyway). I really like the basic permissions system; it has worked for 99.9% of my own use cases over the years, and it is very simple to manage.
I had a more recent experience on Windows 7 that was similar, but even more frustrating. I wanted to copy a couple of files into the system32 directory, but no matter what I did (including taking ownership, changing permissions, etc.) it would not let me do it. It’s my #$#@ computer you piece of #@$#@.
Such frustration is not limited to Windows, however. Linux has its own similar functionality called SELinux, which is turned on by default in many situations. I turn it off everywhere, so when I encounter it I am not expecting it to be on, and the result is frustrating to say the least.
A couple of weeks ago I installed a test MySQL server and exposed a LUN to it which had a snapshot of a MySQL database from another system. My standard practice is to turn /var/lib/mysql into a symlink which points to this SAN mount point. So I did that, and started MySQL … failed. MySQL complained about not having write access to the directory. I spent probably the next 25 minutes fighting this thing, only to discover it was SELinux that was blocking access to the directory. Disable SELinux, reboot, and MySQL came up fine w/o issue. #@$#@!$
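A quick way to rule SELinux in or out before chasing file permissions is to check which mode it is in. Here is a minimal sketch in Python, not a definitive tool. It assumes the selinuxfs "enforce" file lives at /selinux/enforce (where older releases like CentOS 6 mount it, as far as I know) or /sys/fs/selinux/enforce (newer ones), and it treats a missing file as "disabled". The function name is my own invention.

```python
from pathlib import Path

# Assumed locations of the selinuxfs "enforce" file: /selinux on older
# releases such as CentOS 6, /sys/fs/selinux on newer ones.
ENFORCE_PATHS = (Path("/selinux/enforce"), Path("/sys/fs/selinux/enforce"))


def selinux_mode(paths=ENFORCE_PATHS):
    """Return 'enforcing', 'permissive', or 'disabled' (hypothetical helper)."""
    for path in paths:
        try:
            value = path.read_text().strip()
        except OSError:
            # File missing or unreadable: try the next candidate.
            continue
        # The file holds "1" when enforcing and "0" when permissive.
        return "enforcing" if value == "1" else "permissive"
    # No enforce file found: assume SELinux is disabled or not installed.
    return "disabled"
```

Calling `selinux_mode()` on the box in question gives the answer in one line, which is about the same as running `getenforce` by hand.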
Yesterday I had another, more interesting encounter with SELinux. I installed a few CentOS 6.2 systems to run an evaluation of Vertica on. These were all built by hand since we have no automation for CentOS/RH; everything we have is Ubuntu. So I did a bunch of basic things, including installing some SSH keys so I could log in as root w/my key, only to find out that didn’t work. No errors in the logs, nothing, it just rejected my key. I fired up another SSH daemon on another port and my key was accepted no problem. I put the original SSH daemon in debug mode and it gave me nothing either, it just said it rejected my key. W T F.
After fighting for probably another 10 minutes I thought, HEY, maybe SELinux is blocking this. I checked, and SELinux was in enforcing mode. So I disabled it and rebooted, and now SSH works again. I didn’t happen to notice any logs anywhere related to SELinux and how/why it was blocking this, or why it only blocked it on port 22 and not on any other ports (I tried two others), but there you have it, another reason to hate SELinux.
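On the missing logs: SELinux denials normally land in the audit log rather than in the usual syslog files, which may be why nothing showed up where I looked. Below is a rough sketch for pulling them out. It assumes auditd is running and writing to its default location, /var/log/audit/audit.log, and that denial records contain the strings "avc:" and "denied". The function names are hypothetical, and I have not run this against the systems described here.

```python
# Assumption: auditd is running and uses its default log location.
AUDIT_LOG = "/var/log/audit/audit.log"


def avc_denials(lines, keyword=None):
    """Yield lines that look like SELinux (AVC) denial records.

    If keyword is given (for example "sshd" or "mysqld"), only lines
    containing it are returned.
    """
    for line in lines:
        if "avc:" in line and "denied" in line:
            if keyword is None or keyword in line:
                yield line.rstrip("\n")


def scan_audit_log(path=AUDIT_LOG, keyword=None):
    """Return all matching denial lines from the audit log at path."""
    with open(path, errors="replace") as log:
        return list(avc_denials(log, keyword))
```

Something like `scan_audit_log(keyword="sshd")`, run as root, should show whether SELinux was the one rejecting the key, assuming the denial was logged at all.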
You can protect your system against the vast majority of threats fairly easily. I mean, the last system I dealt with that got compromised was a system that sat out on the internet (with tons of services running) and hadn’t had an OS upgrade in at least 3 years. The one before that, as I recall, was another internet-connected Linux host (it was a firewall), back in 2001, and it probably hadn’t had upgrades in a long time either. The third was a FreeBSD system that was hacked because of me, really: I told my friend who ran it to install SSH since he was using telnet to manage it. So he installed SSH, and SSH got exploited (back in 2000-2001). I’ve managed probably 900-1000 different hosts over that time frame without an issue. I know there is value in SELinux, just not in the environments I work in.
Oh, and while I’m here, I came across a new feature in CentOS 6.2 yesterday which I’m fairly sure applies to RHEL too. When formatting an ext4 file system, by default it discards unused blocks. The man page says this is good for thin provisioned file systems and SSDs. Well, I’ll tell you it’s not good for thin provisioned file systems. The damn thing sent 300 megabytes a second of data (450,000-500,000+ sectors per second according to iostat) to my little storage array with a block size of 2MB (I’ve never seen a block size that big before), which had absolutely no benefit other than to flood the storage interfaces and possibly fill up the cache. I ran this on three different VMs at the same time. After a few seconds the front end latency on my storage went from 1.5-3ms to 15-20ms. And the result on the volumes themselves? Nothing, there was no data being written to them. So what’s the point? My point is: disable this stupid function with the -K option when running mke2fs on CentOS 6.2. Ubuntu 10.04 (what we use primarily) uses ext4 too, but it does not perform this function when a file system is created.
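If you build file systems from a script, it is easy to bake the option in so nobody forgets it. Here is a small sketch that only assembles the mke2fs command line and does not run anything. The helper name and the /dev/sdX1 device are placeholders of mine, and the -K option is the one from the mke2fs shipped with CentOS 6.2; I believe newer e2fsprogs releases spell it -E nodiscard instead, so check the man page on your system.

```python
def mkfs_ext4_command(device, discard=False):
    """Build the mke2fs argument list for an ext4 file system.

    With discard=False (the default here) the -K option is added, which
    tells mke2fs not to discard blocks at creation time.
    """
    argv = ["mke2fs", "-t", "ext4"]
    if not discard:
        # -K as found in the CentOS 6.2 mke2fs; assumed to be
        # "-E nodiscard" in newer e2fsprogs releases.
        argv.append("-K")
    argv.append(device)
    return argv


# /dev/sdX1 is a placeholder, not a real device.
print(" ".join(mkfs_ext4_command("/dev/sdX1")))
```

The list form can be handed straight to `subprocess.run()` once you are sure the device name is right.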
Something strange happened while this operation ran, and I have a question for my 3PAR friends about it: the performance statistics for the virtual LUN showed absolutely no data flowing through the system, but the performance stats for the volume itself were there (a situation I have never seen before in 6 years on 3PAR), and so were the stats for the fibre channel ports. There was no noticeable hit on back end I/O that I could see, so the controllers were eating it. My only speculation is that, because RHEL/CentOS 6 has built-in support for SCSI UNMAP, these were actually UNMAP commands rather than actual data. I’m not sure though.