TechOpsGuys.com Diggin' technology every day

February 6, 2011

Debian 6.0 Released

Filed under: linux,News — Nate @ 8:49 pm

I've been using Debian since version 2.0 back in 1998, and it is still my favorite distribution for systems that I maintain by hand (instead of using fancier automated tools), mainly because of the vast number of packages and the ease of integration. I still prefer Red Hat for "work" stuff, and for anything larger than small scale installations.

Anyways, I happened to peek in on the progress they were making a few days ago, and they were down to something like 7 release critical bugs, so I was kinda-sorta excited that another new revision was coming out. I remember some of the leader(s) back in 2009 set some pretty aggressive targets for this version of Debian; like most people out there I just laughed and knew it wasn't achievable. I'm patient: release when it's ready, not before. Debian was pretty quick to say they weren't official targets (I believe), more like best effort estimates. For some reason this particular Debian press release is not on their main site, maybe a hiccup in the site redesign, as the news from 2009 page shows a bunch of stuff from 2008.

Almost a year after that original goal, Debian 6.0 is here. To be honest I'm not sure what all is really new; of course there are a lot of updated packages and such, but Linux has pretty much gotten to the point where it's good enough for me. The only thing I really look forward to in upgrades is better hardware support (and even then that's just for my own laptops/desktops; otherwise everything I run is in a virtual machine, and hardware support has never been an issue there).

Normally I'm not one to upgrade right away, but today was a different day. Maybe it was the weather, maybe it was just waiting for the Super Bowl to come on (watching it now, paused on TiVo while I write this). But I decided to upgrade my workstation at home today: more than 1,000 package updates, and for the first time in a decade the installation instructions recommended a reboot mid-upgrade. The upgrade went off without a hitch. My desktop isn't customized much, so I re-installed my Nvidia driver, told VMware Workstation to rebuild its kernel drivers, fired off X, and then went back to my laptop (my workstation is connected to my TV so I have to decide which input I want to use; I'd like my next TV to have picture-in-picture, if any TVs out there still have that ability, it was pretty popular back in the 80s).
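For anyone curious, the process is roughly what the release notes describe. This is a from-memory sketch, not a substitute for actually reading them, and the kernel metapackage name depends on your architecture:

# point sources.list at squeeze, then do the two-stage upgrade
sed -i 's/lenny/squeeze/g' /etc/apt/sources.list
apt-get update
apt-get upgrade                              # minimal upgrade first
apt-get install linux-image-2.6-amd64 udev   # new kernel + udev, then the mid-upgrade reboot
shutdown -r now
apt-get dist-upgrade                         # the remaining ~1,000 packages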

My workstation, for reference:

  • HP xw9400 Workstation
  • 2 x Opteron 2380 CPUs (8 cores total)
  • 12GB ECC memory
  • Nvidia Geforce GT 240 (what lspci says at least)
  • 3Ware 9650 SE SATA RAID controller with battery backed write back cache
  • 4 x 2TB Western Digital Green (I think) drives in RAID 1+0
  • 1 x 64GB Corsair SSD (forgot what type) for the OS

I got a really good deal on the base system at the time, bought through HP's refurb dept, for a configuration that retailed brand new on their own site for about $5,000 (note that is not the above config; I have added a bunch to it). My cost was about $1,500, and that included a 3 year warranty. I wanted something that would last a good long time, and of course it's connected to an APC Smart-UPS, gotta have that sine wave power…

I have had my eye on Debian's kFreeBSD port for some time and I decided, what the hell, let's try that out too. I have two Soekris boxes (one is a backup), so I took the one that was not in use, put a fresh compact flash card in it, and poked around for how to install Debian kFreeBSD on it, because you know I hate the BSD userland but really like to use pf.

First off, I did get it working..eventually!

kFreeBSD is a technology preview, not a fully supported release, so it is rough around the edges. Documentation for what I wanted to do was sparse at best, and there seemed to be only one other person trying this out on a Soekris box, so the mailing list thread from nearly a year ago was helpful.

Official Documentation was lacking in a few areas:

  • Documentation on how to set up the tftp server was mostly good, except it wasn't exactly easy to find the files to use; I had to poke around quite a bit to find them.
  • No documentation on how to enable the serial console for the installer; there was no mention of serial console at all except for here, and no mention of how to set those various variables.
    • For those that want to know, you need to edit grub.cfg (Debian 6.0 uses GRUB 2 now, which I guess is good but it's more confusing to me) and add the parameters -D -h to the kfreebsd kernel line (those are the FreeBSD boot flags for dual/serial console), for example:
menuentry "Default install" {
 echo "Loading ..."
 kfreebsd /kfreebsd.gz -D -h
 kfreebsd_module /initrd.gz type=mfs_root
 set kFreeBSD.vfs.root.mountfrom=ufs:/dev/md0
 set DEBIAN_FRONTEND=text
}

I tried setting the DEBIAN_FRONTEND variable as you can see, but it didn't seem to do anything; the installer behavior was unchanged from the default.

It took me a significant amount of time to figure out that I could not use minicom to install Debian kFreeBSD; instead I had to use cu (something I've never used before). I've used minicom for everything from switches, to routers, to load balancers, to OpenBSD installs, to Red Hat Linux installs (I had never tried to install Debian over serial until today). But on Debian kFreeBSD the terminal emulation is not compatible between minicom and the installer; the result was I could never get past the "assign a host name" screen, it just kept sending random escape characters that set me back to previous screens. It was pretty frustrating.
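If anyone else hits this, the cu invocation itself is trivial; the serial device and speed below are just what applied to my setup (cu ships as its own small Debian package these days, split out from uucp, if I remember right):

apt-get install cu
cu -l /dev/ttyS0 -s 9600     # talk to the Soekris console; type ~. to hang up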

Since there is no VGA port on the Soekris I did a tftp net install over serial console, and when it came to installing the various base packages it took forever. I think at least part of it is due to the CF card being put in PIO mode instead of DMA mode, though looking at my OpenBSD Soekris system it says it is using PIO mode 4 too. I am using the same model and size of CF card in both systems; I specifically used this one (Lexar 1GB, have had it for 5-6 years) because it seemed to run really fast on my systems, whereas my Kingston CF cards ran like dogs. Anyways it took upwards of two hours to install the base packages (around ~400MB installed). Doing the same in a VMware VM took about 5 minutes tops (much faster system, mind you..).

I chose to install the base operating system along with the SSH option (which I swear was "SSH server"), and everything installed.

Then I rebooted and was greeted by a blank screen where GRUB should be. It took a little time to figure out, but I managed to edit the PXE GRUB configuration so that it would boot my local CF card over the serial port.
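The menu entry I added to the PXE grub.cfg was something along these lines, chainloading whatever GRUB the installer put on the CF card (the device naming is from memory, so treat this as a sketch):

menuentry "Boot local CF card" {
 set root=(hd0)
 chainloader +1
}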

So there we go, the kFreeBSD kernel is booting on the Soekris –

Copyright (c) 1992-2010 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
 The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
#0 Tue Jan  4 16:41:50 UTC 2011 i386
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Geode(TM) Integrated Processor by AMD PCS (499.91-MHz 586-class CPU)
 Origin = "AuthenticAMD"  Id = 0x5a2  Family = 5  Model = a  Stepping = 2
 Features=0x88a93d<FPU,DE,PSE,TSC,MSR,CX8,SEP,PGE,CMOV,CLFLUSH,MMX>
 AMD Features=0xc0400000<MMX+,3DNow!+,3DNow!>
real memory  = 536870912 (512 MB)
avail memory = 511774720 (488 MB)
module_register_init: MOD_LOAD (vesa, 0xc0952d8e, 0) error 19
kbd1 at kbdmux0
K6-family MTRR support enabled (2 registers)
ACPI Error: A valid RSDP was not found (20100331/tbxfroot-309)
ACPI: Table initialisation failed: AE_NOT_FOUND
ACPI: Try disabling either ACPI or apic support.
pcib0: <Host to PCI bridge> pcibus 0 on motherboard
[..]

And a bunch of services started, including PostgreSQL (?!?!), and then it just sat there. No login prompt.

I could ping it but could not ssh to the system; the only port open was the port mapper. I had told it to install SSH related things (I forget exactly what the menu option was, but I find it hard to believe that there would be an openssh client option and not a server option; I can go back and look, maybe later).

So now I was stuck. I rebooted back into the installer and had some trouble mounting the CF card in the rescue shell, but managed to do it, chroot'd into the mount point, enabled the serial console per the examples in /etc/inittab, and used apt-get to install openssh. Only that failed: some things weren't properly configured in order for the ssh setup to complete. So I thought.. and thought…
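For the record, the rescue-shell dance was roughly this; the slice name is whatever the kFreeBSD kernel called my CF card, so check the dmesg output rather than trusting my memory:

mount -t ufs /dev/ad0s1 /mnt      # the CF card's root slice (name may differ)
chroot /mnt /bin/bash             # then edit /etc/inittab, run apt-get, etc from inside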

Telnet to the rescue! I haven't used telnet on a server in I don't know how many years, probably since I worked at a Unix software company in 2002 where we had a bunch of different Unixes and most did not have ssh. Anyways, I installed telnet on the system via the chroot, unmounted the file system, rebooted, and the system came up, but still no login prompt on the serial console. Fortunately I was able to telnet to the thing, install ssh along with a few other packages, and remove PostgreSQL; I do not want to run a SQL database on this tiny machine.

I did more futzing around trying to get DMA enabled on the CF card, to see if that would make it go faster, to no avail. top does not report any i/o wait, but I think that is a compatibility issue rather than there not being any i/o wait on the system.
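For what it's worth, the knobs I know of on the FreeBSD side are atacontrol and the hw.ata loader tunables; I'm not even certain atacontrol is shipped in Debian's freebsd-utils, so consider this a list of things to try rather than a fix:

atacontrol mode ad0            # show the current transfer mode
atacontrol mode ad0 UDMA33     # try forcing a DMA mode
# and/or in /boot/loader.conf (it defaults to on anyway):
# hw.ata.ata_dma="1"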

After poking around more I determined why the login prompt wasn't appearing on the serial console: the examples in /etc/inittab were not right, at least not for the Soekris (I can't speak to other platforms). They mention using /dev/ttyd0 when in fact I have to use /dev/ttyu0. Oh, and another thing on serial console and this kFreeBSD: from what I read, setting a custom baud rate (other than the default 9600) is difficult if not impossible (I have not tried), so instead I changed the Soekris default baud rate from 19200 to 9600.
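So the getty line you want in /etc/inittab ends up looking something like this (the stock examples use ttyd0; swap in ttyu0 and whatever baud rate you picked):

T0:23:respawn:/sbin/getty -L ttyu0 9600 vt100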

I also did some hand editing of grub.cfg to enable the serial console in GRUB itself, because I was unable to figure out how to do it via the GRUB 2 templates.
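The hand edits amounted to a few lines like these near the top of grub.cfg (the proper way is presumably GRUB_TERMINAL/GRUB_SERIAL_COMMAND in /etc/default/grub followed by update-grub, but I couldn't get the templates to cooperate):

serial --unit=0 --speed=9600
terminal_input serial
terminal_output serial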

So all in all it certainly feels like a technology preview, very very very rough around the edges; I'm sure it will get there in time. My own needs are really minimal (I run a tiny set of infrastructure services on my home firewalls like dhcp, dns, OpenVPN, Network UPS Tools and the like; no desktop, no web servers, nothing fancy), so I can probably use this to replace my OpenBSD system. I will test pf out maybe next weekend; I've spent enough time on it for now.

root@ksentry:~# cat /etc/debian_version
6.0
root@ksentry:~# uname -a
GNU/kFreeBSD ksentry 8.1-1-486 #0 Tue Jan  4 16:41:50 UTC 2011 i586 i386 Geode(TM) Integrated Processor by AMD PCS GNU/kFreeBSD

February 2, 2011

Oh no! We Ran outta IPs yesterday!

Filed under: Networking,Random Thought — Nate @ 9:37 pm

The Register put it better than I could:

World shrugs as IPv4 addresses finally exhausted

Count me among those that shrugged; I commented on this topic a few months ago.

January 31, 2011

Terremark snatched by Verizon

Filed under: General,Virtualization — Tags: , — Nate @ 9:34 pm

Sorry to my three readers out there for not posting recently, I've been pretty busy! And to me there haven't been too many events in the tech world in the past month or so that have gotten me interested enough to write about them.

One recent event that did was Verizon’s acquisition of Terremark, a service I started using about a year ago.

I was talking with a friend of mine recently; he was thinking about either throwing a 1U server in a local co-location or playing around with one of the cloud service providers. Since I am still doing both (been too lazy to completely move out of the co-lo…) I gave him my own thoughts, and it sort of made me think more about the cloud in general.

What do I expect from a cloud?

When I'm talking cloud I'm mainly referring to IaaS, or Infrastructure as a Service. Setting aside cost modelling and such for a moment, I expect the IaaS to more or less just work. I don't want to have to care about:

  • Power supply failure
  • Server failure
  • Disk drive failure
  • Disk controller failure
  • Scheduled maintenance (e.g. host server upgrades either software or hardware, or fixes etc)
  • Network failure
  • UPS failure
  • Generator failure
  • Dare I say it ? A fire in the data center?
  • And I absolutely want to be able to run what ever operating system I want, and manage it the same way I would manage it if it was sitting on a table in my room or office. That means boot from an ISO image and install like I would anything else.

Hosting it yourself

I've been running my own servers for my own personal use since the mid 90s. I like the level of control it gives me and the amount of flexibility I have with running my own stuff; it also gives me a playground on the internet where I can do things. After multiple power outages over the first part of the decade, one of which lasted 28 hours, and the acquisition of my DSL provider for the ~5th time, I decided to go co-lo. I already had a server, and I put it in a local Tier 2 or Tier 3 data center. I could not find a local Tier 4 data center that would lease me 1U of space. So I lacked:

  • Redundant Power
  • Redundant Cooling
  • Redundant Network
  • Redundant Servers (if my server chokes hard I’m looking at days to a week+ of downtime here)

For the most part I guess I had been lucky; the facility had one, maybe two outages since I moved in about three years ago. The bigger issue was that my server was aging and the disks were failing; it was a pain to replace them, and it wasn't going to be cheap to replace the system with something modern and capable of running ESXi in a supported configuration (my estimates put the cost at a minimum of $4k). Add to that the fact that I need such a tiny amount of server resources.

Doing it right

So I had heard of Terremark from my friends over at 3PAR, and you know I like 3PAR, and they use VMware and I like VMware. So I decided to go with them rather than the other providers out there; they had a decent user interface and I got up and going fairly quickly.

So I've been running on it for almost a year with pretty much no issues. I wish they had a bit more flexibility in the way they provision networking stuff, but nothing is perfect (well, unless you have the ability to do it yourself).

From a design perspective, Terremark has done it right, whether it's providing an easy to use interface to provision systems, using advanced technology such as VMware, 3PAR and NetScaler load balancers, or building their data centers to be, well, even fire proof.

Having the ability to do things like vMotion or Storage vMotion is just absolutely critical for a service provider; I can't imagine anyone being able to run a cloud without such functionality, at least with a diverse set of customers. Having things like 3PAR's persistent cache is critical as well, to keep performance up in the event of planned or unplanned downtime in the storage controllers.

I look forward to the day when the level of instrumentation and reporting in the hypervisors allows billing based on actual usage, rather than what is provisioned up front.

Sample capabilities

In case you're a less technical user, I wanted to outline a few of the capabilities the technology Terremark uses offers their customers –

Memory Chip Failure (or any server component failure or change)

Most modern servers have sensors on them and for the most part are able to accurately predict when a memory chip is behaving badly, and to warn the operator of the machine to replace it. But unless you're running on some very high end specialized equipment (which I assume Terremark is not, because it would cost too much for their customers to bear), the operator needs to take the system off line in order to replace the bad hardware. So what do they do? They tell VMware to move all of the customer virtual machines off the affected server onto other servers. This is done without customer impact; the customer never knows it is going on. The operator can then take the machine off line, replace the faulty components, and then reverse the process.

Same applies to if you need to:

  • Perform firmware or BIOS updates/changes
  • Perform Hypervisor updates/patches
  • Maybe you're retiring an older type of server and moving to a more modern system

Disk failure

This one is pretty simple: a disk fails in the storage system and the vendor is dispatched to replace it, usually within four hours. But they may opt to wait a longer period of time for whatever reason; with 3PAR it doesn't really matter, there are no dedicated hot spares so you're really in no danger of losing redundancy. The system rebuilds quickly using a many:many RAID relationship, and is fully redundant once again in a matter of hours (vs days with older systems and whole-disk-based RAID).

Storage controller software upgrade

There are fairly routine software upgrades on modern storage systems; the software feature set seems to just grow and grow. So the ability to perform the upgrade without disrupting users for too long (maybe a few seconds) is really important with a diverse set of customers, because there will probably be no good time when all customers say "ok, I can have some downtime". So having high availability storage with the ability to maintain performance with a controller off line, by mirroring the cache elsewhere, is a very useful feature to have.

Storage system upgrade (add capacity)

Being able to add capacity without disruption and dynamically re-distribute all existing user data across all new as well as current disk resources on-line to maximize performance is a boon for customers as well.

UPS failure (or power strip/PDU failure)

Unlike the small dinky UPS you may have in your house or office, UPSs in data centers are typically powering up to several hundred machines, so if one fails you may be in for some trouble. But with redundant power you have little to worry about; the other power feed takes over without interruption.

If a server power supply blows up it has the ability to take out the entire branch or even the whole circuit that it's connected to. But once again redundant power saves the day.

Uh-oh I screwed up the network configuration!

Well now you've done it, you hosed the network (or maybe for some reason your system just dropped off the network, a flakey network driver or something) and you can't connect to your system via SSH or RDP or whatever you were using. Fear not: establish a VPN to the Terremark servers and you can get console access to your system. If only the console worked from Firefox on Linux.. can't have everything I guess. Maybe they will introduce support for vSphere 4.1's virtual serial concentrators soon.

It just works

There are some applications out there that don't need the level of reliability that the infrastructure Terremark uses can provide, and they prefer to distribute things over many machines or many data centers or something; that's fine too. But most apps, almost all apps in fact, make the same common assumption, perhaps you can call it the lazy assumption: they assume that it will just work. Which shouldn't surprise many, because achieving that level of reliability at the application layer alone is an incredibly complex task to pull off. So instead you have multiple layers of reliability under the application, each handling a subset of availability, layers that have been evolving for years or even decades in some cases.

Terremark just works. I'm sure there are other cloud service providers out there that work too; I haven't used them all by any stretch (nor am I seeking them out, for that matter).

Public clouds make sense, as I've talked about in the past, for a subset of functionality; they have a very long way to go in order to replace what you can build yourself in a private cloud (assuming anyone ever gets there). For my own use case, this solution works.

December 12, 2010

Dell and Exanet: MIA

Filed under: Storage — Tags: , — Nate @ 9:37 pm

The thoughts around Dell buying Compellent made me think back to Dell's acquisition of the IP and some engineering employees of Exanet, as The Register put it, a crashed NAS company.

I was a customer and user of Exanet gear for more than a year, and at least in my experience it was a solid product: very easy to use, decent performance and scalable. The back end architecture to some extent mirrored 3PAR's hardware-based architecture, but in software; really a good design in my opinion.

Basic Exanet Architecture

Their standard server at the time they went under was an IBM x3650, a dual proc quad core Intel Xeon 5500-based platform with 24GB of memory.

Each server ran multiple software processes called fsds, or file system daemons, one fsd per core. Each fsd was responsible for a portion of the file system (x number of files); they load balanced it quite well, I never had to manually re-balance or anything. Each fsd was allocated its own memory space used for itself as well as cache; if I recall right the default was around 1.6GB per fsd.

Each NAS head unit had back end connectivity to all of the other NAS units in the cluster (minimum 2; the maximum tested at the time they went under was 16). A request for a file could come in on any node, on any link; if the file wasn't home to that node it would transparently forward the request to the right node/fsd on the back end to service it. Much like how 3PAR's backplane forwards requests between controllers.

The standard back end network was 10Gbps on their last models.

As far as data protection goes, the use of "commodity" servers did have one downside: they had to use UPS systems as their battery backup to ensure enough time for the nodes to shut down cleanly in the event of a power failure. This could present problems at some data centers, as operating a UPS in your own rack can be complicated from a co-location point of view (think EPO etc). Another design Exanet shared with 3PAR is the use of internal disks to flush cache to, which is something I suppose Exanet was forced into doing; other storage manufacturers use battery backed cache in order to survive power outages of some duration. But both Exanet and 3PAR dump their cache to an internal disk, so the power outage can last for a day, a week, or even a month and it won't matter; data integrity is not compromised.

32-bit platform

The only thing that held it back was they didn't have enough time or resources to make the system fully 64-bit before they went under; that would have unlocked a whole lot of additional performance they could have gotten. Being locked into a 32-bit OS really limited what they could do on a single node, and as processors became ever more powerful they really had to make the jump to 64-bit.

Exanet was entirely based on "commodity" hardware; not only were they using x86 CPUs but their NAS controllers were IBM 2U rackmount servers running CentOS 4.4 or 4.5, if I recall right.

To me, as previous posts have implied, if you're going to base your stuff on x86 CPUs, go all out, it's cheap anyways. I would have loved to have seen a 32-48 core Exanet NAS controller with 512GB-1TB of memory on it.

Back to Dell

Dell originally went into talks with Exanet a while back because Exanet was willing to certify Equallogic storage as a back end provider of disk to an Exanet cluster, using iSCSI between the Exanet cluster and the Equallogic storage, since nobody else in the industry seemed willing to have their NAS solution talk to a back end iSCSI system. As far as I know the basic qualifications for this solution were completed in 2009, quite a ways before they ran out of cash.

Why did Exanet go under? I believe primarily because the market they were playing in was too small, with too few players in it and not enough deals to go around, so whoever had the most resources to outlast the rest would come out on top. In this case I believe it was Isilon, even though they too were taken out by EMC; from the looks of their growth it didn't seem like they were in a position to continue operating independently. With Ibrix and Polyserve going to HP, and Onstor going to LSI, I'm still convinced BlueArc will go to HDS at some point (they are once again filing for an IPO, but word on the street is they aren't in very good shape); I suspect it happens after they fail to IPO and go under. They have a very nice NAS platform, but HDS has their hands tied in supporting 3rd party storage other than HDS product, and BlueArc OEMs LSI storage like so many others.

About a year ago SGI OEM'd one of BlueArc's products, though recently I have looked around the SGI site and see no mention of it. Either they have abandoned it (more likely) or they are just really quiet. Since I know SGI is also a big LSI shop, I wonder if they are making the switch to Onstor. One industry insider I know suspects LSI is working on integrating the Onstor technology directly into their storage systems rather than having an independent head unit, which makes sense if they can make it work.

But really my question is: why hasn't Dell announced anything related to the Exanet technology? They could have, quite possibly within a week or two, had a system running and certified on Dell PowerEdge equipment and been selling to both existing Exanet customers as well as new ones. The technology worked fine, it was really easy to set up and use, and it's not as if Dell has another solution in house that competes with it. AND since it was an entirely software based solution there were really no costs involved in manufacturing. Exanet had more than one PB-sized deal in the works at the time they went under; that's a lot of good will Dell just threw away. But hey, what do you expect, it's Dell. Thankfully they didn't get their dirty paws on 3PAR.

When I looked at how a NetApp system was managed compared to the Exanet, my only response was "You're kidding, right?"

Time will tell if anything ever comes of the technology.

I really wanted 3PAR to buy them of course; they were very close partners with 3PAR and both pitched each other's products at every opportunity. Exanet would go out of their way to push 3PAR storage whenever possible because they knew how much trouble the LSI storage could be, and they were happy to get double the performance per spindle off 3PAR vs LSI. But I never did get an adequate answer out of 3PAR as to why they did not pursue Exanet; they were in the early running but pulled out for whatever reason, and the price tag of less than $15M was a steal.

Now that 3PAR is with HP, we'll see what they can do with Ibrix. I knew of more than one customer that migrated off of things like Ibrix and Onstor to Exanet, and HP has been pretty silent about Ibrix since they bought them, as far as I know. I have no idea how much R&D they have pumped into it over the years or what their plans might be.

Dell going after Compellent

Filed under: Storage — Tags: , — Nate @ 12:26 am

I know this first made news a couple of days ago, but I can't tell you how busy I've been recently. It seems that after Dell got reamed by HP in the 3PAR bidding war they are going after Compellent, one of the only other storage technology companies utilizing distributed RAID, and as far as I know the main pioneer of automagic storage tiering.

This time around nobody else is expected to bid; it seems the stock speculators were a bit disappointed when the talks were announced, as they had already bid the stock up far higher than what is being discussed as the acquisition price.

While their previous generation of controllers seemed rather weak, their latest and greatest looks to be a pretty sizable step up, and apparently it can be leveraged by their existing customers; no need to buy a new storage system.

I can’t wait to see how EMC responds myself. Dell must be really frustrated with them to go after Compellent so soon after losing 3PAR.

OpenBSD installer: party like it’s 2000

Filed under: linux,Random Thought,Security — Tags: , , — Nate @ 12:07 am

[Random Thought] The original title was going to be "OpenBSD: only trivial changes in the installer in one heck of a long time", a take-off of the blurb on their site about remote exploits in the default install.

I like OpenBSD, well, I like it as a firewall; I love pf. I've used ipchains, iptables, ipfwadm, ipf (which I think pf was originally based off of; pf was spawned due to a licensing dispute with the ipf author(s)), ipfw, Cisco PIX and probably one or two more firewall interfaces, and pf is far and away the best that I've come across. I absolutely detest Linux's firewall interfaces by contrast, going all the way back almost 15 years now.
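To illustrate what I mean about readability, here's the flavor of a minimal pf.conf (an illustrative sketch, not my actual ruleset; the interface name and addresses are made up):

ext_if = "vr0"                         # macro for the external interface
table <mgmt> { 192.168.1.0/24 }        # hosts allowed to manage the box
set skip on lo
block in log all                       # default deny inbound
pass out on $ext_if keep state         # let the firewall itself talk out
pass in on $ext_if proto tcp from <mgmt> to ($ext_if) port 22 keep state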

I do hate the OpenBSD userland tools though, probably as much as the *BSD folks hate the Linux userland tools. I mean, how hard is it to include an init script of sorts to start and stop a service? But I do love pf, so in situations where I need a firewall I tend to opt for OpenBSD wherever possible (when it's not possible I don't resort to Linux; I'd rather resort to a commercial solution, perhaps a Juniper Netscreen or something).

But this isn't about pf, or userland. This is about the OpenBSD installer. I swear it's had only the most trivial changes and improvements done to it in at least the past 10 years, since I first decided to try it out. To me it is sad; the worst part about it is of course the disk partitioning interface. It's just horrible.

I picked up my 2nd Soekris net5501 system and installed OpenBSD 4.8 on it this afternoon, and was kind of saddened, yet not surprised, that it still hasn't changed. My other Soekris is running OpenBSD 4.4 and has been running for a couple of years now. I first used pf back in about 2004 or so, so I have been running it quite a while; nothing too complicated, it's really simple to understand and manage. My first experience with OpenBSD was, I believe, back in 2000; I'm not sure but I want to say it was something like v2.8. I didn't get very far with it, for some reason it would kernel panic on our hardware after about a day or so of very light activity, so I went back to Linux.

I know pf has been ported to FreeBSD, and there is soon to be a fully supported Debian kFreeBSD distribution with the next major release of Debian, whenever that is, so perhaps that will be worthwhile switching to for my pf needs, I don't know. Debian is another system which has been criticized over the years for having a rough installer, though I've got to say in the past 4-5 years it really has gotten to be a good installer in my opinion. As a Debian user for more than 12 years now, it hasn't given me a reason to switch away from it, but I still do prefer Red Hat based distros for "work" stuff.

First impressions are important, and the installer is that first impression. While I am not holding out hope that they will improve their installer, it would be nice.

December 9, 2010

Java fallout from Oracle acquisition intensifies

Filed under: News,Random Thought — Tags: , — Nate @ 1:51 pm

I was worried about this myself; almost a year ago to the day I raised my concerns about Oracle getting control of Java, and the fallout continues. Oracle already had BEA's JRockit; it's too bad they had to get Sun's JVM too.

Apache seems to have withdrawn from most things related to Java today according to our friends at The Register.

On Thursday, the ASF submitted its resignation from JCP’s Java Standard and Enterprise Edition (SE/EE) Executive Committee as a direct consequence of the Java Community Process (JCP) vote to approve Oracle’s roadmap for Java 7 and 8.

The ASF said it’s removing all official representatives from all JSRs and will refuse to renew its JCP membership and EC position.

Java was too important a technology to be put in the hands of Oracle.

Too bad..

November 24, 2010

More inefficient storage

Filed under: Storage — Tags: , , — Nate @ 8:32 am

Another random thought: I got woken up this morning and wound up checking what's new on SPC-1, and a couple weeks ago the Chinese company Huawei posted results for their Oceanspace 8100 8-node storage system. This system seems to be similar to the likes of the HDS USP/VSP and IBM SVC, in that it has the ability to virtualize other storage systems behind it. The system is powered by 32 quad core processors, or 128 CPU cores.

The thing that caught my eye is this paragraph, which appears in every SPC-1 disclosure:

Unused Storage Ratio: Total Unused Capacity (XXX GB) divided by Physical Storage Capacity (XXX GB) and may not exceed 45%.

So what is Huawei’s Unused storage ratio? – 44.77%

I wonder how hard it was for them to get under the 45% limit; I bet they were probably at 55-60% and had to yank a bunch of drives out or something to decrease their ratio.

From their full disclosure document it appears their tested system has roughly 261TB of unused storage on it. That's pretty bad; a 3PAR F400 has a mere 75GB of unused capacity (0.14%) by contrast, and the bigger T800 has roughly 21TB of unused capacity (15%).
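Back of the envelope from those two numbers: 261TB of unused capacity at a 44.77% ratio works out to somewhere around 580TB of physical capacity in the tested configuration (261 / 0.4477), so nearly half of what they racked up is doing nothing for the benchmark.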

One would think that Huawei would have been better off using 146GB disks instead of the 300GB, 450GB and 600GB disks. Another question is what the point is of mismatched disks for this test; maybe they didn't have enough of one drive type (which would be odd for a drive array manufacturer), or maybe they mixed drive types to drive down the unused capacity, perhaps after having started with nothing but 600GB disks.

Speaking of drive sizes, one company I know well has a lot of big Oracle databases and is I/O bound more than space bound, so it benefits them to use smaller disk drives; their current array manufacturer no longer offers 146GB disk drives, so they are forced to pay quite a bit more for the bigger disks.

Lots of IOPS to be sure, 300,000 of them (260 IOPS per drive), and 320GB of cache (see note below!), but it certainly seems you could do this a better way..

Looking deeper into the full disclosure document (Appendix C, page 64) for the Huawei system reveals this little gem:

The creatlun command creates a LUN with a capacity 1,716,606 MiB. The -p 0 parameter, in the creatlun command sets the read cache policy as no prefetch and the -m 0 parameter sets the write cache policy as write cache with no mirroring.

So they seem to be effectively disabling the read cache and disabling cache mirroring, making all cache a write back cache that is not protected? I would imagine they ran the test, found their read cache ineffective, so disabled it, devoted it to write cache and re-ran the test.

Submitting results without mirrored cache seems, well, misleading to say the least. Glad there is full disclosure!

The approximate cost of the Huawei system seems to be about $2.2 million, going by the Google exchange rate.

While I am here, what is it with 8 node storage systems? What is magical about that number? I've seen a bunch of different ones, both SAN and NAS, that top out at eight. Not 10? Not 6? Seems a strange coincidence, and it has always bugged me for some reason.

November 16, 2010

HP serious about blade networking

Filed under: Networking — Nate @ 10:32 am

I was doing my rounds and noticed that HP launched a new blade for the Xeon 6500/7500 processors (I don't yet see news of this breaking on The Reg, so I beat them for once!), the BL620c G7. They have another blade, the BL680c G7, which is a double-wide solution that to me looks like nothing more than a pair of 620c G7s stacked together and using the backplane to link the systems together; IBM does something similar on their BladeCenter to connect a memory expansion blade onto their HX5 blade.

But what really caught my eye more than anything else is how much networking HP is including on their latest blades, whether it is the BL685c G7, or these two newer systems.

  • BL685c G7 & BL620c G7 both include 4 x 10GbE Flexfabric ports on board (no need to use expansion ports) – that is up to 16 FlexNICs per server – with three expansion slots you can get a max of 10x10GbE ports per server (or 40 FlexNICs per server)
  • BL680c G7 has 6 x 10GbE Flexfabric ports on board providing up to 24 FlexNICs per server – with seven expansion slots you can get a max of 20x10GbE ports per server (or 80 FlexNICs per server)

Side note: FlexFabric is HP's term for their CNAs (converged network adapters).

Looking at the stock networking from Cisco, Dell, and IBM

  • Cisco – their site is complex as usual but from what I can make out their B230M1 blade has 2x10Gbps CNAs
  • Dell and IBM are stuck in 1GbE land, with IBM providing 2x1GbE on their HX5 and Dell providing 4x1GbE on their M910

What is even nicer about the extra NICs on the HP side, at least on the BL685c G7 (and I presume the BL620c G7), is that because they are full height, the connections from the extra 2x10GbE ports on the blade feed into the same slots on the backplane. That means with a single pair of 10GbE modules on the chassis you can get the full 4x10GbE per server (8 full height blades per chassis). Normally if you put extra NICs on the expansion ports, those ports are wired to different slots in the back, needing additional networking components in those slots.

You might be asking yourself: what if you don't have 10GbE and you only have 1GbE networking? Well first off, upgrade; 10GbE is dirt cheap now and there is absolutely no excuse for getting these new higher end blade systems and trying to run them off 1GbE. You're only hurting yourself by attempting it. But in the worst case, where you really don't know what you're doing and you happen to get these HP blades with 10GbE on them and want to connect them to 1GbE switches, well, you can; they are backwards compatible with 1GbE switches, either with the various 1GbE modules, or with the 10GbE pass through module supporting both SFP and SFP+ optics.

So there you have it, 4x10GbE ports per blade standard. If it were me I would take one port from each network ASIC and assign FlexNICs for VM traffic, and take the other port from each ASIC and enable jumbo frames for things like vMotion, fault tolerance, iSCSI, NFS and similar traffic. I'm sure the cost of adding the extra dual port card is trivial when integrated onto the board, and HP is smart enough to recognize that!
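Roughly what that split looks like from the ESX command line, as a sketch only; the vmnic numbering below is made up, since the actual enumeration depends on how the FlexNICs present themselves:

esxcfg-vswitch -a vSwitch1                # second vSwitch for the jumbo-frame traffic
esxcfg-vswitch -L vmnic2 vSwitch1         # one 10GbE port from each ASIC
esxcfg-vswitch -L vmnic6 vSwitch1
esxcfg-vswitch -m 9000 vSwitch1           # jumbo frames for vMotion/FT/iSCSI/NFS
esxcfg-vswitch -A "vMotion" vSwitch1      # portgroup for the vmkernel interface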

Having more FlexNICs on board means you can use those expansion slots for other things, such as Fusion-io accelerators, or maybe InfiniBand or native Fibre Channel connectivity. Having more FlexNICs on board also allows for greater flexibility in network configuration of course; take for example the Citrix NetScaler VPX, which, last I checked, required essentially dedicated network ports in vSphere in order to work.

Myself I’m still not sold on the CNA concept at this point. I’m perfectly happy to run a couple FC switches per chassis, and a few extra cables to run to the storage system.

November 15, 2010

Isilon gets taken out by EMC

Filed under: News,Storage — Tags: — Nate @ 9:02 am

Looks like EMC did it after all, buying Isilon for $2.25 Billion. Probably the biggest tech deal for the Seattle area for quite some time.

I haven't paid too much attention to Isilon recently, but it does seem like they have a nice product for the scale out media space: lots of big files and high throughput. Isilon, along with Panasas, seems to be unique in tightly integrating the storage controller with the NAS side, while other solutions are purely gateway approaches of one sort or another.

So who’s next?

 

