TechOpsGuys.com Diggin' technology every day

August 23, 2013

The world is ending: overreactions to outages

Filed under: Random Thought — Tags: — Nate @ 9:47 am

Paraphrasing from CNBC yesterday:

OMG!! CARL ICAHN IS TWEETERING ABOUT APPLE — NASDAQ IS DOWN — PEOPLE CAN’T TRADE ON THIS NEWS!!

Let me preface this a bit further and say that in the line of work I am in, I have been on the receiving end of so many outages of various types… some of them really nasty, lasting hours or even multiple days, some involving big data losses, and many that had me up for 20-30+ hours straight. Some of the most fun times I’ve had have been during big outages. Finally, some excitement!

My favorite outage that I can recall was one at AT&T about nine years ago. They were doing a massive migration to a new platform to support number portability, among other things. So they asked us to hold transactions in our queues while they were down for ~6-8 hours (the company I was at handled most of the mobile e-commerce for them at the time). So we did. 8 hours passed… 10… 12… 16… still down. No ETA. It wasn’t a huge deal for us for the first day; it became somewhat troublesome by the 3rd day as these queues were in memory and we had hard limits on memory (32-bit). But the folks in the AT&T stores were really hurting as they could not provision any new phones; all new orders had to be done on paper, then input into the computer system later. I forget how long the outage was in total, I think around 4 days. I looked at the whole situation and couldn’t help but laugh. Lots of laughter. 8 hours to 4 days… thousands of orders being placed on paper, by one of the largest telcos in the world, if not the largest.

So I think I have a better perspective on this sort of thing than those less technical folk who freak out about stuff like the NASDAQ outage yesterday.

Taking NASDAQ specifically, it was pretty absurd to see the whole situation unfold yesterday (I worked from home so saw the full thing end to end on CNBC). People came on air saying how frustrated they were that NASDAQ wasn’t giving any information as to what was going on, and there was speculation about the complications of being a public company and an exchange at the same time (disclosure requirements etc). Then a bigwig comes on, Harvey Pitt, a former SEC chairman, and just seems to ream NASDAQ, saying how it’s totally unacceptable that they are down and that there should be heavy fines and zero tolerance.

Come on folks – get a grip. It’s a stock exchange. It’s not a 911 system. People aren’t going to die. If your system is so fragile that it can’t survive a few hours of downtime on an exchange and can’t tolerate a little volatility then it’s your system that needs to be fixed.

You don’t have control over the exchange, or the internet peering points between you and them (or your broker and them); there are so many points of failure that you should have a more robust system. The exchanges, I have no doubt, are incredibly complex, convoluted and obscure things that are constantly under attack by people trying to get trades through as quickly as possible, like those folks that manipulate the market.

Even the experts seem to be moving too fast. Just barely a year ago Knight Capital lost $400 million in a matter of minutes due to a software bug; they were later forced to sell the company. More recently the almighty Goldman Sachs did something similar, and last I saw they were hoping to lose only $100 million as a result of that error.

Slow down, take a break. Things are moving too fast. I see people on CNBC constantly argue that the markets are really important because so many people have 401ks, IRAs etc. But reality doesn’t agree with them. I don’t recall the specific stat but I’ve heard it tossed around a few times, something along the lines of 85% of stocks being owned by 5% of the population.

Another stat I’ve heard tossed around is that ~80% of the transactions these exchanges get today are from high frequency traders. So if HFT somehow goes away then these exchanges are in trouble revenue-wise.

Those two stats alone tell me a lot about the state of the markets. I’m no financial expert obviously, but I have watched CNBC a lot for many years now (going back to at least 2007) on a daily basis (RIP Mark Haines). I am often fascinated by the commentary, and by the general absurdities of the market structure (I find it’s more comedy than anything else). There’s been very little investing going on for a very long time. Really the stock markets in general are outright gambling. Stocks rarely move on fundamentals anymore (not sure when they last did); it’s all buzz and emotions.

It’s no wonder so many startups aren’t interested in an IPO, and some other big established brands want desperately to go private: to get away from activist investors, and from the overall pressure to run your company in a pro-market fashion rather than doing what’s best for the long term health of the company itself (and thus long term shareholders).

If you’re that dependent on market liquidity (e.g. the schemes folks like Lehman Brothers were doing, rolling over their financials every night) then you’re doing something very wrong, and you deserve to get burned by it.

These exchanges are closed for upwards of 16 hours a day (and closed weekends and holidays!). There is some limited after-hours trading, and some stocks trade on other exchanges as well, but the relative liquidity there is small.

This goes beyond NASDAQ of course and applies to outages in general, whether it was the recent Google, Amazon, or Microsoft outages or others (I suffered through two yesterday myself that were the result of a 3rd party, one of which started literally minutes apart from when NASDAQ went down).

So chill out. Fix the problem, don’t rush or you might make a mistake and make things worse. Get it right, try not to let that particular scenario happen in the future.

It’s just a website, it’s just a stock exchange. It’s not a nuclear reactor that is on the verge of meltdown.

Breathe. The world isn’t going to end because your site/business happens to be offline for a few hours.

August 21, 2013

More IPv6 funnies…

Filed under: Networking,Random Thought — Tags: — Nate @ 5:56 pm

Random, off topic, boring post, but I felt compelled to write it after reading a fairly absurd comment on Slashdot from another hard-core IPv6 fan.

Internet hippies at it again!

I put the original comments in italics, and the non-italic text is the IPv6 person responding. I mean honestly I can’t help but laugh.

I was a part of the internet when it started and was the wild wild west.  Everyone had nearly unlimited ip addresses and NOBODY used them for several reasons. First nobody put everything on the internet.

That was then. Now is now. The billion people on Facebook, Twitter, Flickr don’t put anything online? Sure, it’s all crap, but it sure is not nothing.

It’s just Dumb to put workstations on the internet… Sally in accounting does not need a public IP and all it does is make her computer easier to target and attack. Hiding behind that router on a separate private network is far more secure. Plus it is easier to defend a single point of entry than it is to defend a 255.255.0.0 address space from the world.

Bullsh*t. If in IPv4 your internal network would be 192.168.10.0/24, you can define an IPv6 range for that as well, e.g. 2001:db8:1234:10::/72. And then you put in your firewall:

2001:db8:1234:10::/72 Inbound: DENY ALL

Done. Hard? No. Harder than IPv4? No. Easier? Yes. Sally needs direct connection to Tom in the other branch (for file transfer, video conference, etc):

2001:db8:1234:10::5411/128 Inbound: ALLOW ALL FROM 2001:db8:1234:11::703/128

Good luck telling your IPv4 CGN ISP you need a port forwarded.

Second I have yet to have someone give me a real need for having everything on the internet with a direct address. you have zero need to have your toaster accessible from the internet.

Oh yeah? Sally might need that 30 GB Powerpoint presentation of her coworker in the other branch. Or that 100 MB customer database. Well, you know, this [xkcd.com]. How much easier would that be with a very simple app that even you could hack together that sends a file from one IP address to the other. Simple and fast, with IPv6. Try it with IPv4.

It’s amazing to me how folks like this think that everything should just be directly connected to the internet. Apparently this IPv6 person hasn’t heard of a file server before, or a site to site VPN. Even with direct accessibility I would want to enforce VPN between the sites, if nothing else so I don’t have to worry about communications going unencrypted (or, in some cases, un-WAN-optimized). Same goes for remote workers: if you’re at a remote location and want to talk to a computer on the corporate LAN or data center, get on VPN. I don’t care if you have a direct route to it or not (in fact I would ensure you did not, so you have no choice).

The problems this person cites have been solved for over a decade.

I’m sorry, but anyone who argues that 2001:db8:1234:10::5411/128 is simpler than 192.168.10.0/24 is just… not all there.
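
For what it’s worth, here is a quick sketch with Python’s standard ipaddress module, using the same documentation prefixes from the quote above (nothing real), just to show what each scheme asks you to keep in your head:

    import ipaddress

    # The two prefixes from the quoted example above (documentation ranges, not real networks)
    v4 = ipaddress.ip_network("192.168.10.0/24")
    v6 = ipaddress.ip_network("2001:db8:1234:10::/72")

    print(v4, "holds", v4.num_addresses, "addresses")   # 256
    print(v6, "holds", v6.num_addresses, "addresses")   # 72057594037927936 (2^56)

    # The single-host address from the quote, and a check that it falls inside the branch prefix
    sally = ipaddress.ip_address("2001:db8:1234:10::5411")
    print(sally in v6)                                   # True

Both work fine of course, it’s just a lot more hex to type and eyeball.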

The solutions perhaps may not be as clean as something more native, though moving 30GB of data over anyone’s internet connection at the office without arranging something with IT first (do it off hours, throttle it, something) would be a very bad idea anyway.

The point is the solutions exist, and they work. The fact remains that if you go native IPv6 you’re going to have MUCH MORE PAIN than with any of the hacks that you may have to do with IPv4 today. IPv6 fans fail to acknowledge that up front. They attack IPv4/NAT/etc and just want the world to turn the switch off of IPv4 and flip everyone over.

I have said for years I don’t look forward to IPv6 myself (mainly because of the numbering scheme, it sucks hard). If the time comes when I need IPv6 for myself or the organization I work for, there are other means to get it (e.g. NAT, at the load balancer level in my case) that will work for years to come (until perhaps there is some sort of mission critical mass of outbound IPv6 connectivity that I need; I don’t see that in the next 5-8 years, and beyond that who knows, maybe I won’t be doing networking anymore so won’t care).
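
To be clear about what I mean by handling it at the edge: terminate IPv6 at the load balancer and keep talking IPv4 to everything behind it. A real load balancer does this for you; the toy sketch below (hypothetical addresses and ports, not anything I actually run) just shows the idea of accepting v6 connections and proxying them to a v4-only backend:

    import socket
    import threading

    BACKEND = ("10.0.0.80", 80)   # hypothetical IPv4-only web server behind the edge

    def pump(src, dst):
        # Copy bytes in one direction until that side closes
        try:
            while True:
                data = src.recv(65536)
                if not data:
                    break
                dst.sendall(data)
        except OSError:
            pass
        finally:
            for s in (src, dst):
                try:
                    s.close()
                except OSError:
                    pass

    def main():
        # Listen on the IPv6 side only; everything behind this stays IPv4
        listener = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
        listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        listener.bind(("::", 8080))
        listener.listen(128)
        while True:
            client, addr = listener.accept()
            backend = socket.create_connection(BACKEND)
            threading.Thread(target=pump, args=(client, backend), daemon=True).start()
            threading.Thread(target=pump, args=(backend, client), daemon=True).start()

    if __name__ == "__main__":
        main()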

I’m sure people like me are the kind of folks IPv6 people hate. I don’t blame ’em I suppose.

There is nothing – absolutely nothing that bugs me about IPv4 today. Not a damn thing hinders me or the organizations I have worked for. At one point SSL virtual hosting was an issue, but even that is solved with SNI (which I just started using fairly recently actually).

The only possibility of having an issue I think is perhaps if my organization merged with another and there was some overlapping IP space. Haven’t personally encountered that problem though in a very long time (9 years – and even then we just setup a bunch of 1:1 NATs I think – I wasn’t the network engineer at the time so wasn’t my problem).

I remember one company I worked for 13 years ago – they registered their own /24 network back in the early 90s, because the people at the time believed they had to in order to run an internal network. The IP space never got used (to my knowledge) and it was just lingering around – the contact info was out of date and we didn’t have any access to it (not that we needed it, was more a funny story to tell).

When I set this server up at Hurricane Electric, one of the things they asked me was if I wanted IPv6 connectivity, since they do it natively I believe (they are one of the biggest IPv6 providers out there, globally I think?). I thought about it for a few seconds and declined, don’t need it.

IPv6 fans need to come up with better justification for the world to switch other than “the internet is peer to peer and everyone needs a unique address” (because that reason doesn’t cut it for folks like me, and given the world’s glacial pace of migration I think my view is the norm rather than the exception). I’ve never really cared about peer to peer anything. The internet in general has been client-server and will likely remain so for some time (especially given the average gap between download and upload bandwidth on your typical broadband connection).

Given that I have a server with ~3.6TB of usable space on a 100Mbps unlimited bandwidth connection less than 25 milliseconds from my home, I’d trade download bandwidth for upload bandwidth in a HEARTBEAT. I’d love to be able to get something like 25/25Mbps; unfortunately the best upload I can get is 5Mbps (while I can get 150Mbps down), and my current plan is more like 2Mbps up and 16Mbps down.

Speedtest.net results for this server. I had to try several different test servers before I found one that was fast enough to handle me.

ANYWAY…….. I had a good laugh at least.

Back to your regularly scheduled programming..

August 17, 2013

Happy Birthday Debian: 20 years old

Filed under: linux — Tags: — Nate @ 4:10 pm
Techopsguys is Debian Powered

The big 2-0. Debian was the 2nd Linux I cut my teeth on, the first being Slackware 3.x. I switched to Debian 2.0 (hamm) in 1998 when it first came out. This was before apt existed (I think that arrived with Debian 2.2, but I’m not sure). I still remember the torture that was dselect, and much to my own horror dselect apparently still lives. Though I had to apt-get install it. It was torture because I literally spent 4-6 hours going through the packages selecting them one at a time. There may have been an easier way to do it back then, I’m not sure; I was still new to the system.

I have been with Debian ever since, hard to believe it’s been about 15 years since I first installed it. I have, with only one exception, stuck to stable the entire time. The exception I think was in between 2.2 and 3.0; that gap was quite large so I spent some time on the testing distribution. Unlike my early days running Linux I no longer care about the bleeding edge. Perhaps because the bleeding edge isn’t as important as it once was (to get basic functionality out of the system, for example).

Debian has never failed me during a software update, or even a major software upgrade. Some of the upgrades were painful (not Debian’s fault; for example going from Cyrus IMAP 1.x to 2.x was really painful). I do not have any systems that have lasted long enough to traverse more than one or two major system upgrades, hardware always gets retired. But unlike some other distributions, major upgrades were fully supported and worked quite well.

I intentionally avoided Red Hat in my early days specifically because it was deemed easier to use. I started with Slackware, and then Debian. I spent hours compiling things, whether it was X11, KDE 0.x, QT, GTK, Gnome, GIMP… I built my own kernels from source, even with some custom patches (haven’t seriously done this since Linux 2.2). I learned a lot, I guess you could say the hard way. Which is in part why I struggle to advise people who want to learn Linux on the best way to do it (books, training etc). I don’t know, since I did it another way, a way that takes many years. Most people don’t have that kind of patience. At the time of course I really didn’t realize those skills would become so valuable later in life; it was more of a personal challenge for myself I suppose.

I have used a few variants/forks of Debian over the years, most recently of course being Ubuntu. I have used Ubuntu exclusively on my laptops going back several years (perhaps even to 2006, I don’t remember). I have supported Ubuntu in server environments for roughly the past three years. I mainly chose Ubuntu for the laptops and desktops for the obvious reason: hardware compatibility. Debian (stable) of course tends to lag behind on hardware support. Though these days I’m still happy running the Ubuntu 10.04 LTS desktop… which is EOL now. Haven’t decided what my next move is, and I’m not really thinking about it since what I have works fine still. I’ll probably think about it more whenever I get my next hardware refresh.

I also briefly used Corel Linux, of which I still have the inflatable Corel penguin sitting on my desk at work; it has followed me to every job for the past 13 years and still keeps its air. I don’t know why I have kept it for so long. Corel Linux was interesting in that they ported some of their own Windows apps over to Linux with Wine, their office suite and some graphics programs. They made a custom KDE file manager if I recall right (with built in CIFS/SMB support). Other than that it wasn’t much to write home about. Like most things on Linux the desktop apps were very fragile, and being closed source they did not last long after Corel Linux folded (compatibility wise you could not run them on other systems). My early Debian systems that I used as desktops at least got butchered by me installing custom stuff on top of them. Linux works best when you stick with the OS packages, and that’s something I did not do in the early days. These days I go to semi extreme lengths to make sure everything (within my abilities) is packaged in a Debian package before installation.

I used to participate a lot in the debian-user mailing list eons ago, though I haven’t since due to lack of time. At the time at least that list had massive volume; it was just insane the amount of email I got from it. Looking now, August 2013 had roughly 1,300 messages, vs August 2001 with almost 6,000! Even more insane was the spam I got long after I unsubscribed. It persisted for years until I terminated the email address associated with that list. I credit one job offer, a bit over ten years ago now, to my participation on that (and other) mailing lists at the time, as I specifically called them out in my references.

That being said, despite my devotion to Debian on my home systems (servers at least, this blog runs on Debian 7), I still prefer Red Hat for commercial/larger scale stuff. Even though the past three years supporting Ubuntu have been OK, I still like RH more. At the same time I do not like RH for my own personal use. It basically comes down to how the system is managed. I was going to go into reasons why I like RH more for this or that, but decided not to since it is off topic for this post.

I’ve never seen Toy Story, the movie whose characters Debian has used to name its releases after since at least 2.0, perhaps longer. Not really my kind of flick, I have no intention of ever seeing it really.

Here’s a really old screen shot from my system back in the day. I don’t remember if this is Slackware or Debian; the kernel being compiled (2.1.121) came out in September 1998, so right about the time I made the switch. Looks like I am compiling Gimp 1.01 and some version of XFree86, and downloading a KDE snapshot (I think all of that was pre-1.0 KDE). And look, xfishtank in the background! I miss that. These days Gnome and KDE take over the root window, making things like xfishtank not visible when using them (last I tried at least). xpenguins is another cool one that does still work with GNOME.

REALLY Old Screenshot

So, happy 20th birthday Debian, it has been interesting to watch you grow up, and it’s nice to see you’re still going strong.

Yahoo! yanks! website! of! person! who! documented! his! suicide!

Filed under: Random Thought — Tags: — Nate @ 3:39 pm

The only thing technical related to this is the fact that Yahoo! yanked the guy’s site. I suppose I can understand why, but I am very glad that the site(at least for the moment) lives on at a mirror.

I just came across this on Slashdot. The discussion wasn’t all that interesting but I have been reading a mirror of the website, and have to say it is quite an amazing write up.

I felt this guy took the time to think and write about his thoughts and did the world a favor in showing his state of mind. So the least I could do is read it – and perhaps comment on it a bit (with whatever respect I can give).

From what I can tell he committed suicide because he felt his mind was going, he was no longer (as) productive to society as he wanted to be, and he had a very negative outlook on near term civilization as we know it.

He took his life two days ago, on his 60th birthday in a police parking lot with a self inflicted gunshot wound to the head.

I have no idea who Martin Manley was but it was very interesting to see his line of thinking.

Some good quotes

I began seeing the problems that come with aging some time ago. I was sick of leaving the garage door open overnight. I was sick of forgetting to zip up when I put on my pants. I was sick of forgetting the names of my best friends. I was sick of going downstairs and having no idea why. I was sick of watching a movie, going to my account on IMDB to type up a review and realizing I’ve already seen it and, worse, already written a review! I was sick of having to dig through the trash to find an envelope that was sent to me so I could remember my own address – especially since I lived in the same place for the last nine years!

[..]

I didn’t want to die alone. I didn’t want to die of old age. I didn’t want to die after years of unproductivity. I didn’t want to die having my chin and my butt wiped by someone who might forget which cloth they used for which. I didn’t want to die of a stroke or cancer or heart attack or Alzheimer’s. I decided I was gettin’ out while the gettin’ was good and while I could still produce this website!

He does mention a life insurance policy that expires next year, and he wouldn’t have been able to afford to renew it. So that money can go to the folks he cares about. Though I thought most, if not all, such policies excluded suicide. I am not sure though, never looked into it. He seems intelligent enough that he would have known the details of the policy he had.

I felt pretty good about being prepared for economic collapse – the primary reason being all the gold and silver I owned. But, then one day I realized that all the gold and silver and guns and ammo and dried food and toilet paper in the world wouldn’t prevent me from seeing the calamity with my own eyes – either ignoring other’s plight or succumbing to it. And, that’s something I decided I simply was not willing to live through.

Right there with him, except for the part about being prepared. I acknowledged a long time ago that there’s no point in trying to prepare for such an event, the resources required would be pretty enormous. My best friend (who is reading this, HI!!!) has told me on a couple of occasions to go live with him in a cabin in the woods and live off the land (in the event of total collapse)… Not feasible for various reasons I don’t want to get into here.

But, if you plan to stick around, then you better plan to watch an economic collapse that will be worse than anything you can imagine.

It’s frustrating to me to see all of our leaders, whether in corporations or government, show such… I’m not sure what to call it other than something like false confidence. Hiding the truth because sentiment is such an important factor of the economy. It’s everywhere; the more I see folks talk, the more I see that in most cases they really have no idea what they are doing, they are just hoping it works out.

The more I learn the more I realize how young of a civilization we really are and how little we actually know.

What pisses me off more than anything is the system in place that tries to educate us so we think we know. So we have faith in those that are making the decisions.

I’m the first one to admit I don’t know what the answers are (macro global economic/political type things), but I’m also one of the first to admit that uncertainty in the first place, which would probably make me a bad leader. I can’t portray confidence because I don’t have confidence (in that stuff anyway; I believe I do portray confidence when it comes to the tech things I work with). I do have honesty, which is what this Martin fellow seems to have a lot of as well. Most people don’t want the truth, they just want to feel good.

One of the only videos I ever uploaded to Youtube was this, which is a good illustration from the corporate side of things. There is another bit (haven’t been able to find it last I checked) which showed the same sort of thing from our previous president where they walked through his various descriptions of the impending economic downfall. From storm clouds to whatever it was in the end.

I’m in complete agreement with this Martin guy though, what we experienced in 2008-2010 or whatever is nothing compared to what is coming. When that is exactly I’m not sure; folks like the Fed seem to be able to keep pulling rabbits out of their hats to drive the Ponzi scheme just a bit further. My general expectation is within the next two decades, and I think that is probably a conservative estimate.

It is unfortunate that the topic of suicide is among those topics that are considered taboo. People don’t talk about it. The common theme is often: mention the word and you’re deemed crazy, and they want to lock you up in a padded room and fill you with meds until you conform, or die in the process.

There was a news report on NBC that I saw last year about a facility (hospital) where they assist people and their families in preparing for when that day will come. People perhaps like Martin who don’t want to live out their lives as a burden to others, unable to mentally and/or physically perform things. I thought it was pretty amazing to see. They go through tons of questions with the patient about scenarios and what to do in those scenarios. So when the time comes there is no doubt. There won’t be some random family member saying KEEP THEM ALIVE I DON’T CARE IF IT COSTS A MILLION $.

As with many topics, there was (in retrospect, it really didn’t hit me until a few years ago) an excellent Star Trek: The Next Generation episode on this very subject. It was called Half a Life, from Season 4 (1991). At the time when I saw it, I suppose you could say I didn’t understand it, as I found the episode quite boring (well, the special effects in the beginning were pretty neat). But as the health care debate picked up a few years ago I realized that episode told an interesting story that I never bothered to appreciate until that time.

Martin lists reasons he considered not to commit suicide (paraphrasing(?) them briefly, see the site for more details):

  1. Loved ones – obvious, people will miss you. More importantly perhaps is if you are in a situation where you are supporting someone else and they are dependent upon you. Martin was not in this situation
  2. Want to see the future. Live out retirement, travel the world perhaps, read books, play backgammon, look at granny porn (my grandfather did this a lot in the years before he died, honestly I did not know such porn existed until my sister+mother told me)
  3. People want to accomplish as much as they can in their lives and they don’t want to run out of time before they do it. Of course, for people who think that way, they never fulfill all those accomplishments anyway and they never will. So, the only thing to do is keep chasing them until you die.
  4. The last major reason I thought of for why people want to live indefinitely is the whole notion of leaving a legacy.

More quotes..

I once had a quasi bucket list when I was about 22 – things to accomplish by the time I was 30. When 30 came around and I hadn’t accomplish them, I decided the bucket list idea was stupid.

[..]

There will always be reasons to want to stay alive another year or five years or 10 years. It wouldn’t have mattered how long I lived, there would have been hundreds or thousands of itches to scratch!

[..]

I could take pride in the fact that I wasn’t going to be sucking on the nipple of the federal debt by taking social security and medicare. When the US economy collapses, it won’t have been me that contributed to taking it down.

Here he touches on life insurance again, I guess they do pay out for suicide –

Another reason why age 60 is ideal is that my life insurance expires next year and I would not be able to afford to get new insurance without paying a ton. And, it requires two years of waiting – once you get insurance – before you can commit suicide and still have the beneficiaries receive the death benefit.

Holy sh*t, he even brings up the aforementioned Half a Life episode of TNG. He devotes several paragraphs to it!

Then he goes into how he did it – a gun

One of the problems with shooting oneself is the obvious mess. I thought about that a lot. I didn’t want anyone I knew discovering my body and I didn’t want to make a mess in the house – something my sister or my landlord would have to deal with. No way.

[..]

I finally decided the best way to do it would be at 5AM on August 15, 2013 at the far southeast end of the parking lot at the Overland Park Police Station. If everything worked out right – and I’m sure it did, I called 911 at 5AM. I told them “I want to report a suicide at the south end of the parking lot of the Overland Park Police Station at 123rd and Metcalf. Bang.”

He left a note on himself which in part read

“I committed suicide of my own free will. I am not under the influence of any drugs. I am sorry for your inconvenience! You will be contacted within a matter of hours by my sister. She will find out about this by an overnight letter and/or email I sent to her which she will get this morning. In it, I explain the exact location where I shot myself and gave her your phone number. At that time, she will tell you who I am. If you discover who I am prior to her call, please do not contact her. I do not want her (or anyone else I sent letters to overnight) to find out about it from you. I want them to find out about it from me. Thank you!

Another quote

The act of suicide can be horrible for those left behind. I couldn’t control the fact of the matter, but I could control the circumstances. I believe the way I did it, coupled with the overnight letters/emails and this web site, is the best I can do to mitigate the hurt.

He wrote a bunch more but the most interesting stuff I suspect was in the first few pages that I read (up until the “Gun Control” topic which is only semi related).

It is fascinating to me to see the level of thought and analysis that went into his decision. Whether you agree with the decision or not is up to you, but to me I think the point is it was his decision. Call him greedy, or selfish, or crazy, or whatever, but I firmly believe that having the “freedom” (though I think it is illegal, hence the quotes, which is crazy) to make this decision and follow through on it is an important right to have.

Now if you cause major harm as a result of your action (say perhaps drive down the wrong direction on a freeway) then that’s a different scenario.  Martin was incredibly thoughtful in how he handled the whole situation, and for that I ..well I don’t really know how to put that into words.

If you are someone who would want to rob someone else of this right then I’d say it is you who are greedy, selfish or crazy. The only exception would be again if you were somehow economically dependent upon them.

While I understand Yahoo!’s decision to yank the site, as it probably was against their TOS, I only wish Martin would have chosen a better place to host the site! Fortunately there is a mirror; I’ll probably snag a copy of it myself just in case.

I don’t consider myself a very emotional person, and reading his writing really did not evoke any emotional response. I went to my first (thus far, only) funeral almost two years ago now, for a cousin of mine who also committed suicide (also via self-inflicted gunshot). I hadn’t seen him since the late 80s, and I had no emotional response to that either. I did feel bad for not “feeling” bad though, as strange as that may sound to write. I sat through the funeral as his loved ones and friends told stories about him and stuff for a few hours. It was an interesting experience. I’m sorry he is gone, but from what I knew of the broader situations the family had experienced over the past few decades I can totally understand his decision. Anyway that is sort of off topic.

I hope the folks that were closest to Martin understand, accept, and most importantly support his decision.

I suppose the obvious thing to say here is may he rest in peace.

Any further discussion on the topic I am more than happy to talk about off line.

August 10, 2013

The Myth of online backup and the future of my mobility

Filed under: Random Thought — Tags: , , , , — Nate @ 12:52 pm

I came across this article on LinkedIn which I found very interesting. The scenario given by the article was a professional photographer who had 500GB of data to back up and decided to try Carbonite to do it.

The problem was Carbonite apparently imposes significant throttling on the users uploading large amounts of data –

[..]At that rate, it takes nearly two months just to upload the first 200GB of data, and then another 300 days to finish uploading the remaining 300GB.
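
Just to put rough numbers on that (back-of-the-envelope only, using nothing but the figures in the quote above):

    # Rough sanity check of the figures in the quote: 200GB in ~2 months,
    # then the remaining 300GB in ~300 days.
    def effective_mbps(gigabytes, days):
        # Average sustained upload rate in megabits per second
        bits = gigabytes * 1e9 * 8
        seconds = days * 24 * 3600
        return bits / seconds / 1e6

    print(round(effective_mbps(200, 60), 2), "Mbps for the first 200GB")   # ~0.31
    print(round(effective_mbps(300, 300), 2), "Mbps for the last 300GB")   # ~0.09

So by the end the effective rate is well under a tenth of a megabit per second, which is how you end up at 300 days.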

Which takes me back to a conversation I was having with my boss earlier in the week about why I decided to buy my own server and put it in a co-location facility, instead of using some sort of hosted thing.

I have been hosting my own websites, email etc since about 1996. At one point I was hosted on T1s at an office building, then I moved things to my business class DSL at home for a few years, then when that was no longer feasible I got a used server and put it up at a local colo in Seattle. Then I decided to retire that old server (built in 2004) and spent about a year in the Terremark vCloud, before buying a new server and putting it up at a colo in the Bay Area where I live now.

My time in the Terremark cloud was OK, my needs were pretty minimal, but I didn’t have a lot of flexibility (due to the costs). My bill was around $120/mo or something like that for a pair of VMs. Terremark operates in a Tier 4 facility and doesn’t use the built to fail model I hate so much, so I had confidence things would get fixed if they ever broke, and I was willing to pay some premium for that.

Cloud or self hosting for my needs?

I thought hard about whether or not to invest in a server+colo again or stay on some sort of hosted service. The server I am on today was $2,900 when I bought it, which is a decent amount of money for me to toss around in one transaction.

Then I had the idea of storing data off site. I don’t have much that is critical, mostly media files and stuff that would take a long time to re-build in case of a major failure or something. But I wanted something that could do at least 2-3TB of storage.

So I started looking into what this would cost in the cloud. I was sort of shocked I guess you could say. The cost for regular, protected cloud storage was going to easily be more than $200/mo for 3TB of usable space.

Then there are backup providers like Carbonite, Mozy, Backblaze etc. I read a comment on Slashdot (I think it was about Backblaze) and was pretty surprised to then read their fine print:

Your external hard drives need to be connected to your computer and scanned by Backblaze at least once every 30 days in order to keep them backed up.

So the data must be scanned at least once every 30 days or it gets nuked.

They also don’t support backing up network drives. Most of the providers of course don’t support Linux either.

The terms do make sense to me, I mean it costs $$ to run, and they advertise unlimited. So I don’t expect them to be storing TBs of data for only $4/mo. It would just be nice if they (and others) were more clear about their limitations up front; at least, unlike the person in the article above, I was able to make a more informed decision.

The only real choice: Host it myself

So the decision was really simple at that point. Go invest and do it myself. It’s sort of ironic if you think about it, all this talk about cloud saving people money. Here I am, just one person with no purchasing power whatsoever, and I am saving more money doing it myself than some massive scale service provider can offer it for.

The point wasn’t just the storage though. I wanted something to host:

  • This blog
  • My email
  • DNS
  • My other websites / data
  • Would be nice if there was a place to experiment/play as well

So I bought this server, which has a single socket quad core Intel chip, originally 8GB (now 16GB) of memory, and 4x2TB SAS disks in RAID 1+0 (~3.6TB usable) with a 3Ware hardware RAID controller (I’ve been using 3Ware since 2001). It has dual power supplies (though both are connected to the same source, my colo config doesn’t offer redundant power). It even has out of band management with full video KVM and virtual media options. Nothing like the quality of HP iLO, but far better than what a system at this price point could offer going back a few years ago.
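
If the ~3.6TB figure looks low for four 2TB disks, the arithmetic is just mirroring plus the usual decimal-vs-binary difference (quick sketch, nothing vendor specific):

    # Why 4 x 2TB in RAID 1+0 shows up as roughly 3.6TB usable:
    # mirroring cuts raw capacity in half, and a marketing "2TB" drive is
    # 2 x 10^12 bytes, which the OS reports in binary units (TiB).
    disks = 4
    drive_bytes = 2 * 10**12

    raw = disks * drive_bytes
    usable = raw / 2              # RAID 1+0: half the disks hold mirror copies
    tib = usable / 2**40          # what the OS actually reports

    print(round(tib, 2), "TiB usable")   # ~3.64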

On top of that I am currently running the following VMs:

  • VM #1 runs my personal email, DNS, websites, this blog etc
  • VM #2 runs email for a few friends and former paying customers (not sure how many are left) from an ISP that we used to run many years ago, plus DNS, websites etc
  • VM #3 is an OpenBSD firewall running in layer 3 mode; it also provides a site to site VPN to my home, as well as an end-user VPN for my laptop when I’m on the road
  • VM #4 acts as a storage/backup server for my home data with a ~2TB file system
  • VM #5 is a Windows VM in case I need one of those remotely. It doesn’t get much use.
  • VM #6 is the former personal email/DNS/website server that ran a 32-bit OS. I’m keeping it around on an internal IP for a while in case I come across more files that I forgot to transfer.

There is an internal and an external network on the server. The site to site VPN of course provides unrestricted access to the internal network remotely, which is handy since I don’t have to rely on external IPs to run additional things. The firewall also does NAT for devices that are not on external IPs.

As you might expect the server sits at low CPU usage 99% of the time, and it’s running at around 9GB of used memory, so I can toss on more VMs if needed. It’s obviously a very flexible configuration.

When I got the server originally I decided to host it with the company I bought it from, and they charged me $100/mo to do it. Unlimited bandwidth etc… good deal (also free on site support)! The first thing I did was take the server home and copy 2TB of data onto it. Then I gave it back to them and they hosted it for a year for me.

Then they gave me the news that they were going to terminate their hosting and I had only two weeks to get out. I evaluated my options and decided to stay at the same facility but start doing business with the facility itself (Hurricane Electric). The downside was the cost doubled to $200/mo for the same service (100Mbit unlimited w/5 static IPs), since I was no longer sharing the service with anyone else. I did get a 3rd of a rack though, not that I can use much of it due to power constraints (I think I only get something like 200W). But in the grand scheme of things it is a good deal; I mean it’s a bit more than double what I was paying in the Seattle area but I am getting literally 100 times the bandwidth. That gives me a lot of opportunities to do things. I’ve yet to do much with it beyond my original needs, but that may change soon.

Now granted it’s not high availability. I don’t have 3PAR storage like Terremark did when I was a customer, and I have only 1 server, so if it’s down everything is down. It’s been reliable though, providing really good uptime over the past couple of years. I have had to replace at least two disks, and I also had to replace the USB stick that runs vSphere; the previous one seemed to have run out of flash blocks as I could no longer write much to the file system. That was a sizable outage for me as I took the time to install vSphere 5.1 (from 4.x) on the new USB stick, re-configure things, and upgrade the memory all in one day; it took probably 4-5 hours I think. I’m connected to a really fast backbone and the network has been very reliable (not perfect, but easily good enough).

So my server was $2,900, and I currently pay $2,400/year for service. It’s certainly not cheap, but I think it’s still a good deal relative to other options. I maintain a very high level of control, I can store a lot of data, I can repair the system if it breaks down, and the solution is very flexible; I can do a lot of things with the virtualization as well as the underlying storage and the high bandwidth I have available to me.
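
Putting rough numbers on the comparison (the cloud figures are the ones mentioned earlier in this post, roughly $120/mo for a pair of small VMs at Terremark plus $200+/mo for 3TB of protected cloud storage; the 4-year server lifetime is my own assumption):

    # Back-of-the-envelope cost comparison; cloud figures are from earlier in
    # the post, and the 4-year server lifetime is an assumption on my part.
    server_cost = 2900             # one-time hardware purchase
    colo_per_year = 2400           # $200/mo at Hurricane Electric
    server_life_years = 4          # assumption

    cloud_vms_per_month = 120      # roughly what a pair of small VMs cost me at Terremark
    cloud_storage_per_month = 200  # ~3TB of protected cloud storage

    self_hosted = colo_per_year + server_cost / server_life_years
    cloud = (cloud_vms_per_month + cloud_storage_per_month) * 12

    print("self hosted: ~$%d/year" % self_hosted)   # ~$3,125/year
    print("cloud:       ~$%d/year" % cloud)         # ~$3,840/year

And that’s before considering that the colo box also comes with 100Mbit of unmetered bandwidth and room to run whatever VMs I want on it.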

Which brings me to next steps. Something I’ve always wanted to do is make the data more mobile; that is one area where it was difficult (or impossible) to compete with cloud services, especially on things like phones and tablets, since they have the software R&D to make those “apps” and other things.

I have been using WebOS for several years now, which of course runs on top of Linux. Though the underlying Linux OS is really too minimal to be of any use to me. It’s especially useless on the phone, where I am just saddened that there has never been a decent terminal emulation app released for WebOS. Of all the things that could be done, that one seems really trivial. But it never happened (that I could see; there were a few attempts but nothing usable as far as I could tell). On the touchpad things were a little different, you could get an Xterm and it was kind of usable, significantly more so than on the phone. But still, the large overhead of X11 just to get a terminal seemed quite wasteful. I never really used it very much.

So I have this server, and all this data sitting on a fast connection, but I didn’t have a good way to get to it remotely unless I was on my laptop (except for the obvious stuff like the blog, which is web accessible).

Time to switch to a new mobile platform

WebOS is obviously dead (RIP). In the early days after HP terminated the hardware unit I was holding out some hope for the software end of things, but that hope has more or less dropped to 0 now; nothing remains but disappointment over what could have been. I think LG acquiring the WebOS team was a mistake, and even though they’ve announced a WebOS-powered TV to come out early next year, honestly I’ll be shocked if it hits the market. It just doesn’t make any sense to me to run WebOS on a TV outside of having a strong ecosystem of other WebOS devices that you can integrate with.

So as reality continued to set in, I decided to think about alternatives, about what was going to be my next mobile platform. I don’t trust Google and don’t like Apple. There’s BlackBerry and Windows Phone as the other major brands in the market. I really haven’t spent any time on any of those devices, so I suppose I won’t know for sure, but I did feel that Samsung has been releasing some pretty decent hardware + software (based only on stuff I have read), and they obviously have good market presence. Some folks complain etc… If I were to go to a Samsung Android platform I probably wouldn’t have an issue. Those complaining about their platform probably don’t understand the depression that WebOS has been in since about 6 months after it was released, so really anything relative to that is a step up.

I mean I can’t even read my personal email on my WebOS device without using the browser. Using webmail via the browser on WebOS is, for me at least, a last resort; I don’t do it often because it’s really painful (I bought some mobile-optimized skins for the webmail app I use only to find they are not compatible with WebOS, so on WebOS I use a basic HTML webmail app, which gets the job done but…). The reason I can’t use the native email client is, I suppose, in part my fault: the way I have my personal email configured, I have probably 200 email addresses and many of them go directly to different inboxes. I use Cyrus IMAP and my main account subscribes to these inboxes on demand. If I don’t care about that email address I unsubscribe and it continues to get email in the background. WebOS doesn’t support accessing folders via IMAP outside of the INBOX structure of a single account. So I’m basically SOL for accessing the bulk of my email (which doesn’t go to my main INBOX). I have no idea if Samsung or Android works any differently.
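
For anyone wondering what I mean by subscribing to inboxes on demand: it’s nothing exotic, just standard IMAP LIST/SUBSCRIBE against folders that live outside my personal INBOX hierarchy. A minimal sketch with Python’s imaplib (the hostname, login and folder names are made up, the mechanism is the point):

    import imaplib

    # Hypothetical server, account and folder names; just showing the mechanism
    conn = imaplib.IMAP4_SSL("mail.example.com")
    conn.login("nate", "secret")

    # Folders that mail gets delivered to outside of my personal INBOX.* hierarchy
    typ, listing = conn.list(pattern="shared.*")
    for line in listing or []:
        print(line)

    # Subscribe to one so a (capable) mail client will show it...
    conn.subscribe("shared.some-address")

    # ...and unsubscribe later when I stop caring; mail keeps arriving regardless
    conn.unsubscribe("shared.some-address")

    conn.logout()

A mail client that only walks the INBOX tree of a single account never sees any of that, which is exactly the hole I’m stuck in on WebOS.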

The browser on the touchpad is old and slow enough that I keep JavaScript disabled on it; it’s just a sad, decrepit state for WebOS these days (and has been for almost two years now). My patience really started running out recently when loading a 2-page PDF on my HP Pre3 and then having the PDF reader constantly freeze (unable to flip between pages, though the page it was on was still very usable) if I let it sit idle for more than a couple of minutes (have to restart the app). This was nothing big, just a 2-page PDF, and the phone couldn’t even handle that.

I suppose my personal favorite problem is not being able to use Bluetooth and 2.4GHz WiFi at the same time on my phone. The radios conflict, resulting in really poor quality over Bluetooth or WiFi or both. So WiFi stays disabled the bulk of the time on my phone since most hotspots seem to only do 2.4GHz, and I use Bluetooth almost exclusively when I make voice calls.

There are tons of other pain points for me on WebOS, and I know they will never get fixed; those are just a couple of examples. WebOS is nice in other ways of course. I love the Touchstone (inductive charging) technology for example, and the cards multitasking interface is great too (though I don’t do heavy multitasking).

So I decided to move on. I was thinking Android. I don’t trust Google but, ugh, it is Linux based and I am a Linux user (I do have some Windows too but my main systems, desktops and laptops, are all Linux), and I believe Windows Phone and BlackBerry would likely (no, certainly) not play as well with Linux as Android. (WebOS plays very well with Linux, just plug it in and it becomes a USB drive, no restrictions; rooting WebOS is as simple as typing a code into the device.) There are a few other mobile Linux platforms out there, I think Meego(?) might be the biggest trying to make a comeback, then there is FirefoxOS and Ubuntu phone… all of which feel less viable (in today’s market) than WebOS did back in 2009 to me.

So I started thinking more about leaving WebOS, and I think the platform I will go to will be the Samsung Galaxy Note 3, at some point after it comes out (I have read ~9/4 for the announcement or something like that). It’s much bigger than the Pre3, but not too much heavier (the Note 2 is ~30g heavier). Obviously no dedicated keyboard; I think the larger screen will do well for typing with my big hands. The Samsung multimedia / multitasking stuff sounds interesting (the ability to run two apps at once, at least Samsung apps).

I do trust Samsung more than Google, mainly because Samsung wants my $$ for their hardware. Google wants my information for whatever it is they do..

I’m more than willing to trade money in a vain attempt to maintain some sort of privacy. In fact I do it all the time; I suppose that could be why I don’t get much spam to my home address (snail mail). I also very rarely get phone calls from marketers (low single digits per year I think), even though I have never signed up for any do not call lists (I don’t trust those lists).

Then I came across this comment on Slashdot –

Well I can counter your anecdote with one of my own. I bought my Galaxy S3 because of the Samsung features. I love multi-window, local SyncML over USB or WiFi so my contacts and calendar don’t go through the “cloud”, Kies Air for accessing phone data through the browser, the Samsung image gallery application, the ability to easily upgrade/downgrade/crossgrade and even load “frankenfirmware” using Odin3, etc. I never sign in to any Google services from my phone – I’ve made a point of not entering a Google login or password once.

So, obviously, I was very excited to read that.

Next up, and this is where the story comes back around to online backup, cloud, my co-lo, etc.. I didn’t expect the post to be this long but it sort of got away from me again..

I think it was on another Slashdot comment thread actually (I read Slashdot every day but have never had an account, and I think I’ve only commented maybe 3 times since the late 90s) where someone mentioned the software package Owncloud.

Just looking at the features once again got me excited. They also have Android and iOS apps. So this would, in theory, from a mobile perspective allow me to access files, sync contacts, music, video, perhaps even calendar (not that I use one outside of work, which is Exchange) and keep control over all of it myself. There are also desktop sync clients (a la Dropbox or something like that??) for Linux, Mac, and Windows.

So I installed it on my server; it was pretty easy to set up. I pointed it to my 2TB of data and off I went. I installed the desktop sync client on several systems (Ubuntu 10.04 was the most painful to install to, I had to compile several packages from source, but it’s nothing I haven’t done a million times before on Linux). The sync works well (I had to remove the default sync, which was to sync everything; at first it was trying to sync the full 2TB of data and it kept failing, not that I wanted to sync that much… I configured new sync directives for specific folders).
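
The other nice thing is that everything it manages is reachable over plain WebDAV, so I’m not locked into their sync clients or mobile apps. A quick sketch of pushing a file up with Python and requests (the /remote.php/webdav path is the stock Owncloud WebDAV endpoint as I understand it; the hostname and credentials are obviously made up):

    import requests

    # Hypothetical server and credentials
    BASE = "https://cloud.example.com/remote.php/webdav"
    AUTH = ("nate", "secret")

    def upload(local_path, remote_path):
        # PUT a local file to the Owncloud WebDAV share
        with open(local_path, "rb") as f:
            r = requests.put(BASE + "/" + remote_path, data=f, auth=AUTH)
        r.raise_for_status()

    def listing(remote_dir=""):
        # Bare-bones PROPFIND to list a directory (returns raw XML)
        r = requests.request("PROPFIND", BASE + "/" + remote_dir,
                             auth=AUTH, headers={"Depth": "1"})
        r.raise_for_status()
        return r.text

    upload("notes.txt", "Documents/notes.txt")
    print(listing("Documents"))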

So that’s where I’m at now. Still on WebOS, waiting to see what comes of the new Note 3 phone, I believe I saw for the Note 2 there was even a custom back cover which allowed for inductive charging as well.

It’s sad to think of the $$ I dumped on WebOS hardware in the period of panic following the termination of the hardware division; I try not to think about it… The touchpads do make excellent digital picture frames, especially when combined with a Touchstone charger. I still use one of my touchpads daily (I have 3), and my phone of course daily as well. Though my data usage is quite small on the phone since there really isn’t a whole lot I can do on it, unless I’m traveling and using it as a mobile hot spot.

whew, that was a lot of writing.

August 8, 2013

Nth Symposium 2013: HP Bladesystem vs Cisco UCS

Filed under: General — Tags: , , , — Nate @ 11:00 pm

Travel to HP Storage Tech Day/Nth Generation Symposium was paid for by HP; however, no monetary compensation is expected or received for the content that is written in this blog.

I can feel the flames I might get for this post, but I’m going to write about it anyway because I found it interesting. I have written about Cisco UCS in the past (on very limited topics), have never been impressed with it, and really at the end of the day I can’t buy Cisco on principle alone. It doesn’t matter if it was $1, I can’t do it (in part because I know that $1 price would come from screwing over many other customers to make it possible for me).

Cisco has gained a lot of ground in the blade market since they came out with this system a few years ago and I think they are in 3rd place, maybe getting close to 2nd (last I saw 2nd was a very distant position behind HP).

So one of the keynotes (I guess you can call it that? it was on the main stage) was from someone at HP who re-joined HP earlier in the year (or perhaps last year) after spending a couple of years at Cisco both selling UCS and training their partners on how to sell it to customers. So obviously that was interesting to me, hearing this person’s perspective on the platform. There was a separate break-out session on this topic that went into more detail, but it was NDA-only so I didn’t attend.

I suppose what was most striking is HP going out of their way to compare themselves against UCS; that says a lot right there. They never mentioned Dell or IBM stuff, just Cisco. So Cisco obviously has gotten some good traction (as sick as that makes me feel).

Out of band management

HP claims that Cisco has no out of band management on UCS; there are primary and backup data paths, but if those are down then you are SOL. HP obviously has (optionally) redundant out of band management on their blade system.

I love out of band management myself, especially full lights out. My own HP VMware servers have dedicated in-band (1GbE) as well as the typical iLO out of band management interfaces. This is on top of the 4x10GbE and 2x4Gbps FC for storage. Lots of connectivity. When I was having issues with our Qlogic 10GbE NICs last year this came in handy.

Fault domains

This can be a minor issue, mainly an implementation one. Cisco apparently allows UCS to have a fault domain of up to 160 servers, vs HP’s 16 (one chassis). So you can, of course, lower your fault domain on UCS if you think about this aspect of things; but how many customers realize this and actually do something about it? I don’t know.

HP Smart Update Manager

I found this segment quite interesting. HP touts their end-to-end update mechanism, which includes:

  • Patch sequencing
  • Driver + Firmware management
  • Unified service pack (1 per quarter)

HP claims Cisco has none of these: they cannot sequence patches, their management system does not manage drivers (it does manage firmware), and the service packs are not unified.

At this point the HP person brought up a situation a customer faced recently where they used the UCS firmware update system to update the firmware on their platform. They then rebooted their ESX systems (I guess for the firmware to take effect), and the systems could no longer see the storage. It took the customer, on the line with Cisco, VMware, and the storage company, 20 hours until they figured out the problem: the drivers were out of sync with the firmware, which was the reason for the downtime.

I recall another ~20 hour outage a few years ago on a Cisco UCS platform at a sizable company in Seattle for similar reasons. I don’t know why in both cases it took so long to resolve; in the Seattle case there was a known firmware bug that was causing link flapping, and as a result a massive outage, because I believe the storage was not very forgiving of that. Fortunately Cisco had a patch, but it took ’em ~20 hours of hard downtime to figure out the problem.

I’m sure there are similar stories on the HP end of things too… I have heard of some nasty issues with FlexFabric and Virtual Connect. There is one feature I like about FlexFabric and Virtual Connect, and that is the chassis-based MAC/WWN assignments. Everything else they can keep. I don’t care about converged ethernet, and I don’t care about reducing my cable count (having a few extra fibre cables for storage per chassis really is nothing)…

Myself, the only outages I have had that lasted that long have been because of application stack failures. I think the longest infrastructure related outage I’ve been involved with in the past 15 years was roughly six, maybe eight hours. I have had outages that took longer than 20 hours to recover from fully, but for the bulk of that time the system was running and we just had recovery steps to perform. I’ve never had a 20 hour outage where 15 hours into the thing nobody has any idea what the problem is or how to fix it.

The longest outage ever though was probably ~48-72 hours, and that was entirely an application stack failure. That was the time we got all the senior software developers and architects in a room and asked them “How do we fix this?” and they gave us blank stares and said “We don’t know, it’s not supposed to do this.” Not a good situation to be in!

Anyway, back on topic.

HP says that since December 2011 they have released 9 critical updates, while Cisco has released 38 critical updates.

The case for intelligent compute

I learned quite a bit from this segment as well. Back in 2003 the company I was at was using HP and Compaq gear. It ran well, though obviously it was pretty expensive. Everything was DL360s, some DL380s, some DL580s. When it came time to do a big data center refresh we wanted to use SATA disks to cut some costs, so we ended up going with a white box company instead of HP (this was before HP had the DL100 series). I learned a lot from that experience, and was very happy to return to HP as a customer at my next company (though I certainly realize that given the right workload HP’s premium may not be worth it, but for highly consolidated virtualized stuff I really don’t want to use anything else). The biggest issue I had with the white box stuff was bad RAM. It seemed to be everywhere. Not long after we started deployment I started using the Cerberus Test Suite to burn in our systems, which caught a lot of it. Cerberus is awesome if you haven’t tried it. I even used it on our HP gear, mainly to drive CPU and memory to 100% usage to burn them in (no issues found).
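
Cerberus is the real tool for this, but just to illustrate the kind of thing burn-in does (a toy sketch, nothing to do with how Cerberus is actually implemented): peg every core and keep writing and verifying a big chunk of memory so weak DIMMs and cooling problems show up before the box ever takes production traffic.

    import multiprocessing
    import os

    CHUNK_MB = 512   # how much memory each worker keeps churning through

    def worker(chunk_mb):
        # Burn CPU while repeatedly writing and verifying a chunk of memory
        buf = bytearray(chunk_mb * 1024 * 1024)
        while True:
            for pattern in (0x55, 0xAA):
                for i in range(0, len(buf), 4096):   # touch every page
                    buf[i] = pattern
                for i in range(0, len(buf), 4096):
                    if buf[i] != pattern:
                        raise SystemExit("memory verify failed at offset %d" % i)

    if __name__ == "__main__":
        procs = []
        for _ in range(os.cpu_count() or 1):
            p = multiprocessing.Process(target=worker, args=(CHUNK_MB,), daemon=True)
            p.start()
            procs.append(p)
        print("burn-in running on %d workers, Ctrl-C to stop" % len(procs))
        for p in procs:
            p.join()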

HP Advanced ECC Outcomes

HP has a technology called Advanced ECC, which I believe they've had since 1996, and which is standard on at least the 300-series servers and up. Ten years ago, when our servers rarely had more than 2GB of memory (I don't think we went 64-bit until at least 2005), Advanced ECC wasn't a huge deal – 2GB of memory is not much. Today, with my servers having 384GB, I really refuse to run any high-memory configuration without something like it. IBM has Chipkill, which is similar. Dell has nothing in this space. Not sure about Cisco (betting they don't, more on that in a moment).

HP Advanced ECC

HP talked about their massive number of sensors, with some systems (I imagine the big ones!) having up to 1,600 sensors in them. (Here is a neat video on the Sea of Sensors from one of the engineers who built them – one thing I learned is that the C7000 chassis has 104 different fan speeds for maximum efficiency.) HP introduced pre-failure alerting in 1995, and has had pre-failure warranties for a long time (perhaps back to 1995 as well). They obviously have complete hypervisor integration – one thing I wasn't sure of myself until recently: while upgrading our servers one of the new memory sticks went bad, an alert popped up in vCenter, and I was able to evacuate the host and get the stick replaced without any impact. That failure wasn't caught by burn-in, just regular processing; I didn't have enough spare capacity at that point to dedicate many systems to burn-in.

What does Cisco have? According to HP, not much. Cisco apparently doesn't treat the server with much respect; they treat it as something that can fail, at which point you get it replaced or repaired.

UCS: Post failure response

That model reminds me of what I call "built to fail", which is the model that public clouds like Amazon run on. It's pretty bad. Though at least in Cisco's case the storage is shared and the application can be restarted on another system easily enough; in a public cloud you have to build a new system and configure it from scratch.

The point here is obvious: HP works hard to prevent the outage in the first place, while Cisco doesn't seem to care.

Simplicity Matters

I'll just put the full slide here, there's not a whole lot to cover. HP's point is that the Cisco way is more complicated and seems angled to drive more revenue for the network. HP is less network oriented, and they show you can directly connect the blade chassis to 3PAR storage system(s). I think HP's diagram is even a bit too complicated for all but the largest setups – you could easily eliminate the distribution layer.

BladeSystem vs UCS: Simplicity matters

The cost of the 17th server

I found this interesting as well: Cisco goes around telling folks that their systems are cheaper, but they don't do an apples-to-apples comparison – they price a Smart Play bundle, not a system that is built to scale.

HP put a couple of charts up showing the difference in cost between the two solutions.

BladeSystem vs UCS TCO: UCS Smart Play bundle

BladeSystem vs UCS TCO: UCS Built to scale

Portfolio Matters

Lastly HP went into some depth comparing the two product portfolios and showed how Cisco was lacking in pretty much every area, whether it was server coverage, storage coverage, blade networking options, software suites or the integration between them.

They talked about how Cisco has one way to connect networking to UCS, while HP has many, whether it is converged ethernet (similar to Cisco), regular ethernet, native Fibre Channel, InfiniBand, or even SAS to external disk enclosures. The list goes on and on for the other topics, but I'm sure you get the point: HP offers more options so you can build a more optimal configuration for your application.

BladeSystem vs UCS: Portfolio matters

Then they went into analyst stuff and I took a nap.

In reviewing the slide deck I see they do mention Dell once – in a slide, not by the speaker:

HP vs Dell in drivers/firmware management

Attending this didn't teach me anything that would affect my future purchasing – as I mentioned, I already won't buy Cisco for any reason. But it was still interesting to hear about.

August 7, 2013

Nth Symposium 2013: HP Moonshot

Filed under: General — Tags: , , — Nate @ 10:05 pm

Travel to HP Storage Tech Day/Nth Generation Symposium was paid for by HP; however, no monetary compensation is expected nor received for the content that is written in this blog.

HP launched Moonshot a few months ago; I wrote at the time that I wasn't overly impressed. At the Nth Symposium several different HP speakers touched on Moonshot.

HP has been blasting the TV airwaves with Moonshot ads – something I think is a waste of money, just as much as it would be if HP were blasting the TV with 3PAR ads. Moonshot is obviously a special type of system, and those in its target market will (it seems to me) already know about it. Perhaps it's more of an ad to show HP is still innovating, in which case it's pretty decent (not as good as the IBM Linux commercials from years back though!).

Initial node for HP Moonshot for Intel Atom processors

Sure, it is cute – the form factor certainly grabs your eye. Micro servers are nothing new though; HP is just the latest entrant into the market. I immediately got into tech mode and wanted to know how it measured up to AMD's Seamicro platform. In my original post I detail several places where I feel Moonshot falls short of Seamicro, which has been out for years.

Seamicro Node for Intel Atom processors – Note no storage! All of that is centralized in the chassis, virtualized so that it is very flexible.

HP touts this as a shift in thinking – going from building apps around the servers to building servers around the apps (while they sort of forgot to mention we've been building servers around the apps, in the form of VMs, for many years now). I had not heard the approach described that way until last week; it is an interesting description.

HP was great in being honest about who should use this system – they gave a few different use cases, but they were pretty adamant that Moonshot is not going to take over the world; it's not going to invade every SMB and replace your x86 systems with ARM or whatever. It's a purpose-built system for specific applications. There is only one Moonshot node today; in the future there will be others, each targeted at a specific application.

One of them will even have DSPs on it I believe, which is somewhat unique. HP calls Moonshot out as:

  • 77% less costly
  • 89% less energy
  • 80% less space
  • 97% less complex

Certainly very impressive numbers. If I had an application that was suitable for Moonshot then I’d certainly check it out.

One of the days that I was there I managed to get myself over to the HP Moonshot booth and ask the technical person there some questions. I don’t know what his role was, but he certainly seemed well versed in the platform and qualified to answer my basic questions.

My questions were mainly around comparing Moonshot to Seamicro – specifically the storage virtualization layers and the networking. His answers were about what I expected: they don't support that kind of thing, and there are no immediate plans to. Myself, I think the concept of being able to export read-only file system(s) from central SSD storage to dozens or hundreds of nodes within the Seamicro platform is a pretty cool idea. The storage virtualization sounds very flexible and optionally extends well beyond the Seamicro chassis, up to ~1,400 drives.

Same for networking – Moonshot is pretty basic stuff. (At one point Seamicro advertised integrated load balancing, but I don't see that now.) The HP person said Moonshot is aimed squarely at scale-out web applications. Future modules may be aimed at memcache nodes or other things, and there will also be a storage module (I forget the specifics, but it was nothing too exciting).

I believe the HP rep also mentioned how they were going to offer units with multiple servers on a single board (Seamicro has done this for a while as well).

Not to say Moonshot is a bad system – I'm sure HP will do pretty well with it – but I find it hard to get overly excited about it when Seamicro seems to be years ahead of Moonshot already. Apparently Moonshot sat in HP Labs for a decade, and it wasn't until one of the recent CEOs (I think a couple of years ago) came around to HP Labs and asked something like "What do you have that I can sell?" that the masterminds responded "We have Moonshot!", and it took them a bit of time to productize it.

(I have no personal experience with either platform, nor have I spoken with anyone who has, so I can only go by what I have read or been told about either system at this point.)

Nth Symposium 2013 Keynote: SDN

Filed under: Networking — Tags: , — Nate @ 9:11 am

Travel to HP Storage Tech Day/Nth Generation Symposium was paid for by HP; however, no monetary compensation is expected nor received for the content that is written in this blog.

“So, SDN solves a problem for me which doesn’t exist, and never has.”

– Nate (techopsguys.com)

(I think the above quote sums up my thoughts very well so I put it in at the top, it’s also buried down below too)

One of the keynotes of the Nth Generation Symposium last week was from Martin Casado, who is currently a Chief Architect at VMware, and one of the inventors of OpenFlow and the SDN concept in general.

I have read bits and pieces of what Martin has said in the past; he seems like a really smart guy and his keynote was quite good. It was nice to hear him confirm many of the feelings I have about SDN in general. There are some areas where I disagree with him, mainly based on my own experience in the environments I have worked in – the differences are minor. My bigger beef with SDN is not even inside the scope of SDN itself; more on that in a bit.

First off, I was not aware that the term Software Defined Networking was coined on the spot by a reporter from the MIT Technology Review. Apparently the reporter interviewing Martin had just done an article on Software Defined Radio, and asked Martin what they should call this thing he created. He didn't know, so the reporter suggested Software Defined Networking since that term was still fresh in the reporter's head. He agreed, and the term was born.

Ripping from one of his slides:

What does SDN Promise?

  • Enable rapid innovation in Networking
  • Enable new forms of network control
  • It’s a mechanism for implementers
  • Not a solution for customers

That last bit I had not noticed until a few moments ago; it is great to see as well.

He says network virtualization is all about operational simplification:

Martin’s view of Network Virtualization

What Network Virtualization is

  • Decoupling of the services provided by a virtualized network from the physical network
  • Virtual network is a container of network services (L2-L7) provisioned by software
  • Faithful reproduction of services provided by physical network

He showed an interesting stat claiming that half of all server access ports are already virtualized, and we’re on track to get to 67% in 2 years. Also apparently 40% of virtualization admins also manage virtual switching.

Here is an interesting slide showing a somewhat complex physical network design and how that can be adapted to be something more flexible with SDN and network virtualization:

The migration of physical to virtual

Top three reasons for deploying software defined networks

  1. Speed
  2. Speed
  3. Speed

(from another of Martin's slides – and yes, he had #1, #2 and #3 all the same; anything beyond speed was viewed as a distant reason relative to speed)

Where I stand on Martin’s stuff

So first off, let me preface this by saying I am a customer. I have managed L2-L7 networks off and on for the past 12 years, on top of all of my other stuff. I have designed and built a few networks from the ground up. Networking has never been my primary career path; I couldn't tear apart an IP packet and understand it if my life depended on it. That being said, I have been able to go toe to toe with every "Network Engineer" I have worked with (on almost everything except analyzing packet dumps beyond the most basic things). I don't know if that says something about me, or them, or both.

I have worked in what you might consider nothing but "web 2.0" stuff for the past decade. I have never had to support big legacy applications; everything has been modern web-based stuff. In two cases it was a three-tier application (web+app+db), the others were two tier. I have supported Java, PHP, Ruby and Perl apps (always on Linux).

None of the applications I supported were "web scale" (and I will argue till I am blue in the face that most (99%) of organizations will never get to web scale). The biggest-scaling application was also my first – I calculated the infrastructure growth at 1,500% (based on raw CPU capacity) over roughly 3 years – and to think the ~30 racks of servers could today fit into a single blade enclosure with room to spare..

What does SDN solve?

Going briefly to another keynote, someone from Intel had this slide, which shows some of the pain they have –

Intel’s network folks take 2-3 weeks to provision a service

Intel's own internal IT estimates say it takes them 2-3 weeks to provision a new service. This really makes no sense to me, but there is no description of what is involved in configuring a new service.

So going back to SDN. From what I read, SDN operates primarily at L2-L3. The firewalls/load balancers etc. are less SDN and more network virtualization, and seem to be outside the scope of core SDN (OpenFlow). To date I have not seen a single mention of the term SDN when it comes to these services from any organization; it's all happening at the switch/routing layer.

So I have to assume here for a moment that it takes Intel 2-3 weeks to provision new VLANs, perhaps deploy some new switches, or update some routes or something like that (they must use Cisco if it takes that long!).

My own network designs

Going to my own personal experience – keeping things simple. Here is a recent sample network design of mine:

Basic Network Zoning architecture

There is one major zone for the data center itself, which is a /16 (leveraging Extreme's Layer 3 virtual switching). Within that, at the moment, are three smaller zones (I think supernet may be the right word to describe them), and within those supernets are sub zones (aka subnets, aka VLANs), in a couple of different sizes for different purposes. Some of the sub zones have jumbo frames enabled, most do not. There is a dedicated sub zone for vMotion (this VLAN has no router interface on it, in part for improved security), another for infrastructure management interfaces, etc. Each zone (A-C) has a sub zone dedicated to load balancer virtual IPs for internal load balancing, and the load balancer is directly connected to all of the major zones. Routing to this data center (over VPN – either site-to-site or end-user VPN) is handled by a simple /16 route, and individual WAN-based ACLs are handled by the VPN appliance.
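To make the layout concrete, here is a rough sketch of how such a /16 could be carved up using Python's ipaddress module – the 10.10.0.0/16 block and the /18 and /24 prefix sizes are made-up numbers for illustration, not the actual addressing:

import ipaddress

dc = ipaddress.ip_network("10.10.0.0/16")          # the whole data center zone
supernets = list(dc.subnets(new_prefix=18))[:3]    # three major zones (A-C)

for name, zone in zip("ABC", supernets):
    subzones = list(zone.subnets(new_prefix=24))   # sub zones = VLANs
    print(f"Zone {name}: {zone}")
    print(f"  app sub zone:    {subzones[0]}")
    print(f"  LB VIP sub zone: {subzones[1]}")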

There are a few misc zones in the middle for various purposes; these have no access restrictions on them at all – well, except the VPN client stuff, where the ACLs are handled by the VPN appliance, not by the rest of the network.

This specific network design is not meant to be extremely high security, because that need does not realistically exist in this organization (I have seen network engineers on several occasions over-engineer something for security when it really was not required, and as a result introduce massive bottlenecks into the network – this became an even greater concern for me once all servers were running with multiple 10GbE links). The access controls are mainly there to protect against casual mistakes. Internet-facing services in all zones have the same level of security, so if you happen to be able to exploit one of them (I've never seen this happen at any company on anything I've been responsible for – not that I go to paranoid lengths to secure things either), there's nothing stopping you from exploiting the others in the exact same way. Obviously nothing is directly connected to the internet other than the load balancer (which runs a hardened operating system) and a site-to-site VPN appliance (also hardened).

The switch blocks TCP SYN and UDP packets between the respective zones above, since it is not stateful. The switch operates at line rate 10GbE with ASIC-based ACLs; performing this function in a hardware (or software) firewall I figured would add too much complexity and reduce performance (not to mention the potential cost of a firewall capable of line-rate 10Gbps+ – given multiple servers each with multiple 10GbE ports, the possibility exists of throughput far exceeding 10Gbps, whereas the switch is line rate on every port, up to 1.2Tbps on this switching platform – how much is that firewall again?).
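As a sketch of the idea only (the real rules live in the switch ASICs, and the exact zones and directions here are assumptions), the per-packet decision the ACLs make is roughly this – note there is no connection table, only header matching:

import ipaddress

ZONES = {
    "A": ipaddress.ip_network("10.10.0.0/18"),     # illustrative ranges, see above
    "B": ipaddress.ip_network("10.10.64.0/18"),
    "C": ipaddress.ip_network("10.10.128.0/18"),
}

def zone_of(ip):
    addr = ipaddress.ip_address(ip)
    return next((z for z, net in ZONES.items() if addr in net), None)

def permit(src, dst, proto, syn_only=False):
    """Stateless check: drop UDP and new TCP connections (bare SYN) that
    cross zone boundaries; everything else passes, including the return
    traffic of TCP sessions that were allowed to establish."""
    sz, dz = zone_of(src), zone_of(dst)
    if sz is None or dz is None or sz == dz:
        return True
    if proto == "udp" or (proto == "tcp" and syn_only):
        return False
    return True

print(permit("10.10.1.5", "10.10.65.9", "tcp", syn_only=True))   # False: new cross-zone session
print(permit("10.10.65.9", "10.10.1.5", "tcp"))                  # True: non-SYN traffic passes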

There are four more VLANs for IP-based storage – two for production and two for non-production, though the latter have never really been used to date. I have the 3PAR iSCSI on these VLANs with jumbo frames (the whole purpose of the VLANs), though all of the client systems at the moment use standard frame sizes (iSCSI runs on top of TCP, which negotiates the maximum segment size, so the mixed frame sizes are handled gracefully).

There is a pair of hardware load balancers, each with a half dozen or so VLANs; each zone has a dedicated load balancer VLAN for the services in that zone. The LBs are also connected to the internet, of course, in a two-armed configuration.

Sample two-arm configuration for a LB from Citrix documentation

I have a similar configuration in another data center using a software load balancer of the same type – however the inability to support more than 4 NICs (4 VLANs, at least in vSphere 4.1 – not sure if this is increased in 5.x) limits the flexibility of that configuration relative to the physical appliances, so I had to make a few compromises in the virtual appliance's case.

So I have all these VLANs, a fully routed layer 3 switching configuration, some really basic ACLs to prevent certain types of communication, and load balancers to route traffic from the internet as well as distribute load in some cases.

Get to the point already!

The point of all of this is that things were designed and provisioned up front, and as a result over the past 18 months we have not had to make any changes to this configuration despite more than doubling in size during that time. We could double again and not have a problem. Doubling again beyond that, I may need to add one or two VLANs (sub zones), though I believe the zones as they exist today could continue to exist; I would not have to expand them. I really do not think the organization running this will ever EVER get to that scale. If they do, then they're doing many billions in revenue a year and we can adapt the system if needed (and probably at that point we'd have one or more dedicated network engineers, who would likely promptly replace whatever I have built with something significantly more (overly so) complicated, because they can).

If we are deploying a new application or a new environment, we just tell VMware where to plop the VM. If it is QA/Dev it goes in that zone, if it is testing it goes in another, production another, etc.

More complexity outside switching+routing

From a network infrastructure perspective, the complexity in deploying a new network service really lies in the load balancer. Not that it is complicated, but that stuff is not pre-provisioned up front. Tasks include:

  • Configuring server name to IP mappings (within the LB itself)
  • Creating Service group(s) & adding servers to the service groups
  • Creating virtual server(s) & assigning IPs + DNS names to them
  • Creating content switching virtual server(s) & assigning IPs + DNS names to them
  • Configuring content switching virtual server(s) – (adding rules to parse HTTP headers and route traffic accordingly)
  • Importing SSL cert(s) & assigning them to the virtual servers & cs virtual servers

The above usually takes me maybe 5-20 minutes depending on the number of things I am adding. Some of it I may do via GUI, some I may do via CLI.

None of this stuff is generic; unless we know specifically what is coming we can't provision it in advance (I'm a strong believer in solid naming conventions – which means no random names!!!).

The VMs by contrast are always very generic (other than the names of course); there's nothing special about them – drop them in the VLAN they need to be in and they are done. We have no VMs that I can think of with more than one vNIC, other than the aforementioned software load balancers. Long gone are the days (for me) when a server was bridged between two different networks – that's what routers are for.

Network is not the bottleneck for deploying a new application

In fact, in my opinion the most difficult part of getting a new application up and running is getting the configuration into Chef. That is by far the longest part of any aspect of the provisioning process: it can take me, or even us, hours to days to get it properly configured and tested. VMs take minutes, the load balancer takes minutes. Obviously a tool like Chef makes it much easier to scale an existing application, since the configuration is already done – but this post is all about new applications and network services.

Some of the above could be automated using the APIs on the platform (they've been there for years) and some sort of dynamic DNS or whatever. The amount of work involved in building such a system for an operation of our scale isn't worth the investment.
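If you did want to automate it, the shape of it would be something like the sketch below. The URLs, payload fields and auth here are entirely hypothetical placeholders, not the actual API of any particular load balancer, so treat it as pseudo-real:

import requests

LB_API = "https://lb01.example.com/api"           # hypothetical management endpoint
session = requests.Session()
session.auth = ("api-user", "api-password")       # assumption: HTTP basic auth

def create_service_group(name, members):
    session.post(f"{LB_API}/servicegroups", json={"name": name, "members": members})

def create_virtual_server(name, vip, port, servicegroup, cert=None):
    payload = {"name": name, "ip": vip, "port": port, "servicegroup": servicegroup}
    if cert:
        payload["sslcert"] = cert                  # cert already imported on the LB
    session.post(f"{LB_API}/vservers", json=payload)

# one new internal service, following a strict naming convention
create_service_group("sg-app1-prod", ["10.10.66.10:8080", "10.10.66.11:8080"])
create_virtual_server("vs-app1-prod", "10.10.67.20", 443, "sg-app1-prod", cert="app1-prod")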

The point here is that the L2/L3 stuff is trivial – at least for an operation the size we run today, and that goes for all of the companies I have worked at for the past decade. The L2/L3 stuff flat out doesn't change very often and doesn't need to. Sometimes, if there are firewalls involved, perhaps some new holes need to be poked in them, but that just takes a few minutes – and from what I can tell is outside the scope of SDN anyway.

I asked Martin a question on that specific topic. It wasn't well worded, but he got the gist of it. My pain when it comes to networking is not the L2/L3 area – it is the L7 area (well, if we made extensive use of firewalls then L3 firewalling would be an issue as well). So I asked him how SDN addresses that, or whether it does. He liked the question and confirmed that SDN does not in fact address it; that area should be addressed by a "policy management tool" of some kind.

I really liked his answer – it just confirms my thoughts on SDN are correct.

Virtual Network limitations

I do like the option of having virtual network services, whether it is a load balancer or a firewall or something else. But those do have limitations that need to be accounted for, whether it is performance, flexibility (number of VLANs, etc.) or dependency (you may not want your VPN device in a VM – if your storage shits itself you may lose VPN too!). Managing 30 different load balancers may in fact be significantly more work than managing a single load balancer that supports 30 applications (I'd wager it is – the one exception is the service provider model where you are delegating administrative control to others, which still means more work is involved, it is just handled by more staff).

Citrix Netscaler Cluster Traffic Flow

Above is a diagram from Citrix, from an earlier blog post I wrote last year. At the time their clustering tech scaled to 32 systems, which, if that still holds true today, at the top end of 120Gbps per system would be nearly 4Tbps of theoretical throughput. Maybe cut that in half to be on the safe side, so roughly 2Tbps.. that is quite a bit.

Purpose-built hardware network devices have long provided really good performance and flexibility. Some of them even provide a layer of virtualization built in – this is pretty common in firewalls. More than one load balancing company has appliances that can run multiple instances of their software as well, in the event that is needed. I think the cases where many instances would be required (outside of a service provider giving each customer their own LB) are quite limited.

Design the network so when you need such network resources you can route to them easily – it is a network service after all, addressable via the network – it doesn’t matter if it lives in a VM or on a physical appliance.

VXLAN

One area of virtualization I have not covered is what VXLAN offers: making the L2 network more portable between data centers and the like. This is certainly an attractive feature for some customers, especially if you rely on something like VMware's SRM to provide fail over.

My own personal experience says VXLAN is not required, nor is SRM. Application configurations for the most part are already in a configuration management platform. Building new resources at a different data center is not difficult (again, in my experience, for most of the applications I have supported this could even be done in advance), in the different IP space and with slightly different host names (I use a common airportcode.domain per DC to show where each system is physically located). Replicate the data that is needed (use application-based replication where available, e.g. internal database replication – obviously that does not include running VM images) and off you go. Some applications are more complex, but most web-era applications are not.
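As a trivial example of that naming convention (the site codes and names here are made up):

SITES = ("sea", "iad")          # e.g. Seattle and Ashburn data centers
ROLES = ("web", "db")

for site in SITES:
    for role in ROLES:
        print(f"{role}01.{site}.example.com")
# web01.sea.example.com / web01.iad.example.com etc. – same role names,
# different site suffix and IP space, so standing up the second site is
# mostly a configuration-management exercise rather than a network migration.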

So, SDN solves a problem for me which doesn’t exist, and never has.

I don't see that problem existing in the future for most smaller scale (sub hyper-scale) applications either, unless your network engineers are crazy about over-engineering things. I can't imagine what is involved that takes 2-3 weeks to provision a new network service at Intel. I really can't – other than perhaps procuring new equipment, which can be a problem regardless.

Someone still has to buy the hardware

Which leads me into a little tangent. Just because you have cloud doesn't mean you automatically have unlimited capacity. Even if you're Intel: if someone internally built something on their cloud platform (assuming they have one) and said "I need 100,000 VMs, each with 24 CPUs, and I plan to drive them to 100% utilization 15 hours a day", even with cloud I think it is unlikely they have that much spare capacity just sitting around (and if they do, that is fairly wasteful!).

Someone has to buy and provision the hardware, whether in a non-cloud setup or in a cloud setup. Obviously once provisioned into a pool of "cloud" (ugh) it is easier to adapt that capacity to multiple purposes. But the capacity has to exist in advance of the service using it, which means someone is going to spend some $$ and there is going to be some lead time to get the stuff in and set it up. An extreme case for sure, but consider that if you need to deploy on the order of tens of thousands of new servers, that lead time may be months just to get the floor space, power and cooling.
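Back of the envelope, using made-up hardware assumptions, that extreme request translates into something like this:

vms, vcpus_per_vm = 100_000, 24
cores_per_host    = 32        # assumption
overcommit        = 2.0       # assumption: 2 vCPUs per physical core

vcpus = vms * vcpus_per_vm                        # 2,400,000 vCPUs
hosts = vcpus / (cores_per_host * overcommit)     # 37,500 hosts
print(f"{vcpus:,} vCPUs -> ~{hosts:,.0f} hosts")
# nobody keeps tens of thousands of idle hosts sitting around "just in case"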

I remember a story I heard from SAVVIS many years ago: a data center they operated in the Bay Area had a few tens of thousands of square feet available, and it was growing slow and steady. One day Yahoo! walks in and says "I want all of your remaining space. Right now." And poof, it was given to them. There was also a data center company Microsoft bought (I forget who now) with one or more facilities up in the Seattle area where, I believe, they kicked out the tenants of the company they bought so they could take over the facility entirely (I don't recall how much time they gave the customers to GTFO, but I don't recall hearing that they were polite about it).

So often – practically all the time – when I see people talk about cloud, they think the stuff is magical and that no matter how much capacity you need it takes only minutes to be made available (see the Intel slide above). Now, if you are a massive service provider like Amazon, Microsoft or Google, you probably do have 100,000 systems available at any given time. Though the costs of public cloud are.. not something I will dive into again in this post; I have talked about that many times in the past.

Back to Martin’s Presentation

Don't get me wrong – I think Martin is a really smart guy and created a wonderful thing. My issue isn't with SDN itself, it's much more with the marketing and press surrounding it, making it sound like everyone needs this stuff! Buy my gear and get SDN!! You can't build a network today without SDN!! Even my own favorite switching company, Extreme Networks, can't stop talking about SDN.

Networking has been boring for a long time, and SDN is giving the industry something exciting to talk about. Except that it’s not exciting – at least not to me, because I don’t need it.

Anyway, one of Martin's last slides is great as well:

Markitechture war with SDN

Self explanatory – I especially like the SDN/Python point.

Conclusion

I see SDN as having great value primarily for service providers and large-scale operations at this point, especially in situations where providers are provisioning dedicated network resources for each customer (network virtualization works great here too).

At some point, perhaps when SDN matures more and becomes more transparent, mere mortals will probably find it more useful. As Martin says in one of his first slides, SDN is not for customers (me?), it's for implementers (that may be me too depending on what he means there, but I think it's more for the tool builders – people who make things like cloud management interfaces, vCenter programmers, etc.).

Don't discount the power/performance benefits of ASICs too much. They exist for a reason: if network manufacturers could build 1U switches that shift 1+Tbps of data around with nothing more than x86 CPUs and a reasonable power budget, I have no doubt they would. Keep this in mind when you think about a network running in software.

If you happen to have a really complicated network then SDN may provide some good value. I haven't worked in such an organization, though my first big network (my biggest) was a bit complicated (though simpler than the network it replaced); I learned some good things from that experience and adapted future designs accordingly.

I'll caveat all of this by saying the network design work I have done has been built for modern web applications. I don't cover ultra-high-security things like, say, processing credit cards (that, IMO, deserves a completely physically separate infrastructure for that subsystem to limit the impact of PCI and other compliance requirements – that said, my first network did process credit cards directly, though this was before PCI compliance existed, and there were critical flaws in the application with regards to credit card security at the time as well). Things are simple and fairly scalable (it's not difficult to get to low thousands of systems easily, and that already eclipses the networks of most organizations out there by a big margin).

I believe that if you're constantly making changes to your underlying L2/L3 network (other than, say, adding physical devices to support more capacity), then you probably didn't design it right to begin with (maybe not your fault). If you need to deploy a new network service, just plug it in and go..

For myself – my role has always been a hybrid of server/storage/network/etc. management, so I have visibility into all layers of the application running on the network. Perhaps that makes me better equipped to design things than someone who is in a silo and has no idea what the application folks are doing.

Maybe an extreme example, but now that I wrote that I remember back many years ago we had a customer that was a big telco, and their firewall rule change process was: once a month, a dozen or more people from various organizations (internal + external) get on a conference call to coordinate firewall rule changes (and to test connectivity afterwards). It was pretty crazy to see. You probably would have had to get the telco's CEO's approval to get a firewall change in outside that window!

Before I go let me give a shout out to my favorite L3 switching fault tolerance protocol: ESRP.

I suppose the thing I hesitate most about with this post is paranoia about missing some detail that invalidates every network design I've ever done and makes me look like even more of an idiot than I already am!! Though I have talked with enough network people over the years that I don't believe that will happen..

If you're reading this and are intimately familiar with an organization that takes 2-3 weeks to spin up a network service, I'd like to hear from you (publicly or privately) about what specifically takes the bulk of that time. Anonymous is fine too, and I won't write anything about it if you don't want me to. I suspect the bulk of the time is red tape – processes, approvals, etc. – and not the technology.

So, thanks Martin for answering my questions at the conference last week! (I wonder if he will read this.. some folks have Google alerts for things that are posted about them.) If you are reading this and wondering – yes, I really have been a VMware customer for 14 years, going back to pre-1.0 days when I was running VMware on top of Linux. I still have my CD of VMware 1.0.2 around here somewhere – I think that was the first physical media they distributed. Though my loyalty to VMware has eroded significantly in recent years for various reasons.

August 2, 2013

HP Storage Tech Day – bits and pieces

Filed under: Storage — Tags: , , — Nate @ 9:56 am

Travel to HP Storage Tech Day/Nth Generation Symposium was paid for by HP; however, no monetary compensation is expected nor received for the content that is written in this blog.

For my last post on HP Storage tech day, the remaining topics that were only briefly covered at the event.

HP Converged Storage Management

There wasn't much here other than a promise to build YASMT (Yet Another Storage Management Tool) – this time it will be really good, though. HP sniped at EMC on several occasions for the vapor-ness of ViPR, though at least that is an announced product with a name. HP has a vision, no finalized name, no product (I'm sure they have something internally) and no dates.

I suppose if you're in the software-defined storage camp that favors separating the data and control planes, this may be HP's answer to that.

HP Converged Storage Management Strategy

The vision sounds good, as always; time will tell if they can pull it off. The track record for products like this is not good. More often than not such tools lower the bar on what is supported to some really basic set of features, and are not able to exploit the more advanced capabilities of the platforms under management.

One question I did ask is whether or not they were going to re-write their own tools to leverage these new common APIs, and the answer was sort of what I expected – they aren't. At least short term, the tools will use a combination of the new APIs and whatever methods they use today. This implies that only a subset of functionality will be available via the APIs.

In contrast, I recall reading something, perhaps a blog post, about how NetApp's tools use all of their common APIs (I believe end-to-end API coverage is fairly recent for them). HP may take a couple of years to get to something like that.

HP Openstack Integration

HP is all about Openstack. They seem to be living and breathing it. This is pretty neat – I think Openstack is a good movement, though the platform still needs some significant work to mature.

I have concerns – short-term concerns – about HP's marketing around Openstack and how easy it is to integrate into customer environments. Specifically, Openstack is a fast-moving target, lacks maturity, and at least as recently as earlier this year lacked a decent community of IT users (most of it was centered on developers – probably still is). HP's response is that they are participating deeply within the community (which is good long term) and are being open about everything (also good).

I specifically asked if HP was working with Red Hat to make sure the latest HP contributions (such as 3PAR support and Fibre Channel support) were included in the Red Hat Openstack distribution. They said no – they are working with the community, not with partners. This is of course good and bad: good in that they are being open, bad in that it may result in some users not getting things for 12-24 months because the distribution of Openstack they chose is too old to support them.

I just hope that Openstack matures enough that it gets a stable set of interfaces – unlike, say, the Linux kernel driver interfaces, which just annoy the hell out of me (have written about that before). Compatibility, people!!!

Openstack Fibre Channel support based on 3PAR

HP wanted to point out that the Fibre Channel support in Openstack was based on 3PAR. It is a generic interface and there are plugins for a few different array types. 3PAR also gained iSCSI support for Openstack as of a recent 3PAR software release.

StoreVirtual was first Openstack storage platform

Another interesting tidbit is that StoreVirtual was the first(?) storage platform to support Openstack. Rackspace used it (maybe still does, not sure) and contributed some stuff to make it better. HP also uses it in their own public cloud (not sure if they mentioned this or not, but I heard it from a friend who used to work in that group).

HP Storage with Openstack

Today HP integrates with Openstack at the block level on both the StoreVirtual and 3PAR platforms. Work is in progress for StoreAll, which will provide file and object storage. As far as HP goes, Fibre Channel support is available on the 3PAR platform only; StoreVirtual supports Fibre Channel, but not with Openstack (yet, anyway – I assume support is coming).

This contrasts with the competition, most of whom have no Openstack support and haven’t announced anything to be released anytime soon. HP certainly has a decent lead here, which is nice.

HP Openstack iSCSI/FC driver functionality

All of HP's storage work with Openstack is based on the Grizzly release, which came out around April 2013. The drivers currently support the operations below (a short usage sketch follows the list):

  • Create / Delete / Attach / Detach volumes
  • Create / Delete Snapshots
  • Create volume from snapshot
  • Create cloned volumes
  • Copy image to volume / Copy volume to image (3PAR iSCSI only)
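For what it's worth, from the consumer side these operations map onto the standard Cinder client calls. The sketch below assumes the Grizzly-era python-cinderclient v1 interface (argument names changed in later API versions) and a made-up Keystone endpoint:

from cinderclient import client

cinder = client.Client('1', 'admin', 'secret', 'demo',
                       'http://keystone.example.com:5000/v2.0')   # hypothetical endpoint

vol   = cinder.volumes.create(size=10, display_name='app-data')            # create volume
snap  = cinder.volume_snapshots.create(vol.id, display_name='app-snap')    # create snapshot
clone = cinder.volumes.create(size=10, snapshot_id=snap.id)                # volume from snapshot
# attach/detach is normally driven via Nova, which calls back into the Cinder driver
cinder.volumes.delete(clone)                                               # delete volume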

New things coming in Havana release of Openstack from HP Storage

  • Better session management within the HP 3PAR StoreServ Block Storage Drivers
  • Re-use of existing HP 3PAR Host entries
  • Support multiple 3PAR Target ports in HP 3PAR StoreServ Block Storage iSCSI Driver
  • Support Copy Volume To Image & Copy Image To Volume with Fibre Channel Drivers (Brick)
  • Support Quality of Service (QoS) setting in the HP 3PAR StoreServ Block Storage Drivers
  • Support Volume Sets with predefined QoS settings
  • Update the hp3parclient that is part of the Python Standard Library

Fibre channel enhancements for Havana and beyond

Fibre Channel enhancements for Openstack Havana and beyond


Openstack portability

This was not at the Storage Tech Day, but I was at a breakout session at the conference that talked about HP and Openstack, and one of the key points they hit on was the portable nature of the platform: run it in-house, run it in CloudSystem, run it at service providers, and move your workloads between them with the same APIs.

Setting aside for a moment the fact that the concept of cloud bursting is a fantasy for 99% of organizations out there (your applications have to be able to cope with it – you're not likely going to be able to scale your web farm by bursting into a public cloud when those web servers have to hit databases that reside over a WAN connection; the latency hit will make for a terrible experience)...

Anyway, setting that concept aside – you still have a serious short-term compatibility problem across different Openstack implementations, because different vendors are choosing different releases to base their systems on. This is obviously due to the fast-moving nature of the platform and to when each vendor decides to launch their product.

This should stabilize over time, but I felt the marketing message was a nice vision that just didn't represent any reality I am aware of today.

HP contrasted this with being locked in to, say, the vCloud API. I think there are more clouds, public and private, using vCloud than Openstack at this point. But in any case, I believe the use case of the common IT organization transparently leveraging these APIs to burst onto any platform – VMware, Openstack, whatever – is still years away from reality.

If you use Openstack's API, you're locked into their API anyway. I don't leverage APIs myself (directly) – I am not a developer – so I am not sure how easy it is to move between them. I think the APIs are likely much less of a barrier than the feature set of the underlying cloud in question: who cares if the API can do X and Y if the provider's underlying infrastructure doesn't yet support that capability?

One use case that could be done today, which HP cited, is running development in a public cloud and then pulling it back in-house via the APIs. Still, that one is not compelling either. The amount of work involved in rebuilding such an environment internally should be fairly trivial anyway (the bulk of the work should be in the system configuration area – if you're using cloud you should also be using some sort of configuration management tool, whether it is CFEngine, Puppet, Chef, or something similar). And – this is important in my experience – development environments tend not to be resource intensive, which makes them great candidates to consolidate onto internal resources (even ones shared with production – I have been doing this for six years already).

My view on Openstack

At least one person at HP I spoke with believes most of this will be there by the end of the year, but I don't buy that for a second. I look at things like Red Hat's own Openstack distribution taking seemingly forever to come out (I believe it's a few months behind already, and I have not seen recent updates on it), and Rackspace abandoning its promise to support third-party Openstack clouds earlier this year.

All of what I say is based on what I read – I have no personal experience with Openstack (nor do I plan to get any immediately; the lack of maturity is keeping me away for now). Based on what I have read, on conferences (I was at a local Red Hat conference last December where they covered Openstack – that's when reality really hit me, I learned a good deal about it and honestly lost some enthusiasm for the project), and on folks I have chatted/emailed with, Openstack is still VERY much a work in progress, evolving quickly. There's really no formal support community in place for a stable product; developers want to stay at the leading edge and that's what they are willing to support. Red Hat is off in one corner trying to stabilize the Folsom release from last year to make a product out of it, while HP is in another corner contributing code to the latest versions of Openstack that may or may not be backwards compatible with Red Hat or other implementations.

It's a mess.. it's still a young project, so that's sort of to be expected, though there are a lot of folks making noise about it. The sense I get is that if you are serious about running an Openstack cloud today – as in right now – you had best have some decent developers in-house to help manage and maintain it. When Red Hat comes out with their product it may solve a bunch of those issues, but it will still be a "1.0", and there's always some not-insignificant risk in investing in that without a very solid support structure inside your organization (Red Hat will of course provide support, but I believe that won't be enough for most).

That being said, it sounds like Openstack has a decent future ahead of it – with such a large number of industry players adopting it, it's really only a matter of time before it matures and becomes a solid platform for the common IT organization to deploy.

How much time? I'm not sure. My best guesstimate is that I hope it can reach that goal within five years – Red Hat and others should be on version 3, perhaps 4, by then. I could see someone such as myself starting to seriously dabble in it in the next 12-16 months.

Understand that I’m setting the bar pretty high here.

Last thoughts on HP Storage Tech Day

I had a good time and thought it was a great experience. They had very high caliber speakers, it was well organized, and the venue was awesome as well. I was able to drill them pretty good, and the other bloggers seemed to appreciate that I was able to drive some of the technical conversations. I'm sure there were some questions they would rather not have answered, since the answers weren't always "yes, we've been doing that forever!", but they were honest and up front about everything. When they could not be, they said so ("can't talk about that here, we need a Nate Disclosure Agreement").

I haven't dealt much with other groups inside HP, but I can say the folks I have dealt with on the storage side have all been AWESOME. Regardless of what I think about whatever storage product they are involved with, they are all wonderful people, both personally and professionally.

These past few posts have been entirely about what happened on Monday. There is more that happened at the main conference Tuesday through Thursday, and once I get those slide decks I'll be writing more about it – there were some pretty cool speakers. I normally steer far clear of such events, but this one was pretty amazing. I'll save the details for the next posts.

I want to thank the team at HP and Ivy Worldwide for organizing/sponsoring this event – it was a day of nothing but storage (and we literally ran out of time at the end; one or two topics had to be skipped). It was pretty cool. This is the first event I've ever traveled for, and the only event with some level of sponsorship (as mentioned, HP covered travel, lodging and food costs).
