
August 29, 2011

Farewell Terremark – back to co-lo

Filed under: General, Random Thought, Storage, Virtualization — Nate @ 9:43 pm

I mentioned not long ago that I was going co-lo once again. I was co-lo for a while for my own personal services, but then my server started to act up (the server would be 6 years old if it were still alive today) with disk "failure" after failure (or at least that's what the 3ware card was predicting; eventually it stopped complaining and the disk never died again). So I thought: do I spend a few grand to buy a new box, or go "cloud"? I knew up front cloud would cost more in the long run, but I went cloud anyways as a stop-gap. I picked Terremark because it had the highest quality design at the time (it still does).

During my time with Terremark I never had any availability issues. There was one day with some high latency on their 3PAR arrays, though they found and fixed whatever it was pretty quickly (it didn't impact me all that much).

I had one main complaint with regards to billing: they charge $0.01 per hour for each open TCP or UDP port on their system (about $7.20 per port per month), and they have no way of doing 1:1 NAT. For a web server this is no big deal, but I needed a half dozen or more ports open per system (mail, DNS, VPN, SSH, etc.) even after cutting the ports I might not need, so it starts to add up. Indeed, about 65% of my monthly bill ended up being these open TCP and UDP ports.

Once both of my systems were fully spun up (the 2nd system only recently, as I was too lazy to move it off of co-lo) my bill was around $250/mo. My previous co-lo was around $100/mo, and I think I had them throttle me to 1Mbit of traffic (this blog was never hosted at that co-lo).

The one limitation I ran into on their system was that they could not assign more than one IP address for outbound NAT per account. In order to run SMTP I needed each of my servers to have its own unique outbound IP, so I had to make a 2nd account to run the 2nd server. Not a big deal for me (it ended up being a pain for them, since their system wasn't set up to handle such a situation), since I only ran two servers and the communication between them was minimal.

As I've mentioned before, the only part of the service that was truly "bill for what you use" was bandwidth, and for that I was charged between 10 and 30 cents/month for my main system and 10 cents/month for my 2nd system.

Oh, and they were more than willing to set up reverse DNS for me, which was nice (and required for running a mail server, IMO). I had to agree to a lengthy little contract saying I wouldn't spam in order for them to open up port 25. Not a big deal. The IP addresses were "clean" as well; no worries about blacklisting.

Another nice thing, had they offered it, would have been billing based on resource pools; as usual they charge for what you provision (per VM) instead of what you use. When I talked to them about their enterprise cloud offering they charged for the resource pool (unlimited VMs in a given amount of CPU/memory), but this is not available on their vCloud Express platform.

It was great to be able to VPN to their systems to use the remote console (after I spent an hour or two determining the VPN was not going to work in Linux, despite my best efforts to extract Linux versions of the VMware console plugin and try to use it). Mount an ISO over the VPN and install the OS; that's how it should be. I didn't need the functionality, but I don't doubt I would have been able to run my own DHCP/PXE server there as well if I wanted to install additional systems in a more traditional way. Each user gets their own VLAN, protected by a Cisco firewall and load balanced by a Citrix load balancer.

A couple of months ago the thought came up again of off site backups. I don’t really have much “critical” data but I felt I wanted to just back it all up, because it would be a big pain if I had to reconstruct all of my media files for example. I have about 1.7TB of data at the moment.

So I looked at various cloud systems, including Terremark, but it was clear pretty quickly that no cloud company was going to offer this service in a cost effective way, so I decided to go co-lo again. Rackspace was a good example; they have a handy little calculator on their site. This time around I went and bought a new, more capable server.

So I went to a company I used to buy a ton of equipment from in the bay area, and they hooked me up with not only a server with ESXi pre-installed but also co-location services (with "unlimited" bandwidth) and on-site support, all for a good price. The on-site support is mainly because I'm using their co-location services (which is itself a co-lo inside Hurricane Electric) and their techs visit the site frequently as-is.

My server is a single-socket quad core box with 4x2TB SAS disks (~3.6TB usable, which conveniently matches my usable disk space at home). I went SAS because VMware doesn't support VMFS on SATA (though technically you can do it), and the price premium for SAS wasn't nearly as high as I was expecting. It has a 3ware RAID controller with battery-backed write-back cache, a little USB stick for ESXi (I'd rather have ESXi on the HDD, but 3ware is not supported for booting ESXi), 8GB of registered ECC RAM, and redundant power supplies. It also has decent remote management: a web UI, remote KVM access, remote media, etc. For co-location I asked for (and received) 5 static IPs: 3 for VMs, 1 for ESX management, and 1 for out-of-band management.

My bandwidth needs are really tiny, typically 1GB/month, though now with off-site backups that may go up a bit (in bursts). The only real drawback to my system is that the SAS card does not have full integration with vSphere, so I have to use a CLI tool to check the RAID status; at some point I'll need to hook up Nagios again and run a monitor to check on it. Normally I set up the 3ware tools to email me when bad things happen, which is pretty simple, but that's not possible when running vSphere.
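
For reference, checking on the array by hand with the 3ware CLI looks something like this. This is just a sketch; I'm assuming the standard tw_cli tool here, and the controller/unit IDs, install path, and exact output format will vary:

# show the first controller, its units and drives
/opt/3ware/cli/tw_cli /c0 show

# just the status of the first RAID unit (OK, REBUILDING, DEGRADED, ...)
/opt/3ware/cli/tw_cli /c0/u0 show status

# crude nagios-style wrapper: exit 0 when healthy, 2 otherwise
status=$(/opt/3ware/cli/tw_cli /c0/u0 show status | awk '/[Ss]tatus/ {print $NF}')
[ "$status" = "OK" ] && exit 0 || { echo "RAID unit u0: $status"; exit 2; }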

The amount of storage on this box I expect to last me a good 3-5 years. The 1.7TB includes every bit of data that I still have, going back a decade or more; I'm sure there are a couple hundred gigs at least I could outright delete because I may never need them again. But right now I'm not hurting for space, so I keep it all there, online and accessible.

My current setup

  • One ESX virtual switch on the internet, with two systems on it: a bridging OpenBSD firewall and a Xangati system sniffing packets (still playing with Xangati). No IP addresses are used here.
  • One ESX virtual switch for one internal network. The bridging firewall has another interface here, my two main internet-facing servers have interfaces here, and the firewall has one more interface here for its own management. Only public IPs are used here.
  • One ESX virtual switch for another internal network, for things that will never have public IP addresses associated with them. I run NAT on the firewall (on its 3rd/4th interfaces) for these systems to get internet access; see the pf sketch after this list.
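
For the curious, the NAT piece of that last vswitch in pf looks roughly like this (OpenBSD 4.7+ syntax; the interface names and subnet here are made up for illustration):

# /etc/pf.conf fragment (hypothetical interface names and subnet)
ext_if  = "vic1"             # interface with a public IP
int_if  = "vic3"             # 3rd interface, the non-routable network
int_net = "172.16.0.0/24"

# rewrite outbound traffic from the private network to the public address
match out on $ext_if from $int_net nat-to ($ext_if)
pass in on $int_if from $int_net to any keep state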

I have a site-to-site OpenVPN connection between my OpenBSD firewall at home and my OpenBSD firewall on the ESX system, which gives me the ability to directly access the back-end, non-routable network on the other end.
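
The site-to-site piece is plain OpenVPN in static-key point-to-point mode. A minimal sketch of the home side (the hostname, tunnel addresses, and back-end subnet are placeholders):

# /etc/openvpn/colo.conf on the home firewall (placeholder values)
dev tun0
proto udp
remote colo.example.com 1194
secret /etc/openvpn/static.key    # same pre-shared key on both ends
ifconfig 10.9.0.1 10.9.0.2        # local and remote tunnel addresses
route 172.16.0.0 255.255.255.0    # the non-routable network at the co-lo
keepalive 10 60

The co-lo side is the mirror image: no "remote" line (it just listens), the ifconfig addresses swapped, and a route back to the home LAN.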

Normally I wouldn't deploy an independent firewall, but I did in this case because, well, I can. I like OpenBSD's pf more than iptables (which I hate), this gives me a chance to play around more with pf, and it gives me more freedom on the Linux end to fire up services on ports that I don't want exposed without having to firewall them off individually. It allows me to be more lazy in the long run.

I bought the server before I moved. Once I got to the bay area I picked it up and kept it over a weekend to copy my main data set to it, then took it back; they hooked it up again and I switched my systems over to it.

The server was about $2,900 with one year of support, and co-location is about $100/mo. On disk space alone, the first year (taking into account the cost of the server) comes to about $0.09 per GB per month (for 3.6TB), with subsequent years at about $0.033 per GB per month (I took a swag at the support cost for the 2nd year, so that is included). That doesn't even count the virtual machines themselves and the cost savings there over any cloud. And I'm giving the cloud the benefit of the doubt by ignoring their bandwidth charges and looking only at the cost of capacity. If I were using the cloud I probably wouldn't allocate all 3.6TB up front, but even at 1.8TB, which is about what I'm using now with my VMs and stuff, the cost still handily beats everyone out there.
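
The back-of-the-envelope math, in case anyone wants to check it (the ~$22/mo support renewal in the second calculation is my swag, as noted):

# first year: server w/support amortized over 12 months, plus co-lo, per usable GB
$ echo 'scale=3; (2900 + 100*12) / 12 / (3.6*1024)' | bc
.092

# subsequent years: co-lo plus the guessed ~$22/mo support renewal
$ echo 'scale=3; (100 + 22) / (3.6*1024)' | bc
.033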

What's most crazy is that I lack the purchasing power of any of these clouds; I'm just a lone consumer that bought one server. Granted, I'm confident the vendor I bought from gave me excellent pricing due to my past relationship, though probably still not on the scale of the likes of Rackspace or Amazon, and yet I can handily beat their costs without even working for it.

What surprised me most during my cost analysis of the "cloud" is how cheap enterprise storage is. I mean, Terremark charges $0.25/GB per month (on SATA-powered 3PAR arrays) and Rackspace charges $0.15/GB per month (I believe Rackspace just uses DAS). I kind of would have expected the enterprise storage route to cost, say, 3-5x more, not less than 2x. When I was doing real enterprise cloud pricing, storage for the solution I was looking at typically came in at 10-20% of the total cost, with 80%+ of the cost being CPU + memory. For me it's a no-brainer: I'd rather pay a bit more and have my storage on a 3PAR, of course (when dealing with VM-based storage, not bulk archival storage). But with the average cost of my storage for 3.6TB over 2 years coming in at $0.06/GB, it makes more sense to just do it myself.

I just hope my new server holds up. My last one lasted a long time, so I sort of expect this one to last a while too; it got burned in before I started using it and the load on the box is minimal. I would not be too surprised if I can get 5 years out of it. How big will HDDs be in 5 years?

I will miss Terremark because of the reliability and availability features they offer; they have a great service, and now of course they are owned by Verizon. I don't need to worry about upgrading vSphere any time soon, as there's no reason to go to vSphere 5. The one thing I have been contemplating is whether or not to put my vSphere management interface behind the OpenBSD firewall (which is, of course, a VM on the same box). Kind of makes me miss the days of ESX 3, when it had a built-in firewall.

I'm probably going to have to upgrade my cable internet at home. Right now I only have 1Mbps upload, which is fine for most things, but if I'm doing off-site backups too I need more performance. I can go as high as 5Mbps with a more costly plan: 50 meg down / 5 meg up for about $125, but I might as well go all in and get 100 meg down / 5 meg up for $150. Both plans have a 500GB cap with a $0.25/GB charge for going over, which seems reasonable. I certainly don't need that much downstream bandwidth (not even 50Mbps; I'd be fine with 10Mbps), but I really do need as much upstream as I can get. Another option could be driving a USB stick to the co-lo, which is about 35 miles away. I suppose that is a possibility, but kind of a PITA given the distance, though if I got one of those 128GB+ flash drives it could be worth it. I've never tried hooking up USB storage to an ESX VM before; assuming that works? Hmmm.
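
For perspective on why the upstream matters (rough math, ignoring protocol overhead):

# days to seed the initial 1.7TB backup over a 5Mbps uplink
$ echo 'scale=1; 1.7*1024*1024*8 / 5 / 60 / 60 / 24' | bc
33.0

In other words, about a month of continuous uploading just for the initial seed, which is why the USB stick idea doesn't sound so crazy.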

Another option I have is AT&T Uverse, which I've read good and bad things about, but looking at their site, their service is slower than what I can get through my local cable company (which truly is local; they only serve the city I am in). Another reason I didn't go with Uverse for TV is that, due to the technology they are using, I suspected it is not compatible with my Tivo (with cable cards). AT&T doesn't mention their upstream speeds specifically, so I'll contact them and try to figure that out.

I kept the motherboard/CPUs/RAM from my old server; my current plan is to mount the board to a piece of wood and hang it on the wall as some sort of art. It has lots of colors and little things to look at; I think it looks cool, at least. I'm no handyman, so hopefully I can make it work. I was honestly shocked at how heavy the copper (I assume) heatsinks were. Wow, they felt like 1.5 pounds apiece. Massive.

While my old server is horribly obsolete, one thing it has over my new server is memory capacity. The old server could go up to 24GB (I had a max of 6GB in it at the time); the new server tops out at 8GB (and has 8GB in it). Not a big deal, as I don't need 24GB for my personal stuff, but I thought it was an interesting comparison.

This blog has been running on the new server for a couple of weeks now. One of these days I need to hook up some log analysis stuff to see how many dozen hits I get a month.

If Terremark could fix three areas of their vCloud Express service – resource pool-based billing, relaxing the costs of opening multiple ports in the firewall (or just offering 1:1 NAT as an option), and thin provisioning-friendly billing for storage – it would be an even more awesome service than it already is.

January 31, 2011

Terremark snatched by Verizon

Filed under: General, Virtualization — Nate @ 9:34 pm

Sorry, my three readers out there, for not posting recently; I've been pretty busy! And to me there haven't been many events in the tech world in the past month or so interesting enough to write about.

One recent event that did was Verizon’s acquisition of Terremark, a service I started using about a year ago.

I was talking with a friend of mine recently; he was thinking about either throwing a 1U server in a local co-location or playing around with one of the cloud service providers. Since I am still doing both (been too lazy to completely move out of the co-lo…) I gave him my own thoughts, and it made me think more about the cloud in general.

What do I expect from a cloud?

When I'm talking cloud I'm mainly referring to IaaS, or Infrastructure as a Service. Setting aside cost modelling and such for a moment, I expect IaaS to more or less just work. I don't want to have to care about:

  • Power supply failure
  • Server failure
  • Disk drive failure
  • Disk controller failure
  • Scheduled maintenance (e.g. host server upgrades either software or hardware, or fixes etc)
  • Network failure
  • UPS failure
  • Generator failure
  • Dare I say it ? A fire in the data center?
  • And I absolutely want to be able to run whatever operating system I want, and manage it the same way I would manage it if it were sitting on a table in my room or office. That means booting from an ISO image and installing like I would anything else.

Hosting it yourself

I've been running my own servers for my own personal use since the mid-90s. I like the level of control it gives me and the amount of flexibility I have running my own stuff. It also gives me a playground on the internet where I can do things. After multiple power outages over the first part of the decade, one of which lasted 28 hours, and the acquisition of my DSL provider for the ~5th time, I decided to go co-lo. I already had a server, and I put it in a local Tier 2 or Tier 3 data center; I could not find a local Tier 4 data center that would lease me 1U of space. So I lacked:

  • Redundant Power
  • Redundant Cooling
  • Redundant Network
  • Redundant Servers (if my server chokes hard I’m looking at days to a week+ of downtime here)

For the most part I guess I had been lucky; the facility had one, maybe two outages since I moved in about three years ago. The bigger issue was that my server was aging and the disks were failing; it was a pain to replace them, and it wasn't going to be cheap to replace the system with something modern and capable of running ESXi in a supported configuration (my estimates put the cost at a minimum of $4k). Add to that the fact that I need such a tiny amount of server resources.

Doing it right

So I had heard of Terremark from my friends over at 3PAR, and you know I like 3PAR, and they use VMware and I like VMware. So I decided to go with them rather than the other providers out there; they had a decent user interface and I got up and going fairly quickly.

So I've been running on it for almost a year with pretty much no issues. I wish they had a bit more flexibility in the way they provision networking stuff, but nothing is perfect (well, unless you have the ability to do it yourself).

From a design perspective, Terremark has done it right, whether it's providing an easy-to-use interface to provision systems, using advanced technology such as VMware, 3PAR, and NetScaler load balancers, or building their data centers to withstand even a fire.

Having the ability to do things like vMotion or Storage vMotion is absolutely critical for a service provider; I can't imagine anyone being able to run a cloud without such functionality, at least with a diverse set of customers. Having things like 3PAR's persistent cache is critical as well, to keep performance up in the event of planned or unplanned downtime in the storage controllers.

I look forward to the day where the level of instrumentation and reporting in the hypervisors allow billing based on actual usage, rather than what is being provisioned up front.

Sample capabilities

In case you're a less technical user, I wanted to outline a few of the abilities the technology Terremark uses offers their customers.

Memory Chip Failure (or any server component failure or change)

Most modern servers have sensors on them and for the most part are able to accurately predict when a memory chip is behaving badly, and to warn the operator of the machine to replace it. But unless you're running on some very high end specialized equipment (which I assume Terremark is not, because it would cost too much for their customers to bear), the operator needs to take the system offline in order to replace the bad hardware. So what do they do? They tell VMware to move all of the customer virtual machines off the affected server onto other servers. This is done without customer impact; the customer never knows it is going on. The operator can then take the machine offline, replace the faulty components, and reverse the process.

The same applies if you need to:

  • Perform firmware or BIOS updates/changes
  • Perform Hypervisor updates/patches
  • Maybe you're retiring an older type of server and moving to a more modern system

Disk failure

This one is pretty simple: a disk fails in the storage system and the vendor is dispatched to replace it, usually within four hours. But they may opt to wait longer for whatever reason; with 3PAR it doesn't really matter. There are no dedicated hot spares, so you're really in no danger of losing redundancy; the system rebuilds quickly using a many:many RAID relationship and is fully redundant again in a matter of hours (vs. days with older systems and whole-disk-based RAID).

Storage controller software upgrade

There are fairly routine software upgrades on modern storage systems; the feature set seems to just grow and grow. So the ability to perform the upgrade without disrupting users for too long (maybe a few seconds) is really important with a diverse set of customers, because there will probably be no good time at which all customers say "OK, I can have some downtime." Having highly available storage that can maintain performance with a controller offline, by mirroring its cache elsewhere, is a very useful feature.

Storage system upgrade (add capacity)

Being able to add capacity without disruption, and dynamically re-distribute all existing user data across new as well as current disk resources online to maximize performance, is a boon for customers as well.

UPS failure (or power strip/PDU failure)

Unlike the small dinky UPS you may have in your house or office, UPSs in data centers are typically powering up to several hundred machines, so if one fails you may be in for some trouble. But with redundant power you have little to worry about; the other power supply takes over without interruption.

If a server power supply blows up, it can take out the entire branch or even the whole circuit it's connected to. But once again, redundant power saves the day.

Uh-oh I screwed up the network configuration!

Well, now you've done it: you hosed the network (or maybe your system just dropped off the network for some reason, a flaky network driver or something) and you can't connect to your system via SSH or RDP or whatever you were using. Fear not: establish a VPN to the Terremark servers and you can get console access to your system. If only the console worked from Firefox on Linux… can't have everything, I guess. Maybe they will introduce support for vSphere 4.1's virtual serial concentrators soon.

It just works

There are some applications out there that don't need the level of reliability the infrastructure Terremark uses can provide, and that prefer to distribute things over many machines or many data centers; that's fine too. But most apps, almost all apps in fact, make the same common assumption (perhaps you can call it the lazy assumption): they assume it will just work. Which shouldn't surprise anyone, because achieving that level of reliability at the application layer alone is an incredibly complex task to pull off. So instead you have multiple layers of reliability under the application, each handling a subset of availability, layers that have been evolving for years or even decades in some cases.

Terremark just works. I'm sure there are other cloud service providers out there that work too; I haven't used them all by any stretch (nor am I seeking them out, for that matter).

Public clouds make sense, as I've talked about in the past, for a subset of functionality; they have a very long way to go to replace what you can build yourself in a private cloud (assuming anyone ever gets there). For my own use case, this solution works.

May 3, 2010

Terremark vCloud Express: First month

Filed under: Datacenter, Virtualization — Nate @ 3:02 pm

Not much to report. I got my first bill for my first "real" month of usage (minus DNS; I haven't gotten around to transferring DNS yet, but I do have the ports opened).

$122.20 for the month which included:

  • 1 VM with 1VPU/1.5GB/40GB – $74.88
  • 1 External IP address – $0.00 (which is confusing; I thought they charged per IP)
  • TCP/UDP ports – $47.15
  • 1GB of data transferred – $0.17

Kind of funny: the one thing that is charged as I use it (the rest being charged as I provision it) is the one I pay less than a quarter for. Obviously I slightly overestimated my bandwidth usage. And I'm sure they round up to the nearest GB, as I don't believe I even transferred 1GB during the month of April.

I suppose the one positive thing from a bandwidth and cost standpoint is that if I ever wanted to route all of my internet traffic from my cable modem at home through my VM (over VPN), for paranoia or security purposes, I could. I believe Comcast caps bandwidth at ~250GB/mo or something, which would be about $42/mo if I maxed it out (but believe me, my home bandwidth usage is trivial as well).

Hopefully this coming weekend I can get around to assigning a second external IP, mapping my DNS to it, and moving some of my domains over to this cloud instead of keeping them hosted on my co-located server. I've just been really busy recently.

April 3, 2010

Terremark vCloud Express: Day 1

Filed under: Virtualization — Nate @ 12:19 pm

You may have read another one of my blog entries, "Why I hate the cloud"; I also mentioned how I've been hosting my own email/etc. for more than a decade in "Lesser of two evils".

So what’s this about? I still hate the cloud for any sort of large scale deployment, but for micro deployments it can almost make sense. Let me explain my situation:

About 9 years ago the ISP I used to help operate more or less closed shop, and I relocated what was left of the customers to a dedicated little server on my home DSL line (1Mbps/1Mbps, 8 static IPs). My ISP got bought out, then got bought out again, and started jacking up the rates (from $20/mo to ~$100/mo, plus ~$90/mo for Qwest professional DSL). Hosting at my apartment was convenient, but at the same time it was a sort of ball and chain, as it made it very difficult to move. Coordinating the telco move and the ISP move with minimal downtime… let's just say with DSL that's about impossible. I managed to mitigate one move in 2001 by temporarily locating my servers at my "normal" company's network for a few weeks while things got moved.

A few years ago I was also hit with what turned out to be a 27-hour power outage (despite being located in a downtown metropolitan area; everyone got hit by that storm). Shortly after that I decided a co-location was the best fit for me longer term. So phase one was to virtualize the pair of systems in VMware. I grabbed an older server I had laying around and did that, and ran it for a year. It worked great (though the server was really loud).

Then I got another email saying my ISP was bought out yet again, and this time the company was going to force me to change my IP addresses, which, when you're hosting your own DNS, can be problematic. So that was the last straw. I found a nice local company to host my server at a reasonable price. The facility wasn't world class by any stretch, but the world class facilities in the area had little interest in someone wanting to host a single 1U box that averages less than 128kbps of traffic at any given time. But it would do for now.

I run my services on a circa-2004 dual Xeon system, with 6GB memory and ~160GB of disk on a 3ware 8006-2 RAID controller (RAID 1). I absolutely didn't want to go to one of those cheap crap hosting providers with massive downtime and no SLAs. I also had absolutely no faith in the earlier generation of "UML" VMs (yes, I know Xen and UML aren't the same, but I trust them the same amount: none). My data and privacy are fairly important to me and I am willing to pay extra to try to maintain them.

So early last year my RAID card told me one of my disks was about to fail and to replace it; I did, rebuilt the array, and off I went again. A few months later the RAID card told me another disk was about to fail (there are only two disks in this system), so I replaced that disk, rebuilt, and off I went. Then a few months later, the RAID card again said a disk was not behaving right and I should replace it. Three disk replacements in less than a year. Though really it's been two; I've ignored the most recent failing drive for several months now. Media scans return no errors, but RAID integrity checks always fail, causing a RAID rebuild (this happens once a week). Support says the disk is suffering from timeouts. There is no backplane in the system (and thus no hot swap, making disk replacements difficult). Basically, I'm getting tired of maintaining hardware.

I looked at the cost of a good quality server with hot swap, remote management, etc., something that can run ESX, and the cost is $3-5k. I could go $2-3k and stick with VMware Server on top of Debian; a local server manufacturer has their headquarters literally less than a mile from my co-location, so it is tempting to stick with doing it on my own, and if my needs were greater I would for sure. Cloud does not make sense in most cases, in my opinion, but in this case it can.

If I try to price out a cloud option matching that $3-5k server, purely from a CPU/memory perspective, the cloud option would be significantly more. But I looked closer, and I really don't need that much capacity for my stuff. My current VMware host runs at ~5-8% CPU usage on average on six year old hardware. I have 6GB of RAM but I'm only using 2-3GB at best. Storage is the biggest headache for me right now hosting my own stuff.

So I looked to Terremark, who seem to have a decent operation going; for the most part they know what they are doing (they still make questionable decisions, though I think most of those are not made by the technical teams). I looked to Terremark for a few reasons:

  • Enterprise storage either from 3PAR or EMC (storage is most important for me right now given my current situation)
  • Redundant networking
  • Tier IV facilities (my current facility lacks true redundant power and they did have a power outage last year)
  • Persistent, fiber-attached storage: no local storage, no cheap iSCSI, no NFS, no crap RAID controllers, and no need to worry about using APIs and other special tools to access storage; it is as if it were local
  • Fairly nice user interface that allows me to self provision VMs, IPs etc

Other things they offer that I don't care about for this situation (in other situations they could come in real handy):

  • Built in load balancing via Citrix Netscalers
  • Built in firewalls via Cisco ASAs

So for me, a meager configuration of 1 vCPU, 1.5GB of memory, and 40GB of disk space with a single external static IP is a reasonable cost (pricing is available here):

  • CPU/Memory: $65/mo [+$1,091/mo if I opted for 8 cores and 16GB RAM]
  • Disk space: $10/mo [+$30/mo if I wanted 160GB of disk space]
  • 1 IP address: $7.20/mo
  • 100GB data transfer: $17/mo (bandwidth is cheap at these levels, so I just picked a round number)
  • Total: $99/mo

Which comes to about the same as what I'm paying in co-location fees now. If that were all the costs I'd sign up in a second, but unfortunately their model has a significant premium on "IP services", when ideally what I'd like is just a flat layer 3 connection to the internet. The charge is $7.20/mo for each TCP or UDP port you need opened to your system, so for me:

  • HTTP – $7.20/mo
  • HTTPS – $7.20/mo
  • SMTP – $7.20/mo
  • DNS/TCP – $7.20/mo
  • DNS/UDP – $7.20/mo
  • VPN/UDP – $7.20/mo
  • SSH – $7.20/mo
  • Total: $50/mo

And I'm being conservative here; I could be opening up:

  • POP3
  • POP3 – SSL
  • IMAP4
  • IMAP4 – SSL
  • Identd
  • Total: another $36/mo

But I'm not, for now. Then you can double all of that for my 2nd system, so assuming I do go forward with deploying the second system, my total cost (including those extra ports) is roughly $353/mo (not counting a second 100GB/mo of bandwidth). Extrapolate that out three years:

  • First year: $4,236 ($353/mo)
  • First two years: $8,472
  • First three years: $12,708

Compared to doing it on my own:

  • First year: ~$6,200 (with new $5,000 server)
  • First two years: ~$7,400
  • First three years: ~$8,600

And if you really want to see how this cost structure doesn't scale, let's take a more apples-to-apples comparison of CPU/memory with what I'd have in my own server and put it in the cloud:

  • First year – $15,328 [8 cores, 16GB RAM, 160GB disk]
  • First two years – $30,657
  • First three years – $45,886

As you can see the model falls apart really fast.

So clearly it doesn’t make a lot of sense to do all of that at once, so if I collapse it to only the essential services on the cloud side:

  • First year: $3,420 ($270/mo)
  • First two years: $6,484
  • First three years: $9,727

I could live with that over three years, especially if the system is reliable and maintains my data integrity. But if they added just one feature for lil ol' me, that feature would be a "forwarding VIP" on their load balancers: basically, just forward everything from this IP to this internal IP. I know their load balancers can do it (see the sketch after the cost figures below); it's just a matter of exposing the functionality. This would dramatically impact the costs:

  • First year: $2,517 ($210/mo)
  • First two years: $5,035
  • First three years: $7,552
  • First four years: $10,070
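
To illustrate what I mean by a forwarding VIP: on a stock NetScaler it would be something along these lines (addresses made up, and I'm going from memory on the CLI, so treat it as a sketch rather than gospel):

# wildcard service pointing at the VM's internal address, any protocol/port
add service vm1-any 10.0.0.10 ANY *
# public-facing wildcard virtual server, forwarding everything to that service
add lb vserver forward-vip ANY 203.0.113.10 *
bind lb vserver forward-vip vm1-any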

You can see how the model doesn't scale. I am talking about 2 vCPUs' worth of power and 3GB of memory, compared to at least an 8-12 core physical server and 16GB or more of memory if I did it myself. But again, I have no use for that extra capacity if I do it myself, so it would just sit idle, like it does today.

CPU usage is higher than I mentioned above, I believe because of a bug in VMware Server 2.0 that causes CPU usage to "leak" somehow, resulting in a steady, linear increase over time. I reported it to the forums but didn't get a reply, and I don't care enough to try to engage VMware support; they didn't help me much with ESX and a support contract, so they would do even less for VMware Server and no support contract.

I signed up for Terremark’s vCloud Express program a couple of months ago, installed a fresh Debian 5.0 VM, and synchronized my data over to it from one of my existing co-located VMs.

So today I have officially transferred all of my services (except DNS) from one of my two co-located VMs to Terremark, and will run it for a while to see how the costs are, how it performs, reliability, etc. My co-location contract is up for renewal in September, so I have plenty of time to determine whether or not I want to make the jump. I'm hoping I can make it work, as it will be nice to not have to worry about hardware anymore. An excerpt from that link:

[..] My pager once went off in the middle of the night, bringing me out of an awesome dream about motorcycles, machine guns, and general ass-kickery, to tell me that one of the production machines stopped responding to ping. Seven or so hours later, I got an e-mail from Amazon that said something to the effect of:

There was a bad hardware failure. Hope you backed up your shit.

Look at it this way: at least you don’t have a tapeworm.

-The Amazon EC2 Team

I'll also think long and hard, and probably consolidate both of my co-located VMs into a single VM at Terremark if I do go that route, which will save me a lot. I really prefer two VMs, but I don't think I should be charged double for two, especially when two are going to use roughly the same amount of resources as one. They talk all about "pay for what you use", but that is not correct: the only portion of their service that is pay-for-what-you-use is bandwidth. Everything else is "pay as you provision". So if you provision 100GB and a 4-CPU VM but never turn it on, well, you're still going to pay for it.

The model needs significant work. Hopefully it will improve in the future; all of these cloud companies are still trying to figure this stuff out. I know some people at Terremark and will pass this along to them to see what they think. Terremark is not alone in this model; I'm not picking on them for any reason other than that I use their services. I think in some situations it can make sense, but the use cases are pretty limited at this point. You probably know that I wouldn't sign up and commit to such a service unless I thought it could provide some good value!

Part of the issue may very well be limitations in the hypervisor itself with regards to reporting actual usage. As VMware and others improve the instrumentation of their systems, the cost model for customers could improve significantly, perhaps charging for CPU based on a 95th-percentile model, the way we measure bandwidth. They could also do things like cost capping, where if your resource usage runs high for an extended period the provider automatically throttles your system(s) to keep your bill lower (at your request, of course).
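
The 95th-percentile math itself is trivial. Given a file of periodic usage samples, say 5-minute CPU readings, one per line, the billable figure is just (a sketch):

# sort the samples and take the value 95% of the way up; the top 5% is ignored
sort -n cpu_samples.txt | awk '{v[NR] = $1} END {print v[int(NR * 0.95)]}'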

Another idea would be more accurate physical-to-virtual mapping, where I provision, say, 1 physical CPU and X amount of memory and then run unlimited VMs inside that one CPU core and memory. Maybe I just need 1:1, or maybe my resource usage is low enough that I can get 5:1 or 10:1; after all, one of the biggest benefits of virtualization is being able to better isolate workloads. Terremark already does this to some degree in their enterprise products, but this model isn't available for vCloud Express, at least not yet.

You know what surprised me most, next to the charges for IP services? How cheap enterprise storage is for these cloud companies. I mean, $10/mo for 40GB of space on a high end storage array? I can go out and buy a pretty nice server to host VMs at a facility of my choosing, but if I want a nice storage array to back it I'm looking at easily tens of thousands of dollars. I just would have expected storage to be a bigger piece of the pie when it came to overall costs, when in my case it can be as low as 3-5% of the total cost over a 3-year period.

And despite Terremark listing Intel as a partner, my VM happens to be running on, you guessed it, AMD:

yehat:/var/log# cat /proc/cpuinfo
processor    : 0
vendor_id    : AuthenticAMD
cpu family    : 16
model        : 4
model name    : Quad-Core AMD Opteron(tm) Processor 8389
stepping    : 2
cpu MHz        : 2913.037

AMD gets no respect I tell ya, no respect! 🙂

I really want this to work out.
