I noticed a few days ago that IBM posted some new SPC-1 results based on their SVC system, this time using different back end storage: their Storwize product (something I had not heard of before, but I don't pay too close attention to what IBM does; they have so many things it's hard to keep track).
The performance results are certainly very impressive, coming in at over 520,000 IOPS at a price of $6.92 per IOP. These are the sort of results I was expecting from the Hitachi VSP a while back. IBM tested with 1,920 drives, the same number as the 3PAR V800, and bested the 3PAR performance by a good 70,000 IOPS with half the latency, on the same number of disks and less data cache.
The capacity numbers were, and still are, somewhat difficult to interpret; they seem to give conflicting information. IBM is using ~138TB of disk space to protect ~99TB, while 3PAR is using ~263TB of disk space to protect ~263TB. Both results say there is 30TB+ of "unused storage" in that protection scheme.
Bottom line: the IBM box is presented with roughly 280TB of storage, and of that, 100TB is usable, or about 35%. That brings their cost per usable TB to $36,881, versus roughly $12,872 for the 3PAR V800. The V800's I/O cost was $6.59 per IOP, which IBM comes very close to.
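To make the arithmetic explicit, here's a quick sketch using the figures quoted above (the script itself is purely illustrative):

```python
# Back-of-the-envelope math for the capacity utilization claim above.
presented_tb = 280.0   # storage presented to the IBM config
usable_tb = 100.0      # usable space per the disclosure
utilization = usable_tb / presented_tb
print(f"{utilization:.0%} usable")            # ~36%, i.e. "about 35%"

# Relative cost per usable TB, IBM vs the 3PAR V800
ibm_per_tb, v800_per_tb = 36881, 12872
ratio = ibm_per_tb / v800_per_tb
print(f"{ratio:.1f}x the 3PAR cost per TB")   # ~2.9x
```

So IBM ends up at nearly three times the cost per usable TB despite the close $/IOP numbers.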
IBM has apparently gone the same route as HDS, in that the only 3.5" drives they support on their Storwize systems are 3TB SATA disks. They hamper their own cost structure by not supporting larger 3.5" 15k RPM SAS disks, which just doesn't make sense to me. There are 300GB 15k SAS drives out there, and Storwize doesn't support those either (yet, at least).
It took about five pages of scripting to configure the system from the looks of the full disclosure report.
It certainly looks like a halfway decent system. If you compare it to the VSP, for example: it has the same array virtualization abilities via the SVC, it sports almost double the number of disk drives and almost double the raw performance, and configuration at least appears to be less complicated. It uses those power efficient 2.5" disks just like the VSP. It also costs quite a bit less than the VSP on both a per-IOP and per-TB basis, and it appears to have mainframe support for those who need that. Going by Seagate's specs for their 15k RPM disks, the 2.5" drives average about 15% less latency on random reads and writes than their 3.5" counterparts. I thought the difference might be bigger than that, given how much less distance the disk heads have to travel.
If I were in the market for such a big system, these results wouldn't lead me away from 3PAR, at least based on the pricing disclosed for each system (and the level of complexity to configure them). I was interviewing a candidate a few weeks ago who had a strong storage background. Having worked for Symantec, I think, he spent a while doing some sort of storage consulting at various companies. I asked him how he provisioned storage and what his strategies were. His response was quite surprising: he said usually the vendors come out, deploy their systems, and provision everything up front; all he does is carve out LUNs and present them to users. He had never been involved in the architecture planning or deployment of a storage system. He acted as if what he was doing was the standard practice (maybe it is at large companies; I've never worked at such an organization), and that it was perfectly normal.
But it certainly seems like a good system when put up against at least the VSP, and probably the V-MAX too.
I've always been interested in the SVC by itself; it certainly seems like a cool concept. I've never used one, of course, but having the ability to cluster at that intermediate level (in this case an 8-node cluster, which may be the max, I'm not sure) and then scale out storage behind it is appealing. Clearly they've shown with this that you can pump one hell of a lot of I/O through the thing. They also seem to have SSD tiering support built in, which is nice as well.
Hopefully HP can come up with something similar at some point, as much as they talk smack about the likes of SVC today.
[UPDATED - as usual I re-read my posts probably 30 times after I post them and refine them a bit if needed, this one got quite a few changes. I don't run a newspaper here so I don't aim to have a completely thought out article when I hit post for the first time]
SPC-2 is a sequential throughput test, geared more towards things like streaming media and data warehousing than the random I/O that represents a more typical workload.
The numbers are certainly very impressive though, coming in at 7.3 gigabytes/second, besting most other systems out there, at 42 megabytes/second per disk. IBM's earlier high end storage array was only able to eke out 12 megabytes/second per disk (with four times the number of disks) using disks that were twice as fast. So: at least 8 times the I/O capacity for only about 25% more throughput versus XIV. That's a stark contrast!
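A quick sanity check on that comparison; I'm assuming 7,200 RPM nearline disks in the XIV and 15k RPM disks in the older array, which is what the "twice as fast" remark implies:

```python
# Sanity check of the spindle comparison above.
xiv_disks = 180
older_disks = 180 * 4            # "4 times the number of disks"

# Random I/O capacity scales roughly with spindle count x rotational speed
capacity_ratio = (older_disks * 15000) / (xiv_disks * 7200)
print(f"{capacity_ratio:.1f}x the random I/O capacity")   # ~8.3x

# Per-disk sequential rate; the report's 42 MB/s figure presumably
# counts only the data disks, not spares
print(f"{7300 / xiv_disks:.1f} MB/s per disk")            # ~40.6
```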
SATA/nearline/7200RPM SAS disks are typically viewed as good at sequential operations, though I would expect 15k RPM disks to do at least as well, since the faster rotation should result in more data traveling under the head at a faster rate. Perhaps this is a sign of a good architecture in XIV, with its distributed mirrored RAID.
While the results are quite good, again they don't represent the most common type of workload out there, which is random I/O.
The $1.1M discounted price of the system seems quite high for something that has only 180 disks in it (discounts on the system seem, for the most part, to be around 70%), though there is more than 300 gigabytes of cache. I bought a 2-node 3PAR T400 with 200 SATA disks shortly after the T was released in 2008 for significantly less, though of course it had only 24GB of data cache!
I hope the $300 modem IBM is using (after the 70% discount) is a USR Courier! (Your Price: $264.99, which still leaves a good profit for IBM.) Such fond memories of the Courier.
I can only assume that the reason IBM has refrained from posting SPC-1 results at this point is that with a SATA-only system the results would not be impressive. In a fantasy world, with nearline disks and a massive 300GB cache, maybe they could achieve 200-250 IOPS/disk, which would put the $1.1M, 180-disk system at 36,000-45,000 SPC-1 IOPS, or $24-30/IOP.
A more realistic number is probably 25,000 or less ($44/IOP), making it one of the most expensive systems out there for I/O (even if it could score 45,000 SPC-1 IOPS). By contrast, 3PAR's I/O calculator says a 4-node F400 would do 14,000 IOPS (not SPC-1 IOPS, mind you; the SPC-1 number would probably be lower) with 180 SATA disks and RAID 10, at an 80% read/20% write workload, for about 50% less cost (after discounts).
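The $/IOP estimates above, spelled out; the per-disk IOPS figures are my guesses for a SATA-backed system, not benchmark results:

```python
# $/IOP under the optimistic and realistic estimates above.
price = 1_100_000                 # discounted system price
disks = 180

for per_disk_iops in (200, 250):  # the optimistic "fantasy world" range
    total = disks * per_disk_iops
    print(total, "IOPS ->", round(price / total), "$/IOP")

realistic = 25_000
print(realistic, "IOPS ->", round(price / realistic), "$/IOP")   # 44 $/IOP
```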
One of the weak spots of 3PAR is the addressable capacity per controller pair. For I/O and disk connectivity purposes, a 2-node F200 (much cheaper) could easily handle 180 2TB SATA disks, but from a software perspective that is not the case. I have been complaining about this for more than three years now. They've finally addressed it to some extent in the V-class, but I am still disappointed in how far it goes, given the supported limits that exist today (1.6PB; it should be more than double that). At least with the V they have enough memory on the box to scale it up with software upgrades (time will tell whether such upgrades come about, however).
If it were me, I would not even use an F400 for this, opting instead for a T800 (800TB) or a V-class system (800-1600TB), because 360TB raw is very close to the addressable-capacity limit of the F400 (384TB) or the T400 (400TB). You could of course start with a 4-node T800 (or a 2-node V400 or V800), then add controllers to get beyond 400TB of capacity if and when the need arises. With the 4-controller design you also get the wonderful persistent cache feature built in (one of the rare software features that is not separately licensed).
But for this case, comparing a nearly maxed out F400 against a maxed out XIV is still fair - it is one of the main reasons I did not consider XIV during my last couple storage purchases.
So these results do show a strong use case for XIV: throughput-oriented workloads! The XIV would absolutely destroy the F400, which tops out at 2.6GB/sec (to disk), in throughput.
With software such as Vertica out there, which slashes the need for disk I/O on data warehouses given its advanced design, and systems such as Isilon being so geared towards things like scale-out media serving (NFS seems like a more ideal protocol for media serving anyway), I can't help but wonder what XIV's place in the market is, at this price point at least. It does seem like a very nice platform from a software perspective, and with their recent switch from 1 Gigabit Ethernet to InfiniBand a good part of their hardware has improved as well; it also has SSD read cache coming.
I will say, though, that this XIV system will handily beat even a high end 3PAR T800 for throughput. While 3PAR has never released SPC-2 numbers, the T800 tops out at 6.4 gigabytes/second (from disk), and it's quite likely its SPC-2 results would be lower than that.
With the 3PAR architecture being as optimized as it is for random I/O, I do believe it would suffer versus other platforms on sequential I/O. Not that the 3PAR would run slow, but it would quite likely run slower, due to how data is distributed on the system. That is just speculation, though, a result of not having real numbers to base it on. My own production random I/O workloads in the past have had 15k RPM disks running in the range of 3-4MB/second (numbers extrapolated, as I have had only SATA and 10k RPM disks in my 3PAR arrays to date, though my new one that is coming is 15k RPM). As such, with a random I/O workload you can scale up pretty high before you run into any throughput limits on the system; in fact, if you max out a T800 with 1,280 drives, you could do as high as 5MB/second/disk before you would hit the limit. Though XIV is distributed RAID too, so who knows.
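That per-disk headroom figure is just the array's sequential ceiling divided across a fully populated system:

```python
# The T800's sequential ceiling spread across a maxed-out drive count.
t800_mb_s = 6400        # tops out at 6.4 GB/s from disk
max_drives = 1280
print(t800_mb_s / max_drives, "MB/s per disk")   # 5.0
```

So a random workload averaging 3-4MB/second per disk sits comfortably under that limit even at full scale.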
Likewise, I suspect 3PAR/HP have not released SPC-2 numbers because they would not reflect their system in the most positive light, unlike SPC-1.
Sorry for the tangents on 3PAR.
So, according to this article from our friends at The Register, Compellent is considering going to absurdly efficient storage tiering, taking the size of the data being migrated down to 32kB from their current, already insanely efficient, 512kB.
That's just insane!
For reference, as far as I know:
- 3PAR moves data around in 128MB chunks
- IBM moves data around in 1GB chunks (someone mentioned that XIV uses 1MB)
- EMC moves data around in 1GB chunks
- Hitachi moves data around in 42MB chunks (I believe this is the same data size they use for allocating storage to thin provisioned volumes)
- NetApp has no automagic storage tiering functionality, though they do have PAM cards, which they claim are better.
I have to admit I like Compellent's approach the best here; hopefully 3PAR can follow. I know 3PAR allocates data to thin provisioned volumes in 16kB chunks; what I don't know is whether their system can be adjusted down to a more granular level of storage tiering.
There's just no excuse for the inefficient IBM and EMC systems though, really, none.
Time will tell if Compellent actually follows through with going as granular as 32kB; I can't help but suspect the CPU overhead of monitoring so many extents will be too much for the system to bear.
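To get a rough feel for that overhead, here's the number of extents a tiering engine would have to track at each vendor's migration size, for an arbitrary 100TB of capacity (my example figure, not anything from the article):

```python
# Extent counts per migration granularity for a hypothetical 100TB array.
TB = 1024**4
capacity = 100 * TB

for name, extent in [("Compellent 32kB", 32 * 1024),
                     ("Compellent 512kB", 512 * 1024),
                     ("3PAR 128MB", 128 * 1024**2),
                     ("IBM/EMC 1GB", 1024**3)]:
    print(f"{name}: {capacity // extent:,} extents")
```

Over three billion extents to track at 32kB granularity, versus about a hundred thousand at 1GB; that's where the CPU (and memory) overhead concern comes from.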
Maybe if they had a purpose-built ASIC...
If you recall, not long ago IBM released some SPC-1 numbers with their automagic storage tiering technology, Easy Tier. It was noted that they move data between the tiers in 1GB blocks. To me that seemed like a lot.
It still seems like a lot. I was pretty happy when 3PAR said they use 128MB blocks, which is half the size of their chunklets. When I first heard of this sub-LUN tiering, I thought you might want a block size as small as, I don't know, 8-16MB. At the time 128MB still seemed kind of big (and that was before I had learned of IBM's 1GB size).
Just think of how much time it takes to read 1GB of data off a SATA disk (since the big target for automagic storage tiering seems to be SATA + SSD).
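A rough sketch of what that costs; the ~100 MB/s streaming rate is an assumed typical figure for a nearline SATA drive, and real migrations compete with foreground I/O, so the effective rate is far lower:

```python
# Time to migrate one 1GB extent off a nearline SATA disk.
extent_mb = 1024
best_case_s = extent_mb / 100      # pure sequential read: ~10 seconds
contended_s = extent_mb / 15       # ~15 MB/s under a random workload
print(round(best_case_s), "s best case,", round(contended_s), "s under load")
```

A minute or more per extent under load adds up fast when the engine wants to shuffle many extents between tiers.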
Anyone know what size Compellent uses for automagic storage tiering?
IBM recently announced that they are adding an "easy tier" of storage to some of their storage systems. This seems to be their form of what I have been calling automagic storage tiering. They are doing it at the sub-LUN level in 1GB increments, and they recently posted SPC-1 numbers for this new system; finally, someone posted numbers.
Configuration of the system included:
- 1 IBM DS8700
- 96 1TB SATA drives
- 16 146GB SSDs
- Total ~100TB raw space
- 256GB Cache
Performance of the system:
- 32,998 IOPS
- 34.1 TB Usable space
Cost of the system:
- $1.58 Million for the system
- $47.92 per SPC-1 IOP
- $46,545 per usable TB
Now I'm sure the system is fairly power efficient given that it only has 96 spindles on it, but I don't think that justifies the price tag. Just take a look at this 3PAR F400 which posted results almost a year ago:
- 384 disks, 4 controllers, 24GB data cache
- 93,050 SPC-1 IOPS
- 26.4 TB Usable space (~56TB raw)
- $548k for the system (I'm sure prices have come down since)
- $5.89 per SPC-1 IOP
- $20,757 per usable TB
That system used 146GB disks; today the 450GB disks seem priced very reasonably, and I would opt for those instead to get the extra space for not much of a premium.
Take a 3PAR F400 with 130 450GB 15k RPM disks; that would be about 26TB of usable space with RAID 1+0 (the tested configuration above is 1+0). That would give about 33.8% of the performance of the 384-disk system above, so say 31,487 SPC-1 IOPS, very close to the IBM system, and I bet the price of the 3PAR would be close to half the $548k above (taking into account that the controllers in any system are a good chunk of the cost). 3PAR has near linear scalability, making extrapolations like this possible and reasonably accurate. And you can sleep well at night knowing you can triple your space and performance online without service disruption.
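The extrapolation is simple to spell out; it assumes SPC-1 IOPS scale linearly with spindle count on the same 4-node controllers, per the near-linear-scalability claim above:

```python
# Linear-scaling estimate for a 130-disk F400 from the 384-disk result.
full_disks, full_iops = 384, 93050
smaller_disks = 130

fraction = smaller_disks / full_disks      # ~33.9% of the spindles
estimate = full_iops * fraction
print(round(estimate))                     # ~31,500, in line with the text
```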
Note that you can of course equip a 3PAR system with SSDs and use automagic storage tiering as well (they call it Adaptive Optimization), if you really want to. The 3PAR system moves data around in 128MB increments, by contrast.
It seems the cost of the SSDs and the massive amount of cache IBM dedicated to the system more than offset the benefits of using lower cost nearline SATA disks. If that's the outcome, what's the point?
So consider me not impressed with the first results of automagic storage tiering. I expected significantly more out of it. Maybe it's IBM specific, maybe not, time will tell.
IBM recently announced their BladeCenter HX5 blade, which with its memory extenders allows you to:
- Expand the amount of memory available to the system
- Be able to "connect" two dual socket blades to form a single quad socket system
Pretty creative, though the end result wasn't quite as impressive as it sounded up front. Their standard blade chassis is 9U and has 14 slots on it.
- Each blade is dual socket, maximum 16 cores, and 16 DIMMs
- Each memory extender offers 24 additional DIMMs
So for the chassis as a whole, you're talking about 7 dual socket systems with 40 DIMMs each, or 3 quad socket systems with 80 DIMMs each plus 1 dual socket with 40.
Compare that to an Opteron 6100 system, where you can get 8 quad socket systems with 48 DIMMs each in a single enclosure (granted, such a system has not been announced yet, but I am confident it will be).
- Intel 7500-based system: 112 CPU cores (1.8GHz), 280 DIMM slots - 9U
- Opteron 6100-based system: 384 CPU cores (2.2GHz), 384 DIMM slots - 10U
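The slot and DIMM math behind the Intel side of that comparison, assuming the blade/extender pairings described above:

```python
# Slot and DIMM arithmetic for the 9U, 14-slot BladeCenter chassis.
slots = 14

# Dual-socket node: one HX5 blade (16 DIMMs) + one memory extender (24 DIMMs)
dual_slots, dual_dimms = 2, 16 + 24
dual_nodes = slots // dual_slots
print(dual_nodes, "dual-socket nodes,", dual_nodes * dual_dimms, "DIMMs")

# Quad-socket node: two blades + two extenders = 4 slots, 80 DIMMs
quad_nodes = slots // 4
leftover_duals = (slots % 4) // dual_slots
print(quad_nodes, "quad-socket +", leftover_duals, "dual-socket")
```

Seven dual-socket nodes at 40 DIMMs each gives the 280 DIMM slots listed above.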
And the price of the IBM system is even less impressive -
In a base configuration with a single four-core 1.86 GHz E7520 processor and 8 GB of memory, the BladeCenter HX5 blade costs $4,629. With two of the six-core 2 GHz E7540 processors and 64 GB of memory, the HX5 costs $15,095.
They don't seem to show pricing for the 8 core 7500-based blade, and say there is no pricing or ETA on the arrival of the memory extenders.
They do say this which is interesting (not surprising) -
The HX5 blade cannot support the top-end eight-core Xeon 7500 parts, which have a 130 watt thermal design point, but it has been certified to support the eight-core L7555, which runs at 1.86 GHz, has 24 MB of L3 cache, and is rated at 95 watts.
I only hope AMD has enough manufacturing capacity to keep up with demand, Opteron 6100s will wipe the floor with the Intel chips on price/performance (for the first time in a while).
One question: Why?
IBM has recently announced a partnership with Red Hat to use KVM in a cloud offering. At first I thought, well, maybe they are doing it to offer Microsoft applications as well, but that doesn't appear to be the case:
Programmers who use the IBM Cloud for test and dev will be given RHEV to play with Red Hat Enterprise Linux or Novell SUSE Linux Enterprise Server images with a Java layer as they code their apps and run them through regression and other tests.
Seems like a slap in the face to their mainframe division (I never bought into the mainframe/Linux/VM marketing myself; I suppose they don't either). I do remember briefly having access to an S390 running a SuSE VM about 10 years ago; it was... interesting.