IBM InfoSphere Warehouse pricing, packaging, compression and more
IBM InfoSphere Warehouse 9.7.3 has been announced, and is planned for general availability late this month. IBM InfoSphere Warehouse is, in essence, DB2-plus, where the “plus” comprises:
- DPF (Data Partitioning Feature) — i.e., the ability to do shared-nothing scale-out.
- Unimportant add-ons — e.g., a mere 5 seats of the Cognos BI tool.
The main news in this release of InfoSphere Warehouse is probably pricing. While IBM has long had a funky server-power-based pricing scheme, it is now adding per-terabyte pricing, with a twist: IBM InfoSphere Warehouse now can be bought per terabyte of compressed user data. Specifically:
- IBM InfoSphere Warehouse 9.7.3 Enterprise Edition can be bought for production for $70K or so per terabyte of compressed user data.
- IBM InfoSphere Warehouse 9.7.3 Departmental Edition can be bought for production for $35K or so per terabyte of compressed user data.
- Development/test seats of IBM InfoSphere Warehouse cost about $2K per user.
- High availability/disaster recovery instances are priced as if they were managing 1 TB each — unless, of course, you have an active-active configuration, in which case they’re priced according to their full amount of data.
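To make the pricing rules above concrete, here is a rough Python sketch of the arithmetic. It uses the approximate list prices quoted in this post and ignores discounting. The function and parameter names are hypothetical, and it assumes (the briefing did not spell this out) that HA/DR and active-active instances are charged at the same per-terabyte rate as the production edition.

```python
# Approximate list prices from this post, in US dollars per TB of compressed user data.
PRICE_PER_TB = {"enterprise": 70_000, "departmental": 35_000}
DEV_TEST_SEAT = 2_000  # approximate list price per development/test user


def estimate_list_price(edition, compressed_tb, standby_instances=0,
                        active_active_instances=0, dev_test_users=0):
    """Rough list-price estimate; every name here is illustrative, not IBM's."""
    per_tb = PRICE_PER_TB[edition]
    production = per_tb * compressed_tb
    # HA/DR standbys are priced as if each one managed 1 TB.
    # Assumption: they use the same per-TB rate as the production edition.
    standby = per_tb * 1 * standby_instances
    # Active-active copies are priced on their full amount of data.
    active_active = per_tb * compressed_tb * active_active_instances
    dev_test = DEV_TEST_SEAT * dev_test_users
    return production + standby + active_active + dev_test


if __name__ == "__main__":
    # 10 TB compressed, Enterprise Edition, one HA/DR standby, five dev/test users.
    print(estimate_list_price("enterprise", 10, standby_instances=1, dev_test_users=5))
```

Under those assumptions, swapping the standby for an active-active copy roughly doubles the production portion of the bill, which is the distinction the last bullet draws.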
Per-terabyte pricing is generally a good way to think about analytic DBMS costs, for at least two reasons:
- Most buyers find it easier to estimate how much data they’ll have than, say, how much use they’ll make of it.
- Per-terabyte software pricing lets users pick the amount of hardware they need for performance, without getting clobbered by extra software license fees.
Vendors often complain that per-terabyte pricing obscures differences in performance or quality, but I am confident they can (continue to) meet the marketing challenges that result.
The choice between pre- and post-compression per-terabyte pricing is interesting. Pre-compression pricing is more in line with what users can measure. Post-compression pricing is more in line with the cost of any appliances they might buy.
And by the way — in a market with rampant discounting, list prices can be somewhat theoretical anyhow.
This seems like an appropriate place to add some notes on DB2/InfoSphere Warehouse compression.
- I described the basic DB2/InfoSphere Warehouse compression scheme in a post last year.
- Historically, DB2 compression has been a chargeable option.
- IBM’s compression algorithms generally change at the time of major releases, when it’s appropriate to also change pricing. (If IBM doesn’t change pricing whenever it changes compression algorithms, this whole pricing scheme could get weird fast.)
- IBM is proud of compressing not just data, but also indexes and temp space. IBM fondly believes that this makes its overall compression more effective than Oracle Exadata hybrid columnar compression. The two ways IBM compresses indexes are:
- Compressing RIDs (Row IDs). These are always 6 bytes, i.e., 48 bits. Since no known IBM customer has a table with >2^48 rows, that’s compressible.
- Automatic prefix compression.
- For data only, Tim Vincent of IBM believes 3X is a good conservative number for estimating compression ratios, at least in a warehouse use case. However, the IBMers who briefed me on InfoSphere Warehouse 9.7.3 used a lesser figure of 55% compression, which is a lot like 2.2X. Everybody seems to agree that compression figures on individual data sets can range from the low 40s to the mid-high 80s.
- Using even Tim Vincent’s figure, $70K/TB of compressed data sounds roughly like $25K/TB of uncompressed data …
- … which is still well more than IBM Netezza charges for hardware and software together.*
*The $20K/TB Netezza price point has been much reduced by Netezza’s improvements in compression ratios.
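The conversions in the notes above are easy to check. The sketch below assumes that "55% compression" means 55% of the space is saved, which is the reading consistent with "a lot like 2.2X". The function names are hypothetical.

```python
def ratio_from_percent_saved(percent_saved):
    """Convert 'N% compression' (space saved) into an X-fold compression ratio."""
    return 1.0 / (1.0 - percent_saved / 100.0)


def price_per_uncompressed_tb(price_per_compressed_tb, ratio):
    """Effective price per TB of raw user data at a given compression ratio."""
    return price_per_compressed_tb / ratio


if __name__ == "__main__":
    # 55% saved works out to roughly 2.2X.
    print(round(ratio_from_percent_saved(55), 2))
    # The low 40s to the mid-high 80s spans roughly 1.7X to 6.7X.
    print(round(ratio_from_percent_saved(40), 2), round(ratio_from_percent_saved(85), 2))
    # $70K per compressed TB at 3X versus at 2.2X.
    print(round(price_per_uncompressed_tb(70_000, 3.0)))
    print(round(price_per_uncompressed_tb(70_000, ratio_from_percent_saved(55))))
```

At 3X the Enterprise Edition list price comes to a bit over $23K per uncompressed terabyte, in line with the "roughly $25K" figure. At the more cautious 2.2X it is closer to $31K.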
Returning to the actual IBM InfoSphere Warehouse 9.7.3 release — InfoSphere Warehouse can now be virtualized on VMware. Well, actually, it could be before, but while DB2 certainly has been, my briefers couldn’t think of any examples of InfoSphere Warehouse running on VMware. So perhaps there is something newer/easier about this version.
One last note: active-active replication was mentioned above. It turns out that the current way to do active-active replication in DB2/InfoSphere Warehouse is via a kind of queue-based implementation. IBM is working on making that easier to implement (i.e., on a par with high availability/disaster recovery) in a future release.
Comments
One Response to “IBM InfoSphere Warehouse pricing, packaging, compression and more”
Replication Server – Q Replication comes bundled with InfoSphere Warehouse and uses WebSphere MQ to carry log data changes from a source and apply them to a target. It already supports high availability/disaster recovery with low latency.
There is also InfoSphere CDC (formerly DataMirror), the replication option that supports other database platforms. It has an option for InfoSphere Warehouse where you license it for the Warehouse DB and use it for unlimited data sources. That gives you active/active replication of data into the Warehouse from Oracle, SQL Server, DB2, etc. I think guaranteed delivery of data into the warehouse is even more important than HA.