October 15, 2010
Notes on data warehouse appliance prices
I’m not terribly motivated to do a detailed analysis of data warehouse appliance list prices, in part because:
- Everybody knows that in practice data warehouse appliances tend to be deeply discounted from list price.
- The only realistic metric to use for pricing data warehouse appliances is price-per-terabyte, and people have gotten pretty sick of that one.
That said, here are some notes on data warehouse appliance prices.
- Reasons people criticize per-terabyte data warehouse appliance price metrics include:
- Price-per-terabyte metrics ignore issues of throughput, latency, workload, and so on.
- Price-per-terabyte metrics ignore quality of storage medium (slow disks, fast disks, Flash, etc.)
- Price-per-terabyte metrics can be radically affected by changes in disk size.
- Nonetheless, it is common to discuss data warehouse appliance price/terabyte. When one does, it is common to refer to user data rather than some measure of raw disk capacity.
- Advantages of this approach include:
- User data is what matters.
- User data is what users doing product evaluations or setting budgets can best estimate in advance.
- User data is a reasonable and popular basis for software-only analytic DBMS pricing.
- Disadvantages of this approach include:
- It depends on assumptions about compression (and in some cases indexing and so on), which are highly dependent upon the specifics of the data set.
- Some vendors and users indeed think in terms of raw disk capacity.
- Mitigating the compression concern:
- Oracle perhaps excepted, data warehouse appliance vendors tend to be laudably conservative in the compression assumptions they build into their per-terabyte price metrics.
- I wrote last year that Netezza provides the traditional industry benchmark for per-terabyte pricing. When I wrote that, the “Netezza price point” had just become a little under $20,000/TB.
- That was based on 2.25X compression. Since then, Netezza has upgraded its compression. Netezza now quotes 4X compression. Accordingly, Netezza’s list price is now around $11,000/TB. (A little below, actually, per Phil Francisco.)
- As Doug Henschen reports, the EMC Greenplum Data Computing Appliance starts at $1 million for 18 terabytes of uncompressed user data. EMC/Greenplum also cites a 4X compression figure. That works out to the vicinity of $14,000/TB ($1 million for 72 TB of compressed user data).
- And by the way, if you mirror your data on a SAN, you can stuff twice as much into the EMC Greenplum Data Computing Appliance as otherwise — but then you also have to pay for 36 TB of SAN capacity per half-rack appliance.
- Eric Guyer reminded us that Oracle Exadata has high list prices. He also reminded us that Oracle Exadata is apt to be deeply discounted.
- A couple of versions ago, I outlined the complexities of Exadata pricing.
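For the record, the per-terabyte figures above can be sanity-checked with a little arithmetic. Here's a sketch; the list prices and compression ratios are the ones quoted in the post, and the helper function is purely illustrative:

```python
def price_per_user_tb(list_price, capacity_tb, compression):
    """Price per terabyte of user data, given a pre-compression
    capacity figure and an assumed compression ratio."""
    return list_price / (capacity_tb * compression)

# Netezza: ~$20,000/TB of user data at 2.25X compression.
# Re-quoting the same hardware at 4X compression:
netezza_new = 20_000 * 2.25 / 4
print(f"Netezza at 4X: ~${netezza_new:,.0f}/TB")
# Roughly consistent with the ~$11,000/TB figure above.

# EMC Greenplum DCA: $1 million for 18 TB of uncompressed
# user data, with a 4X compression assumption.
dca = price_per_user_tb(1_000_000, 18, 4)
print(f"Greenplum DCA: ~${dca:,.0f}/TB")
# In the vicinity of $14,000/TB, as stated above.
```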
Categories: Data warehouse appliances, Data warehousing, Database compression, EMC, Exadata, Greenplum, Netezza, Oracle, Pricing
Comments
8 Responses to “Notes on data warehouse appliance prices”
Curt,
Are these prices inclusive of services, or just the hardware + software prices? We don't get to see much on pricing with respect to appliances, including Teradata.
Regards
Rama.
Rama,
No professional services, to my knowledge.
CAM
The challenge with any pricing is looking at the overall costs, not simply the purchase costs. (As you know, I am COO of a company that has created software that doesn't miraculously grow bigger just to keep performing over time — so the usual vendor caveats apply: I may be biased, but that doesn't mean I am wrong. Safe harbour statement made.) The typical cycle with other vendors is: customers have a problem, the vendor fixes the problem, and then, once you are hooked on the architecture, the problem gets larger and more complex over time and the database starts to grow out of control. It doesn't matter whether you call it materialized views, materialized join indexes, or automated tuning; you have to love marketing-speak for "We're going to copy more data to disk." The database keeps getting bigger just to keep performing. That requires more people, more processing capacity, and more and more licences. If there were a way to measure the gap between what you thought you were paying for and what you ended up paying for, that would be interesting — like having lunch at McDonald's and measuring the cost of your arteries blocking up. It is a fair question: if you price by the terabyte, make your product perform by allowing it to grow bigger, and then charge more for the extra data, isn't that like going to McDonald's and getting hungrier with each bite?
I recently tried to find the demarcation line where low-cost ends and premium-priced begins, ultimately coming to the conclusion that a TCO (hardware + software + running costs) of roughly $1/GB/month is a reference point for a low-cost basic system. That figure is for uncompressed user data with active, interactive (think Google) querying.
I'm also curious whether the BigQuery launch will establish a good reference point for the industry, so that we could then reason "that much premium for that much added value."
See more here http://bigdatacraft.com/archives/135
Mike,
I have trouble thinking of a company that charges by terabyte for materialized views or indexes. Who are you thinking of?
Thanks,
CAM
Dear All,
I am interested in implementing a data warehouse for my organization. We have 5 different locations and we need to be able to have all the data available in one location. Could you recommend the correct solution for this and the price?
Hope to hear from you soon
Christopher