Data warehouse appliances
Analysis of data warehouse appliances – i.e., of hardware/software bundles optimized for fast query and analysis of large volumes of (usually) relational data. Related subjects include:
- Data warehousing
- Parallelization
- Netezza
- DATAllegro
- Teradata
- Kickfire
- (in The Monash Report) Computing appliances in multiple domains
HP Neoview in the market to date
I evidently got HP’s attention with a recent post in which I questioned its stance on the relative positioning of the Exadata-based HP Oracle data warehouse appliance and the HP Neoview data warehouse appliance. A conversation with Greg Battas and John Miller (respectively CTO and CMO of HP’s BI group) quickly ensued. Mainly we talked about Neoview product goals and architecture. But before I get to that in a separate post, here are some Neoview market-presence highlights, so far as I’ve been able to figure them out: Read more
Categories: Data warehouse appliances, Data warehousing, HP and Neoview | 1 Comment |
Automatic redistribution of data warehouse data
In a recent Oracle Exadata FAQ, Kevin Closson writes:
Q. […] don’t some of the DW vendors split the data up in a shared nothing method. Thus when the data has to be repartitioned it gets expensive. Whereas here you just add another cell and ASM goes to work in the background. (depending upon the ASM power level you set.)
A. All the DW Appliance vendors implement shared-nothing so, yes, the data is chopped up into physical partitions. If you add hardware to increase performance of queries against your current dataset the data will have to be reloaded into the new partitioning scheme. As has always been the case with ASM, adding new disks – and therefore Exadata Storage Server cells – will cause the existing data to be redistributed automatically over all (including the new) drives. This ASM data redistribution is an online function.
Hmm. That sounds much like the story I’ve heard from various other data warehousing DBMS vendors as well.
Rather than try to speak for them, however, I’ll just post this and see whether they choose to add anything to the comment thread.
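To make the repartitioning cost in that Q&A concrete, here is a minimal sketch of why adding a node to a hash-partitioned system moves data at all. The placement rule (`key % num_nodes`) and the function names are assumptions chosen for illustration; they do not describe ASM or any particular vendor's scheme, and many systems use placement methods that move far less data.

```python
def node_for(key: int, num_nodes: int) -> int:
    """Pick a home node for a row from its distribution key.

    Hypothetical rule: plain modulo on an integer key. Real systems
    hash the key first and may use smarter placement.
    """
    return key % num_nodes


def fraction_moved(keys, old_nodes: int, new_nodes: int) -> float:
    """Share of rows whose home node changes when the node count changes."""
    keys = list(keys)
    moved = sum(
        1 for k in keys if node_for(k, old_nodes) != node_for(k, new_nodes)
    )
    return moved / len(keys)


# Grow a 4-node system to 5 nodes and count the rows that must relocate.
moved_4_to_5 = fraction_moved(range(10_000), 4, 5)
print(f"{moved_4_to_5:.0%} of rows change nodes going from 4 to 5 nodes")
```

Under this naive rule most rows relocate, which is the expense the questioner has in mind. Whether that movement happens as an offline reload or as an online background redistribution is a separate implementation question, and that is the point on which the vendors' stories may differ.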
Categories: Data warehouse appliances, Data warehousing, Exadata, Oracle | 7 Comments |
Greenplum pricing
Edit: Actually, this post is completely incorrect. The $20K/terabyte is for software only. So far, my attempts to get Greenplum to estimate hardware costs have been unsuccessful.
Greenplum’s Scott Yara was recently quoted citing a $20K/terabyte figure for Greenplum pricing. That naturally raises the question:
Greenplum charges around $20K/terabyte of what?
Categories: Data warehouse appliances, Data warehousing, Greenplum, Pricing | 4 Comments |
Oracle Database Machine and Exadata pricing: Part 2
My Oracle Database Machine and Exadata pricing spreadsheet has been updated. Specifically:
- The first page has been modestly altered to accommodate more chargeable software options, as per the discussion below.
- Accordingly, my new estimate for HP Oracle Database Machine list price is $5,546,000. Per-terabyte prices (user data) are $60K and $198K for the two configurations.
- There’s a whole new second page, for Exadata configurations smaller than a full Oracle Database Machine. Most of the work on that was done by Bence Arató of BI Consulting (Hungary), who graciously gave me permission to post it.
- The lowest per-terabyte Exadata price estimates are about 20% lower than for the full Oracle Database Machine. The difference is due mainly to eliminating Real Application Clusters for a single-node SMP machine, and secondarily to rounding down slightly on server hardware capacity. But these are rough estimates, as neither Bence nor I is a hardware pricing guy.
Categories: Data warehouse appliances, Data warehousing, Exadata, Oracle, Pricing | 11 Comments |
Eric Lai on Oracle Exadata, and some addenda
Eric Lai offers a detailed FAQ on Oracle Exadata, including a good selection of links and quotes. I’d like to offer a few comments in response: Read more
Categories: Data warehouse appliances, Data warehousing, Exadata, Greenplum, Netezza, Oracle, Pricing | 4 Comments |
Exadata and Oracle Database Machine parallelization clarified
Some kind Oracle development managers have reached out and helped me better understand where Oracle does or doesn’t stand in query and analytic parallelization. This post supersedes prior discussions of the subject over the past week. Read more
Categories: Clustering, Data warehouse appliances, Data warehousing, Exadata, Oracle, Parallelization | 10 Comments |
Oracle Database Machine performance and compression
Greg Rahn was kind enough to recite in his blog what Oracle has disclosed about the first Exadata testers. I don’t track hardware model details, so I don’t know how the testers’ respective current hardware environments compare to that of the Oracle Database Machine.
Each of the customers cited below received “half” an Oracle Database Machine. As I previously noted, an Oracle Database Machine holds either 14.0 or 46.2 terabytes of uncompressed data. This suggests the 220 TB customer listed below — LGR Telecommunications — got compression of a little under 10:1 for a CDR (Call Detail Record) database. By comparison, Vertica claims 8:1 compression on CDRs.
Greg also writes of POS (Point Of Sale) data being used for the demo. If you do the arithmetic on the throughput figures (13.5 vs. a little over 3), compression was a little under 4.5:1. I don’t know what other vendors claim for POS compression.
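The two compression estimates above can be reproduced with a few lines of arithmetic. This is only a sketch: it assumes the 220 TB database exactly fills half of the larger (46.2 TB) configuration, and it takes the "a little over 3" throughput figure as exactly 3, so both results are upper bounds. The variable names are mine.

```python
# Figures taken from the post; the variable names are illustrative.
FULL_MACHINE_TB = 46.2   # larger uncompressed capacity of a full machine
CUSTOMER_DB_TB = 220.0   # LGR Telecommunications CDR database

# Each tester received "half" a machine.
half_machine_tb = FULL_MACHINE_TB / 2

# Assumption: the whole 220 TB fits in that half machine once compressed.
cdr_ratio = CUSTOMER_DB_TB / half_machine_tb

# POS demo: ratio of the two throughput figures. Using exactly 3 for
# "a little over 3" makes this an upper bound on the compression ratio.
pos_ratio_upper = 13.5 / 3.0

print(f"CDR compression: about {cdr_ratio:.2f}:1")
print(f"POS compression: under {pos_ratio_upper:.1f}:1")
```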
Here are the details Greg posted about the four most open Oracle Database Machine tests: Read more
Categories: Data warehouse appliances, Data warehousing, Database compression, Exadata, Oracle, Telecommunications | 9 Comments |
Oracle Exadata list pricing
The figures in this post have now been updated. There’s a new spreadsheet at that link as well.
I’ve been trying to figure out how much Oracle Exadata actually costs. My first cut comes up with prices of $58-190K/TB (user data), based on a total system price of $5,322,000, and user data figures of 28 and 92.4 TB for the two available sizes of disk drive. But of course there are a lot of uncertainties in these figures. You can use this spreadsheet (Edit: That’s the old one) to see where the final numbers come from, and to modify the estimates as you see fit. Read more
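For anyone who wants to check the per-terabyte range without opening the spreadsheet, the division is simple. The sketch below uses only the total price and user data figures quoted above; the function name is mine, and it deliberately ignores every line item that goes into the total.

```python
def price_per_tb(system_price_usd: float, user_data_tb: float) -> float:
    """List price divided by user data capacity, in dollars per terabyte."""
    return system_price_usd / user_data_tb


SYSTEM_PRICE_USD = 5_322_000   # first-cut total system price from the post
USER_DATA_TB = (92.4, 28.0)    # user data for the two disk drive sizes

for tb in USER_DATA_TB:
    per_tb_k = price_per_tb(SYSTEM_PRICE_USD, tb) / 1000
    print(f"{tb:>5} TB user data -> ${per_tb_k:.0f}K/TB")
```

Changing `SYSTEM_PRICE_USD` is the quickest way to see how a revised total moves both ends of the range.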
Categories: Data warehouse appliances, Data warehousing, Exadata, Oracle, Pricing | 10 Comments |
Oracle Exadata Smart Scan Join Processing
Oracle has put up an Exadata white paper (hat tip to Kevin Closson’s Exadata FAQ). There’s a section on Smart Scan Join Processing. Sounds exciting, huh? It reads, in its entirety:
Exadata performs joins between large tables and small lookup tables, a very common scenario for data warehouses with star schemas. This is implemented using Bloom Filters, which are a very efficient probabilistic method to determine whether a row is a member of the desired result set.
Jeez. That almost sounds as if Exadata is an immature, Release 1 data warehouse appliance!
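For readers who haven't met the technique the white paper names, here is a minimal sketch of a Bloom filter used as a pre-filter for a join between a large table and a small lookup table. It is emphatically not Oracle's implementation: the class, the hashing choice (SHA-256 with a seed prefix), the sizes, and the sample tables are all assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Probabilistic set: no false negatives, occasional false positives."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, key):
        # Derive num_hashes bit positions by hashing the key with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key) -> None:
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key) -> bool:
        return all(self.bits[pos] for pos in self._positions(key))


# Hypothetical star-schema data: a small lookup table and a large fact table.
dim_store = {101: "Boston", 205: "Austin"}
fact_sales = [(101, 19.99), (333, 5.00), (205, 42.50), (404, 7.25), (101, 3.10)]

# Build the filter from the join keys of the small table.
bloom = BloomFilter()
for store_id in dim_store:
    bloom.add(store_id)

# Scan side: cheaply discard fact rows that cannot possibly join.
candidates = [row for row in fact_sales if bloom.might_contain(row[0])]

# Join side: the exact lookup removes any false positives that slipped through.
joined = [(dim_store[sid], amount) for sid, amount in candidates if sid in dim_store]
print(joined)
```

The filter can only say "definitely not in the set" or "maybe in the set", so an exact join still has to run on the surviving rows. The payoff is that far fewer rows reach that step.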
Categories: Data warehouse appliances, Data warehousing, Exadata, Oracle | 14 Comments |
So what does Oracle Exadata mean for HP Neoview?
That HP is committed to selling a lot of data warehouse hardware — and probably data warehouse appliances in particular — seems obvious, for reasons including:
- HP bought a big BI/data warehousing consulting operation in Knightsbridge.
- HP has put considerable effort into its data warehouse appliance Neoview.
- HP CEO Mark Hurd comes from data warehouse appliance vendor Teradata.
- Data warehousing is where the big bucks are.
But Oracle Exadata could produce those appliance sales. So where does HP Neoview fit in?
I was told by an investor today that HP’s investor relations department is saying Oracle Exadata is a Netezza competitor, while Neoview is more in the Teradata market. That’s laughable. Read more
Categories: Data warehouse appliances, Data warehousing, Exadata, HP and Neoview, Netezza, Teradata | 16 Comments |