Data warehouse appliances

Analysis of data warehouse appliances – i.e., of hardware/software bundles optimized for fast query and analysis of large volumes of (usually) relational data. Related subjects include:

Data warehousing
Parallelization
Netezza
DATAllegro
Teradata
Kickfire
(in The Monash Report) Computing appliances in multiple domains

October 18, 2009

Kickfire capacity and pricing

Kickfire’s marketing communication efforts are still a work in progress. Kickfire did finally relax its secrecy about FPGA-vs.-custom-silicon – not coincidentally during Netezza’s recent publicity cycle. That wise choice helped Kickfire get some favorable attention recently for its technical and market strategy, e.g. from Daniel Abadi, Merv Adrian and, kicking things off — as it were — me. Weeks after a recent Kickfire product release, there’s finally a fairly accurate data sheet up, although there’s still one self-defeatingly misleading line I’ll comment on below. Pricing is a whole other area of confusion, although it seems that current list prices have been inadvertently* leaked in Merv’s post linked above, with only one inaccuracy that I can detect.**

*I gather from the company that they forgot to tell Merv pricing was NDA.

** Merv cited a price as “starting” that I believe to be top-of-the-line. No criticism of Merv is implied in that; Kickfire has not been very clear in communicating hard numbers.

All that said, if one takes Kickfire’s marketing statements literally, Kickfire list pricing is around $20-50K per terabyte for a few small, fixed, high-performance configurations. That’s all-in, for plug-and-play appliances. What’s more, that range is based on the actual published user data capacity numbers for various Kickfire models, which I think are low for several reasons:

Kickfire doesn’t officially admit that its model with 14.4 terabytes of disk can manage more than 6 terabytes of data, even though it clearly can.
Actually, those 14.4 terabytes of disk can be increased or lowered as you choose.
The basic compression figures implied in those calculations seem conservative.
Compression figures are a lot more conservative yet, in that Kickfire assumes you’ll have a lot of actual indexes on your data. I’m not sure that’s necessary for most workloads.

Categories: Columnar database management, Data warehouse appliances, Data warehousing, Database compression, Kickfire, Pricing

3 Comments

October 5, 2009

Oracle Exadata 2 capacity pricing

Summary of Oracle Exadata 2 capacity pricing

Analyzing Oracle Exadata pricing is always harder than one would first think. But I’ve finally gotten around to doing an Oracle Exadata 2 pricing spreadsheet. The main takeaways are:

If we believe Oracle’s claims of 10X compression, Exadata 2 costs more per terabyte of user data than Netezza TwinFin — $22-26K/TB vs. TwinFin’s <$20K — but less than the Teradata 2550.
These figures are highly sensitive to assumptions about Oracle’s hybrid columnar compression.
Similarly, if Netezza or Teradata were to significantly upgrade their own compression, the price comparison would look quite different.
Options such as Data Mining or Oracle Spatial add 12% or so each to Exadata’s total system price.
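
To make the compression sensitivity concrete, here is a minimal sketch, not the spreadsheet itself. It assumes that system price and raw disk capacity stay fixed, so that price per terabyte of user data is inversely proportional to the compression ratio. The function names are hypothetical; the only figures taken from this post are the $22-26K/TB range at 10X compression and the roughly 12% uplift per option.

```python
def price_per_user_tb(price_at_reference, reference_ratio, actual_ratio):
    """Rescale a per-TB price quoted at one compression ratio to another.

    Assumption: system price and raw capacity are unchanged, so user-data
    capacity scales linearly with the compression ratio.
    """
    return price_at_reference * reference_ratio / actual_ratio


def with_options(base_price, n_options, uplift_per_option=0.12):
    """Add roughly 12% of the base price for each extra-cost option."""
    return base_price * (1 + uplift_per_option * n_options)


if __name__ == "__main__":
    # $K per TB of user data at the claimed 10X compression (from the post).
    for quoted in (22.0, 26.0):
        for ratio in (10, 5, 3):
            rescaled = price_per_user_tb(quoted, 10, ratio)
            print(f"{quoted:.0f}K/TB at 10X -> {rescaled:.1f}K/TB at {ratio}X")
```

Under that assumption, halving the achieved compression doubles the price per terabyte of user data, which is why the comparison with TwinFin and the Teradata 2550 depends so heavily on whether the 10X claim holds.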

Longer version

When Oracle introduced Exadata last year it was, well, expensive. Exadata 2 has now been announced, and it is significantly cheaper than Exadata 1 per terabyte of user data, based on:

Similar overall pricing
Twice the disk capacity
Better compression

13 Comments

September 30, 2009

Facts and rumors

Vertica is putting out a press release today touting its 100th customer, and talking of triple digit growth last year.
Multiple sources have told me that the DATAllegro system is being thrown out of Dell, so evidently Dell is telling this to one and all. If that goes through, this would presumably leave TEOCO as DATAllegro’s single happy customer. (I haven’t checked with Microsoft for its view.)
A rumor has it that Infiniband technology vendor Voltaire, Ltd. privately claims triple-digit sales of switches for Exadata 1 (I think that would be one switch per Exadata installation, not per rack). Based just on a quick glance, this is far from confirmed by Voltaire’s earnings conference call transcripts or SEC filings. However, the most recent transcript does seem to indicate Voltaire got multiple Exadata deals in the telecommunications sector, and suggests some Exadata penetration in other sectors as well.
I was told of a classified-agency user that has >1 petabyte of data on Exadata 1 and 600 terabytes or so on Netezza. My not-obviously-biased source says the agency is distinctly happier with Netezza than Exadata.
Like ParAccel, Oracle just got dinged for TPC-related misbehavior.
Rumor has it that Sun has no intention of helping ParAccel rerun its withdrawn TPC-H benchmark.
ParAccel has withdrawn the claim from its home page to be the “CERTIFIED” price-performance leader. This seems to confirm that the claim was a reference to the TPC-H. In my opinion, that was a gross misrepresentation of what the TPC-H shows.

Categories: Benchmarks and POCs, Data warehouse appliances, Data warehousing, DATAllegro, Exadata, Market share and customer counts, Microsoft and SQL*Server, Netezza, Oracle, ParAccel, Petabyte-scale data management, Specific users, Telecommunications

3 Comments

September 29, 2009

What Nielsen really uses in data warehousing DBMS

In its latest earnings call, Oracle made a reference to The Nielsen Company that was — to put it politely — rather confusing. I just plopped down in a chair next to Greg Goff, who evidently runs data warehousing at Nielsen, and had a quick chat. Here’s the real story.

The Nielsen Company has over half a petabyte of data on Netezza in the US. This installation is growing.
The Nielsen Company indeed has 45 terabytes or whatever of data on Oracle in its European (Customer) Information Factory. This is not particularly growing. Nielsen’s Oracle data warehouse has been built up over the past 9 years. It’s not new. It’s certainly not on Exadata, nor planned to move to Exadata.
These are not single-instance databases. Nielsen’s biggest single Netezza database is 20 terabytes or so of user data, and its biggest single Oracle database is 10 terabytes or so.
Much (most?) of the rest of the installations are customer data marts and the like, based in each case on the “big” central database. (That’s actually a classic data mart use case.) Greg said that Netezza’s capabilities to spin out those databases seemed pretty good.
That 10 terabyte Oracle data warehouse instance requires a lot of partitioning effort and so on in the usual way.
Nielsen has no immediate plans to replace Oracle with Netezza.
Nielsen actually has 800 terabytes or so of Netezza equipment. Some of that is kept more lightly loaded, for performance.

Categories: Analytic technologies, Data mart outsourcing, Data warehouse appliances, Data warehousing, Netezza, Oracle, Specific users

6 Comments

September 29, 2009

Thoughts on the integration of OLTP and data warehousing, especially in Exadata 2

Oracle is pushing Exadata 2 as being a great system for any of OLTP (OnLine Transaction Processing), data warehousing or, presumably, the integration of same. This claim rests on a few premises, namely: Read more

Categories: Analytic technologies, Data warehouse appliances, Data warehousing, Exadata, OLTP, Oracle, Solid-state memory, Theory and architecture

36 Comments

September 25, 2009

The hunt for Oracle Exadata production references

Over the past four weeks, I’ve given speeches in Boston, DC, Milan, London, and SF,* attended a conference in Lyon, done a fair amount of consulting, and taken a few non-client briefings as well. That’s why I haven’t had much of a chance to sit down, analyze the tea leaves, and write about Exadata 2. (Small exception: Highlights from and remarks on the Oracle Database 11g Release 2 white paper.) I hope to do that soon.

*I’ll bop over to Chicago for the last of the series early next week.

But first — can anybody identify much in the way of Exadata production references? Oracle recently talked of a few flagship data warehouse customers, but those don’t seem to be running Exadata. I talked recently with an Oracle prospect from the US, who only got one reference from Oracle — in Eastern Europe. (Well, two references, if you also count the system integrator on the same deal.)

So far as I can tell, Oracle Exadata production sites are pretty scarce on the ground. What, if anything, am I missing?

Categories: Analytic technologies, Data warehouse appliances, Data warehousing, Exadata, Market share and customer counts, Oracle

17 Comments

September 21, 2009

Notes on the Oracle Database 11g Release 2 white paper

The Oracle Database 11g Release 2 white paper I cited a couple of weeks ago has evidently been edited, given that a phrase I quoted then is no longer to be found. Anyhow, here are some quotes from and comments on what evidently is the latest version. Read more

Categories: Analytic technologies, Archiving and information preservation, Cache, Columnar database management, Data warehouse appliances, Data warehousing, Database compression, Exadata, Memory-centric data management, OLTP, Oracle, Oracle TimesTen, Parallelization, Solid-state memory, Storage, Theory and architecture

8 Comments

September 3, 2009

Teradata really means that those 100+ appliances are in PRODUCTION

I was misremembering. It turns out that when Teradata said it had over 100 appliances “in production”, it meant that >100 hardware-based appliances are actually in production. If you add in the software-only “appliances,” and count test/development as well as true production, the total rises to >200.

I tried to get a finer breakdown out of Teradata on a disclosable basis, but failed. The ostensible reason is that public companies often don’t do that sort of thing without permission from the investor relations department, and Teradata’s marketers evidently haven’t felt a sense of urgency about getting permission to, for example, communicate how well just the 25xx series is doing.

Categories: Data warehouse appliances, Data warehousing, Market share and customer counts, Teradata

1 Comment

September 3, 2009

SAS on Netezza and other Netezza extensibility

I chatted with SAS CTO Keith Collins yesterday about the new SAS/Netezza in-database parallel data mining scoring offering. My impression is that this is very similar to SAS’ current Teradata support, notwithstanding SAS’ and Teradata’s apparent original intention of offering in-database modeling by now as well.

I gather this is a big performance-enhancing deal, just as it is for SPSS or Oracle’s own data mining over Oracle. However, I must confess to not yet understanding why. That is, I don’t know what’s so complicated about data mining scoring algorithms that makes hand-coding them in SQL particularly forbidding. My naive view of data mining is that you do a big regression to get a bunch of weights, and the resulting scoring algorithm is a linear combination of a few dozen variables. Evidently, that’s not quite right.
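
For what it's worth, the naive view above can be sketched in a few lines. This illustrates only that naive view, not how SAS, SPSS, or Oracle actually implement scoring; the function names, column names, and weights are all hypothetical.

```python
def linear_score(weights, intercept, row):
    """Score one record as an intercept plus a weighted sum of its variables."""
    return intercept + sum(w * row[name] for name, w in weights.items())


def scoring_sql(weights, intercept, table):
    """Build the equivalent hand-coded SQL expression for the same model."""
    terms = " + ".join(f"{w!r} * {name}" for name, w in weights.items())
    return f"SELECT {intercept!r} + {terms} AS score FROM {table}"


if __name__ == "__main__":
    # Hypothetical weights, as if produced by a regression.
    weights = {"age": 0.5, "income": 0.25}
    print(linear_score(weights, 1.0, {"age": 40, "income": 100}))
    print(scoring_sql(weights, 1.0, "customers"))
```

If scoring really were just this, the SQL would be a one-line expression over a few dozen columns, which is exactly why the difficulty of hand-coding it is puzzling.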

Anyhow, it turns out that SAS held off on this work until it could be done for TwinFin. That’s largely because TwinFin lets partners write code on Intel CPUs, while previously they had to write in C for Netezza’s FPGAs. I got a similar sense from at least one other Netezza partner as well.

Categories: Data warehouse appliances, Data warehousing, Netezza, Predictive modeling and advanced analytics, SAS Institute

5 Comments

September 2, 2009

Teradata has over 100 appliances in production

I recently wrote that Teradata had gotten serious about appliance product lines, and had non-trivial sales figures for them. In a press release today, Teradata is now explicitly saying (emphasis mine):

Teradata now has more than 100 appliances in production, including the Data Mart Appliance 551, the Data Warehouse Appliance 2550, and the Extreme Data Appliance 1550, which complement the core platform, the Teradata Active Enterprise Data Warehouse 5550.

The breakdowns on that are NDA, and anyhow I can’t find them immediately in my notes.* But if memory serves — while a lot of those appliances are used for test and development, a whole other lot of them are used to do actual production query-answering work. (Edit: Memory turned out to be wrong.) Read more

Categories: Data warehouse appliances, Data warehousing, Market share and customer counts, Teradata

2 Comments
