Oracle

Analysis of software titan Oracle and its efforts in database management, analytics, and middleware. Related subjects include:

Oracle TimesTen
(in The Monash Report) Operational and strategic issues for Oracle
(in Software Memories) Historical notes on Oracle
Most of what’s written about in this blog

October 25, 2009

Reports of perfectly-balanced hardware configurations are greatly exaggerated

Data warehouse appliance and software appliance vendors like to claim that they’ve worked out just the right hardware configuration(s), and that a single configuration is correct for a fairly broad range of workloads. But there are a lot of reasons to be dubious about that. Specific vendor evidence includes:

Teradata ascribes considerable importance to a Virtual Storage technology whose main purpose is to allow mixing of heterogeneous storage devices in a single system. And the discussion rarely suggests that these parts will be in a rigid fixed relationship.
Netezza — as Teradata keeps reminding me — often sells boxes with the expectation that they won’t be filled with data, so as to increase spindle count and hence performance.
Oracle/Sun have dropped some comments about Exadata being more flexibly configured going forward.
Kickfire’s new “high-end” appliance lets you attach fairly arbitrary amounts of external storage.
And of course, software-only analytic DBMS vendors run their software in all sorts of hardware and storage environments.

What’s more, the claim never made a lot of sense anyway. With the rarest of exceptions, even a single data warehouse’s workload will contain different queries that strain different parts of the system in different ratios. Calculating the “ideal” hardware configuration for that single workload would be forbiddingly difficult. And even if one could calculate it, it almost surely would differ from another user’s “ideal” configuration. How a single hardware configuration can be “ideally balanced” for a broad class of use cases boggles the imagination.
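To make the "different ratios" point concrete, here is a toy sketch. Everything in it is an assumption for illustration: the two query profiles, their CPU and I/O demands, and the simplification that CPU and I/O overlap perfectly. None of it describes any vendor's system.

```python
# Hypothetical per-query resource demands. These numbers are invented
# for illustration and do not come from any real workload.
QUERIES = {
    "scan_heavy": {"cpu_seconds": 20.0, "io_gb": 400.0},
    "join_heavy": {"cpu_seconds": 300.0, "io_gb": 50.0},
}


def query_time(demand, cores, io_gb_per_sec):
    """Elapsed time under the assumption that CPU and I/O overlap
    perfectly, so the slower of the two resources sets the pace."""
    cpu_time = demand["cpu_seconds"] / cores
    io_time = demand["io_gb"] / io_gb_per_sec
    return max(cpu_time, io_time)


def balanced_ratio(demand):
    """Cores per GB/s of I/O bandwidth at which neither resource idles.

    Setting cpu_seconds / cores equal to io_gb / io_gb_per_sec gives
    cores / io_gb_per_sec = cpu_seconds / io_gb.
    """
    return demand["cpu_seconds"] / demand["io_gb"]


if __name__ == "__main__":
    for name, demand in QUERIES.items():
        print(name, "balanced cores per GB/s:", balanced_ratio(demand))
```

Under these made-up demands, the two queries want core-to-bandwidth ratios that are two orders of magnitude apart, so any single configuration leaves one resource mostly idle for at least one of them.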

Categories: Data warehouse appliances, Data warehousing, Exadata, Kickfire, Netezza, Oracle, Teradata

6 Comments

October 6, 2009

Oracle Exadata customers presenting at Oracle Open World

Greg Rahn tweeted a list of Exadata-focused sessions at Oracle Open World next week. As Oracle employees and supporters have been foreshadowing, there will be Exadata users and user-like folks presenting. I identified what look like half a dozen (not counting any who, for example, will make surprise appearances at keynote addresses).

Categories: Data warehousing, Exadata, Market share and customer counts, Oracle, Teradata

5 Comments

October 6, 2009

Oracle and Vertica on compression and other physical data layout features

In my recent post on Exadata pricing, I highlighted the importance of Oracle’s compression figures to the discussion, and the uncertainty about same. This led to a Twitter discussion featuring Greg Rahn* of Oracle and Dave Menninger and Omer Trajman of Vertica. I also followed up with Omer on the phone.

Categories: Columnar database management, Data models and architecture, Data warehousing, Database compression, Oracle, Theory and architecture, Vertica Systems

14 Comments

October 6, 2009

Oracle’s version of “actually, we’ve been doing MapReduce all along too”

In a recent blog post, Jean-Pierre Dijcks of Oracle makes the argument that Oracle has supported MapReduce all along, essentially because:

You can do lots of procedural logic in the Oracle database, in a broad choice of languages, so in particular you can do Map steps.
You can do lots of procedural logic in the Oracle database, in a broad choice of languages, so in particular you can do Reduce steps.
Oracle offers a mechanism for parallelizing procedural logic.

Oracle doesn’t appear to have an explicit Map/Reduce programming interface, but I wouldn’t be surprised if Oracle Consulting cranked one out at some point to meet customer demand.

The post goes on to claim the usual in-database MapReduce benefit of avoiding the overhead of intermediate query result materialization. Presumably, then, Oracle’s quasi-MapReduce would also lack query fault-tolerance.
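For readers who want the pattern spelled out, here is a generic sketch of the three ingredients listed above: a Map step, a Reduce step, and a mechanism for running the procedural logic in parallel. It is plain Python, not Oracle's interface. The function names `map_step`, `reduce_step`, and `map_reduce`, and the word-count task, are all assumptions chosen for illustration.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_step(record):
    """Map: emit (key, value) pairs from one input record."""
    return [(word.lower(), 1) for word in record.split()]


def reduce_step(key, values):
    """Reduce: combine all values seen for one key."""
    return key, sum(values)


def map_reduce(records, workers=4):
    # Run the Map step over the input records in parallel.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_step, records))

    # Shuffle: group intermediate values by key, entirely in memory.
    groups = defaultdict(list)
    for pairs in mapped:
        for key, value in pairs:
            groups[key].append(value)

    # Run the Reduce step once per key.
    return dict(reduce_step(key, values) for key, values in groups.items())


if __name__ == "__main__":
    print(map_reduce(["Oracle does Map", "Oracle does Reduce"]))
```

Note that the intermediate pairs live only in memory. That avoids the cost of materializing them, but it also means that in this sketch a failure partway through forces the whole job to start over, which is the trade-off mentioned above.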

Categories: Analytic technologies, MapReduce, Oracle, Parallelization

1 Comment

October 5, 2009

Oracle Exadata 2 capacity pricing

Summary of Oracle Exadata 2 capacity pricing

Analyzing Oracle Exadata pricing is always harder than one would first think. But I’ve finally gotten around to doing an Oracle Exadata 2 pricing spreadsheet. The main takeaways are:

If we believe Oracle’s claims of 10X compression, Exadata 2 costs more per terabyte of user data than Netezza TwinFin — $22-26K/TB vs. TwinFin’s <$20K — but less than the Teradata 2550.
These figures are highly sensitive to assumptions about Oracle’s hybrid columnar compression.
Similarly, if Netezza or Teradata were to significantly upgrade their own compression, the price comparison would look quite different.
Options such as Data Mining or Oracle Spatial add 12% or so each to Exadata’s total system price.
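The sensitivity to compression can be sketched with simple arithmetic. The only figures taken from the summary above are the 10X claim and the $22-26K/TB range. The alternative ratios of 5X and 3X are hypothetical, and the sketch assumes that system price and raw capacity stay fixed, so that price per terabyte of user data scales inversely with the compression ratio.

```python
CLAIMED_RATIO = 10.0              # Oracle's claimed compression, per the post
PRICE_RANGE_AT_CLAIM = (22_000.0, 26_000.0)  # $/TB of user data at 10X, per the post


def price_per_user_tb(price_at_claimed, claimed_ratio, actual_ratio):
    """Rescale a $/TB figure to a different compression ratio.

    Assumes fixed system price and fixed raw capacity, so user-data
    capacity is proportional to the compression ratio.
    """
    return price_at_claimed * claimed_ratio / actual_ratio


if __name__ == "__main__":
    # 5X and 3X are hypothetical what-if ratios, not vendor figures.
    for ratio in (10.0, 5.0, 3.0):
        low, high = (
            price_per_user_tb(p, CLAIMED_RATIO, ratio)
            for p in PRICE_RANGE_AT_CLAIM
        )
        print(f"{ratio:>4.0f}X compression: ${low:,.0f} to ${high:,.0f} per TB")
```

Under that assumption, halving the achieved compression doubles the price per terabyte of user data, which is why the comparison with Netezza and Teradata hinges so heavily on which compression figure one believes.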

Longer version

When Oracle introduced Exadata last year it was, well, expensive. Exadata 2 has now been announced, and it is significantly cheaper than Exadata 1 per terabyte of user data, based on:

Similar overall pricing
Twice the disk capacity
Better compression

13 Comments

October 1, 2009

Yahoo wants to do decapetabyte-scale data warehousing in Hadoop

My old client Mark Tsimelzon moved over to Yahoo after Coral8 was acquired, and I caught up with him last month. He turns out to be running development for a significant portion of Yahoo’s Hadoop effort: everything other than HDFS (Hadoop Distributed File System). Yahoo evidently plans to, within a year or so, get Hadoop to the point that it is managing tens of petabytes of data for Yahoo, with reasonable data warehousing functionality.

Highlights of our visit included:

Dozens of people at Yahoo, full-time or close to it, are doing Hadoop development that will wind up being open sourced. In particular, everything Mark’s team does goes to open source.
Yahoo is moving as much of its analytics to Hadoop as possible. Much of this is being moved away from Oracle and from Yahoo’s own Everest.
A column store is being put on top of HDFS, based on Yahoo technology. Columns will be striped across nodes. Perhaps that’s why the effort is called Project Zebra.
Mark believes that in a year Hadoop will be much further along in meeting traditional data warehousing requirements, in areas such as:
- Metadata
- SLAs/high availability/other workload management
- Data retention policies
- Security/privacy*
Yahoo views the time-to-market benefits of Hadoop as being more important than TCO.

6 Comments

September 30, 2009

Facts and rumors

Vertica is putting out a press release today touting its 100th customer, and talking of triple digit growth last year.
Multiple sources have told me that the DATAllegro system is being thrown out of Dell, so evidently Dell is telling this to one and all. If that goes through, this would presumably leave TEOCO as DATAllegro’s single happy customer. (I haven’t checked with Microsoft for its view.)
A rumor has it that Infiniband technology vendor Voltaire, Ltd. privately claims triple-digit sales of switches for Exadata 1 (I think that would be one switch per Exadata installation, not per rack). Based just on a quick glance, this is far from confirmed by Voltaire’s earnings conference call transcripts or SEC filings. However, the most recent transcript does seem to indicate Voltaire got multiple Exadata deals in the telecommunications sector, and suggests some Exadata penetration in other sectors as well.
I was told of a classified-agency user that has >1 petabyte of data on Exadata 1 and 600 terabytes or so on Netezza. My not-obviously-biased source says the agency is distinctly happier with Netezza than Exadata.
Like ParAccel, Oracle just got dinged for TPC-related misbehavior.
Rumor has it that Sun has no intention of helping ParAccel rerun its withdrawn TPC-H benchmark.
ParAccel has withdrawn from its home page the claim to be the “CERTIFIED” price-performance leader. This seems to confirm that the claim was a reference to the TPC-H. In my opinion, that was a gross misrepresentation of what the TPC-H shows.

Categories: Benchmarks and POCs, Data warehouse appliances, Data warehousing, DATAllegro, Exadata, Market share and customer counts, Microsoft and SQL*Server, Netezza, Oracle, ParAccel, Petabyte-scale data management, Specific users, Telecommunications

3 Comments

September 29, 2009

What Nielsen really uses in data warehousing DBMS

In its latest earnings call, Oracle made a reference to The Nielsen Company that was — to put it politely — rather confusing. I just plopped down in a chair next to Greg Goff, who evidently runs data warehousing at Nielsen, and had a quick chat. Here’s the real story.

The Nielsen Company has over half a petabyte of data on Netezza in the US. This installation is growing.
The Nielsen Company indeed has 45 terabytes or whatever of data on Oracle in its European (Customer) Information Factory. This is not particularly growing. Nielsen’s Oracle data warehouse has been built up over the past 9 years. It’s not new. It’s certainly not on Exadata, nor planned to move to Exadata.
These are not single-instance databases. Nielsen’s biggest single Netezza database is 20 terabytes or so of user data, and its biggest single Oracle database is 10 terabytes or so.
Much (most?) of the rest of the installations are customer data marts and the like, based in each case on the “big” central database. (That’s actually a classic data mart use case.) Greg said that Netezza’s capabilities to spin out those databases seemed pretty good.
That 10 terabyte Oracle data warehouse instance requires a lot of partitioning effort and so on in the usual way.
Nielsen has no immediate plans to replace Oracle with Netezza.
Nielsen actually has 800 terabytes or so of Netezza equipment. Some of that is kept more lightly loaded, for performance.

Categories: Analytic technologies, Data mart outsourcing, Data warehouse appliances, Data warehousing, Netezza, Oracle, Specific users

6 Comments

September 29, 2009

Thoughts on the integration of OLTP and data warehousing, especially in Exadata 2

Oracle is pushing Exadata 2 as being a great system for any of OLTP (OnLine Transaction Processing), data warehousing or, presumably, the integration of same. This claim rests on a few premises.

Categories: Analytic technologies, Data warehouse appliances, Data warehousing, Exadata, OLTP, Oracle, Solid-state memory, Theory and architecture

36 Comments

September 25, 2009

The hunt for Oracle Exadata production references

Over the past four weeks, I’ve given speeches in Boston, DC, Milan, London, and SF,* attended a conference in Lyon, done a fair amount of consulting, and taken a few non-client briefings as well. That’s why I haven’t had much of a chance to sit down, analyze the tea leaves, and write about Exadata 2. (Small exception: Highlights from and remarks on the Oracle Database 11g Release 2 white paper.) I hope to do that soon.

*I’ll bop over to Chicago for the last of the series early next week.

But first — can anybody identify much in the way of Exadata production references? Oracle recently talked of a few flagship data warehouse customers, but those don’t seem to be running Exadata. I talked recently with an Oracle prospect from the US, who only got one reference from Oracle — in Eastern Europe. (Well, two references, if you also count the system integrator on the same deal.)

So far as I can tell, Oracle Exadata production sites are pretty scarce on the ground. What, if anything, am I missing?

Categories: Analytic technologies, Data warehouse appliances, Data warehousing, Exadata, Market share and customer counts, Oracle

17 Comments

Monash Research blogs

DBMS 2 covers database management, analytics, and related technologies.
Text Technologies covers text mining, search, and social software.
Strategic Messaging analyzes marketing and messaging strategy.
The Monash Report examines technology and public policy issues.
Software Memories recounts the history of the software industry.

User consulting

Building a short list? Refining your strategic plan? We can help.

Vendor advisory

We tell vendors what's happening -- and, more important, what they should do about it.

Monash Research highlights

Learn about white papers, webcasts, and blog highlights, by RSS or email.
