Columnar database management

Analysis of products and issues in column-oriented database management systems. Related subjects include:

January 15, 2010

Vertica slaughters Sybase in patent litigation

Back in August, 2008, I pooh-poohed Sybase’s patent lawsuit against Vertica. Filed in the notoriously patent-holder-friendly East Texas courts, the suit basically claimed patent rights over the whole idea of a columnar RDBMS. It was pretty clear that this suit was meant to be a model for claims against other columnar RDBMS vendors as well, should they ever achieve material marketplace success.

If a recent Vertica press release is to be believed, Sybase got clobbered. The meat is:

… Sybase has admitted that under the claim construction order issued by the Court on November 9, 2009, “Vertica does not infringe Claims 1-15 of U.S. Patent No. 5,794,229.” Sybase further acknowledged that because the Court ruled that all the remaining claims in the patent (claims 16-24) were invalid, “Sybase cannot prevail on those claims.”

For those counting along at home — the patent only has 24 claims in total.

I have no idea whether Sybase can still cobble together grounds for appeal, or claims under some other patent. But for now, this sounds like a total victory for Vertica.

Edit: I’ve now seen a PDF of a filing suggesting the grounds under which Sybase will appeal. Basically, it alleges that the judge erred in defining a “page” of data too narrowly. Note that if Sybase prevails on appeal on that point, Vertica has a bunch of other defenses that haven’t been litigated yet. It further seems that Sybase may have recently filed another patent case against Vertica, in a different venue, based on a different patent.

One annoying blog troll excepted, is anybody surprised at this outcome?

Categories: Columnar database management, Data warehousing, Sybase, Vertica Systems

6 Comments

January 15, 2010

There sure seem to be a lot of inaccuracies on ParAccel’s website

In what is actually an interesting post on database compression, ParAccel CTO Barry Zane threw in

Anyone who has met with us knows ParAccel shies away from hype.

But like many things ParAccel says, that is not true.

Edit (October, 2010): Like other posts I’ve linked to from Barry Zane’s blog, that one seems to be gone, with the URL redirecting elsewhere on ParAccel’s website.

The latest whoppers came in the form of several customers ParAccel listed on its website who hadn’t actually bought ParAccel’s DBMS, nor even decided to do so. It is fairly common to to claim a customer win, then retract the claim due to lack of permission to disclose. But that’s not what happened in these cases. Based on emails helpfully shared by a ParAccel competitor competing in some of those accounts, it seems clear that ParAccel actually posted fabricated claims of customer wins. Read more

Categories: Columnar database management, Data warehousing, Database compression, Market share and customer counts, ParAccel, Telecommunications

24 Comments

December 29, 2009

This and that

I have various subjects backed up that I don’t really want to write about at traditional blog-post length. Here are a few of them. Read more

Categories: Analytic technologies, Columnar database management, MarkLogic, Open source, Oracle, Streaming and complex event processing (CEP), Structured documents, Theory and architecture, Vertica Systems

2 Comments

November 23, 2009

Boston Big Data Summit keynote outline

Last month, Bob Zurek asked me to give a talk on “Big Data”, where “big” is anything from a few terabytes on up, then moderate a panel on cloud computing. We agreed that I could talk just from notes, without slides. So, since I have them typed up, I’m posting them below.

Categories: Analytic technologies, Archiving and information preservation, Business intelligence, Cloud computing, Clustering, Columnar database management, Data warehouse appliances, Data warehousing, DBMS product categories, Humor, Investment research and trading, Log analysis, MapReduce, Market share and customer counts, NoSQL, OLTP, Open source, Parallelization, Presentations, Pricing, Solid-state memory, Storage, Telecommunications, Theory and architecture, Web analytics

6 Comments

November 7, 2009

Calpont’s InfiniDB

Since its inception, Calpont has gone through multiple management teams, strategies, and investor groups. What it hadn’t done, ever, is actually shipped a product. Last week, however, Calpont introduced a free/open source DBMS, InfiniDB, with technical details somewhat reminiscent of what Calpont was promising last April. Highlights include:

Like Infobright, Calpont’s InfiniDB is a columnar DBMS consisting of a MySQL front end and a columnar storage engine.
Community edition InfiniDB runs on a single server.
One of commercial/enterprise edition InfiniDB’s main claims to fame will be MPP support.
There’s no announced time frame for commercial edition InfiniDB.
InfiniDB’s current compression story is dictionary/token only, with decompression occurring before joins are executed. Improvement is a roadmap item.
Indeed, InfiniDB has many roadmap items, a few of which can be found here. Also, a great overview of InfiniDB’s current state and roadmap can be found in this MySQL Performance Blog thread. (And follow the links there to find performance discussions of other free analytic DBMS.)
One thing InfiniDB already has that is still a roadmap item for Infobright is the ability to run a query across multiple cores at once.
One thing free InfiniDB has that Infobright only offers in its Enterprise Edition is ACID-compliant Insert/Update/Delete. (Note: I wish people would stop saying that Infobright Enterprise Edition isn’t ACID-compliant, since that point was cleared up a while ago.)
InfiniDB has no indexes or materialized views.
However, InfiniDB’s retrieval is expedited by something called “Extents,” which sounds a lot like Netezza’s zone maps.

Being on vacation, I’ll stop there for now. (If it weren’t for Tropical Storm/ depression Ida, I might not even be posting this much until I get back.)

Categories: Analytic technologies, Calpont, Columnar database management, Data warehousing, Database compression, Infobright, MySQL, Open source

3 Comments

October 18, 2009

Introduction to SenSage

I visited with SenSage on my two most recent trips to San Francisco. Both visits were, through no fault of SenSage’s, hasty. Still, I think I have enough of a handle on SenSage basics to be worth writing up.

General SenSage highlights include:

Categories: Analytic technologies, Columnar database management, Data warehousing, Database compression, Log analysis, MapReduce, SenSage, Streaming and complex event processing (CEP), Telecommunications

3 Comments

October 18, 2009

Kickfire capacity and pricing

Kickfire’s marketing communication efforts are still a work in progress. Kickfire did finally relax its secrecy about FPGA-vs.-custom-silicon – not coincidentally during Netezza’s recent publicity cycle. That wise choice helped Kickfire get some favorable attention recently for its technical and market strategy, e.g. from Daniel Abadi, Merv Adrian and, kicking things off — as it were — me. Weeks after a recent Kickfire product release, there’s finally a fairly accurate data sheet up, although there’s still one self-defeatingly misleading line I’ll comment on below. Pricing is a whole other area of confusion, although it seems that current list prices have been inadvertently* leaked in Merv’s post linked above, with only one inaccuracy that I can detect.**

*I gather from the company that they forgot to tell Merv pricing was NDA.

** Merv cited a price as “starting” that I believe to be top-of-the-line. No criticism of Merv is implied in that; Kickfire has not been very clear in communicating hard numbers.

All that said, if one takes Kickfire’s marketing statements literally, Kickfire list pricing is around $20-50K per terabyte for a few small, fixed, high-performance configurations. That’s all-in, for plug-and-play appliances. What’s more, that range is based on the actual published user data capacity numbers for various Kickfire models, which I think are low for several reasons:

Kickfire doesn’t officially admit that its model with 14.4 terabytes of disk can manage more than 6 terabytes of data, even though it clearly can.
Actually, those 14.4 terabytes of disk can be increased or lowered as you choose.
The basic compression figures implied in those calculations seem conservative.
Compression figures are a lot more conservative yet, in that Kickfire assumes you’ll have a lot of actual indexes on your data. I’m not sure that’s necessary for most workloads.

Categories: Columnar database management, Data warehouse appliances, Data warehousing, Database compression, Kickfire, Pricing

3 Comments

October 14, 2009

Greenplum is going hybrid columnar as well

Over the past summer, Vertica, VectorWise, and Oracle all announced flavors of hybrid row/columnar storage. Now it’s Greenplum’s turn. Greenplum is actually offering true columnar storage, as opposed to Oracle’s PAX-like scheme — and also as opposed to the kind of Frankencolumn storage Daniel Abadi decries. For example, you don’t have to do a join to retrieve multiple columns; you just ask for them and there they are. Similarly, Greenplum doesn’t maintain explicit row IDs – whether in row-oriented or column-oriented append-only storage – relying instead on block-level header information. Read more

Categories: Analytic technologies, Columnar database management, Data warehousing, Database compression, Greenplum, Theory and architecture

12 Comments

October 6, 2009

Oracle and Vertica on compression and other physical data layout features

In my recent post on Exadata pricing, I highlighted the importance of Oracle’s compression figures to the discussion, and the uncertainty about same. This led to a Twitter discussion featuring Greg Rahn* of Oracle and Dave Menninger and Omer Trajman of Vertica. I also followed up with Omer on the phone. Read more

Categories: Columnar database management, Data models and architecture, Data warehousing, Database compression, Oracle, Theory and architecture, Vertica Systems

14 Comments

October 5, 2009

Oracle Exadata 2 capacity pricing

Summary of Oracle Exadata 2 capacity pricing

Analyzing Oracle Exadata pricing is always harder than one would first think. But I’ve finally gotten around to doing an Oracle Exadata 2 pricing spreadsheet. The main takeaways are:

If we believe Oracle’s claims of 10X compression, Exadata 2 costs more per terabyte of user data than Netezza TwinFin — $22-26K/TB vs. TwinFin’s <$20K — but less than the Teradata 2550.
These figures are highly sensitive to assumptions about Oracle’s hybrid columnar compression.
Similarly, if Netezza or Teradata were to significantly upgrade their own compression, the price comparison would look quite different.
Options such as Data Mining or Oracle Spatial add 12% or so each to Exadata’s total system price.

Longer version

When Oracle introduced Exadata last year it was, well, expensive. Exadata 2 has now been announced, and it is significantly cheaper than Exadata 1 per terabyte of user data, based on:

Similar overall pricing
Twice the disk capacity
Better compression

13 Comments

← Previous Page — Next Page →

Search our blogs and white papers

Monash Research blogs

DBMS 2 covers database management, analytics, and related technologies.
Text Technologies covers text mining, search, and social software.
Strategic Messaging analyzes marketing and messaging strategy.
The Monash Report examines technology and public policy issues.
Software Memories recounts the history of the software industry.

User consulting

Building a short list? Refining your strategic plan? We can help.

Vendor advisory

We tell vendors what's happening -- and, more important, what they should do about it.

Monash Research highlights

Learn about white papers, webcasts, and blog highlights, by RSS or email.

Links
- Monash Research
- White Papers
Admin
- Log in

Columnar database management

Vertica slaughters Sybase in patent litigation

There sure seem to be a lot of inaccuracies on ParAccel’s website

This and that

Boston Big Data Summit keynote outline

Calpont’s InfiniDB

Introduction to SenSage

Kickfire capacity and pricing

Greenplum is going hybrid columnar as well

Oracle and Vertica on compression and other physical data layout features

Oracle Exadata 2 capacity pricing

Search our blogs and white papers

Monash Research blogs

User consulting

Vendor advisory

Monash Research highlights

Recent posts

Categories

Date archives

Admin