Vertica Systems

Analysis of columnar data warehouse DBMS vendor Vertica Systems.

October 23, 2007

Vertica update

Vertica has been quietly selling product for three quarters and has about 50 customers.

Andy Ellicott of Vertica pointed me to the above Richard Hackathorn quote. Sadly, he asked me not to name and shame another analyst who foolishly said Vertica hadn’t “launched” yet.

But then, I understand. I’m also not going to identify the client who gave me fits by insisting on believing that nonsense, even in the face of the well-known facts that Vertica has shipping product, paying customers, and so on.

October 19, 2007

Gartner 2007 Magic Quadrant for Data Warehouse Database Management Systems

February, 2011 edit: I’ve now commented on Gartner’s 2010 Data Warehouse Database Management System Magic Quadrant as well.

It’s early autumn, the leaves are turning in New England, and Gartner has issued another Magic Quadrant for data warehouse DBMS. (Edit: As of January, 2009, that link is dead, but this one works.) The big winners vs. last year are Greenplum and, secondarily, Sybase. Teradata continues to lead. Oracle has also leapfrogged IBM, and there are various other minor adjustments as well, among repeat mentionees Netezza, DATAllegro, Sand, Kognitio, and MySQL. HP isn’t on the radar yet; ditto Vertica. Read more

September 19, 2007

Some pushback from DATAllegro against the columnar argument

I was chatting with Stuart Frost this evening (DATAllegro’s CEO). As usual, I grilled him about customer counts; as usual, he was evasive, but expressed general ebullience about the pace of business; also as usual, he was charming and helpful on other subjects.

In particular, we talked about the Vertica story, and he offered some interesting pushback. Part was blindingly obvious — Vertica’s not in the marketplace yet; when they are, the product won’t be mature; and so on. Part was the also obvious “we can do most of that ourselves” line of argument, some of which I’ve summarized in a comment here. But he made two other interesting points as well. Read more

September 18, 2007

The core of the Vertica story still seems to be compression

Back in March, I suggested that compression was a central and compelling aspect of Vertica’s story. Well, in their new blog, the Vertica guys now strongly reinforce that impression.

I recommend those two Database Column posts (by Sam Madden) highly. I’ve rarely seen such a clear, detailed presentation of a company’s technical argument. My own thoughts on the subject boil down to:

September 6, 2007

The Vertica guys have their own blog now

I’ve written a considerable amount about Vertica and/or the opinions of Mike Stonebraker. Now the Vertica guys have their own blog, which they pledge will not just be a rehash of Vertica marketing pitches — notwithstanding the Vertica-related wordplay in the blog’s name.*

*Those guys are good at wordplay.

June 15, 2007

Fast RDF in specialty relational databases

When Mike Stonebraker and I discussed RDF yesterday, he quickly turned to suggesting fast ways of implementing it over an RDBMS. Then, quite characteristically, he sent over a paper that allegedly covered them, but actually was about closely related schemes instead. 🙂 Edit: The paper has a new, stable URL. Hat tip to Daniel Abadi.

All minor confusion aside, here’s the story. At its core, an RDF database is one huge three-column table storing subject-property-object triples. In the naive implementation, you then have to join this table to itself repeatedly. Materialized views are a good start, but they only take you so far. Read more
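To make the self-join point concrete, here is a minimal sketch in Python with SQLite of the naive one-big-triple-table layout described above. The table name, URIs, and query are my own invention, not drawn from the paper or from any particular RDF store; note that even a simple question touching three properties already joins the table to itself twice.

```python
# Minimal sketch (my own illustration) of the "one huge three-column table"
# RDF layout described above, using SQLite. Answering even a simple question
# requires joining the triple table to itself once per extra property traversed.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (subject TEXT, property TEXT, object TEXT)")
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        ("person:alice", "worksFor", "org:vertica"),
        ("person:alice", "email", "alice@example.com"),
        ("person:bob",   "worksFor", "org:datallegro"),
        ("person:bob",   "email", "bob@example.com"),
        ("org:vertica",  "locatedIn", "city:billerica"),
    ],
)

# "Email addresses of people who work for an organization located in Billerica"
# touches three properties, so the table is joined to itself twice.
rows = conn.execute(
    """
    SELECT email.object
    FROM triples AS works
    JOIN triples AS loc   ON loc.subject   = works.object
    JOIN triples AS email ON email.subject = works.subject
    WHERE works.property = 'worksFor'
      AND loc.property   = 'locatedIn'
      AND loc.object     = 'city:billerica'
      AND email.property = 'email'
    """
).fetchall()
print(rows)  # [('alice@example.com',)]
```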

June 14, 2007

Bracing for Vertica

The word from Vertica is that the product will go GA in the fall, and that they’ll have blow-out benchmarks to exhibit.

I find this very credible. Indeed, the above may even be something of an understatement.

Vertica’s product surely has some drawbacks, which will become more apparent when the product is more available for examination. So I don’t expect row-based appliance innovators Netezza and DATAllegro to just dry up and blow away. On the other hand, not every data warehousing product is going to live long and prosper, and I’d rate Vertica’s chances higher than those of several competitors that are actually already in GA.

March 24, 2007

Mike Stonebraker on database compression — comments

In my opinion, the key part of Mike Stonebraker’s fascinating note on data compression was (emphasis mine):

The standard wisdom in most row stores is to use block compression. Hence, a storage block is compressed using a single technique (say Lempel-Ziv or dictionary). The technique chosen then compresses all the attributes in all the columns which occur on the block. In contrast, Vertica compresses a storage block that only contains one attribute. Hence, it can use a different compression scheme for each attribute. Obviously a compression scheme that is type-specific will beat an implementation that is “one size fits all”.

It is possible for a row store to use a type-specific compression scheme. However, if there are 50 attributes in a record, then it must remember the state for 50 type-specific implementations, and complexity increases significantly.

In addition, all row stores we are familiar with decompress each storage block on access, so that the query executor processes uncompressed tuples. In contrast, the Vertica executor processes compressed tuples. This results in better L2 cache locality, less main memory copying and generally much better performance.

Of course, any row store implementation can rewrite their executor to run on compressed data. However, this is a rewrite – and a lot of work.

Read more
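To illustrate what per-column, type-specific compression can look like in practice, here is a small Python sketch of my own (not Vertica's code): run-length encoding for a sorted, repetitive integer column and dictionary encoding for a low-cardinality string column. The columns and data are invented for the example.

```python
# Illustrative sketch (not Vertica's implementation) of type-specific,
# per-column compression: run-length encoding for a sorted, repetitive
# integer column; dictionary encoding for a low-cardinality string column.
from itertools import groupby

def rle_encode(values):
    """Run-length encode a column: [(value, run_length), ...]."""
    return [(v, len(list(run))) for v, run in groupby(values)]

def dict_encode(values):
    """Dictionary-encode a column: (codebook, list of small integer codes)."""
    codebook = {v: i for i, v in enumerate(dict.fromkeys(values))}
    return codebook, [codebook[v] for v in values]

# Two columns of the same (imaginary) fact table, compressed independently.
order_dates = [20070101] * 4 + [20070102] * 3 + [20070103] * 3
regions = ["EAST", "WEST", "EAST", "EAST", "WEST",
           "WEST", "EAST", "WEST", "EAST", "EAST"]

print(rle_encode(order_dates))
# [(20070101, 4), (20070102, 3), (20070103, 3)]  (10 values stored as 3 pairs)

codebook, codes = dict_encode(regions)
print(codebook, codes)
# {'EAST': 0, 'WEST': 1} [0, 1, 0, 0, 1, 1, 0, 1, 0, 0]
```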

March 24, 2007

Mike Stonebraker explains column-store data compression

The following is by Mike Stonebraker, CTO of Vertica Systems, copyright 2007, as part of our ongoing discussion of data compression. My comments are in a separate post.

Row Store Compression versus Column Store Compression

I. Introduction

There are three aspects of space requirements, which we discuss in this short note, namely:

structural space requirements

index space requirements

attribute space requirements.

Read more

March 21, 2007

Compression in columnar data stores

We have lively discussions going on about columnar data stores vs. vertically partitioned row stores. Part is visible in the comment thread to a recent post. Other parts come in private comments from Stuart Frost of DATAllegro and Mike Stonebraker of Vertica et al.

To me, the most interesting part of what the Vertica guys are saying is twofold. One is that data compression just works better in column stores than row stores, perhaps by a factor of 3, because “the next thing in storage is the same data type, rather than a different one.” Frankly, although Mike has said this a couple of times, I haven’t understood yet why row stores can’t be smart enough to compress just as well. Yes, it’s a little harder than it would be in a columnar system; but I don’t see why the challenge would be insuperable.

The second part is even cooler, namely the claim that column stores allow the processors to operate directly on compressed data. But once again, I don’t see why row stores can’t do that too. For example, when you join via bitmapped indices, exactly what you’re doing is operating on highly-compressed data.
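As a toy illustration of what operating directly on compressed data means (my own sketch, not any vendor's executor), here is a count and a sum computed straight off a run-length-encoded column, without ever materializing the individual values:

```python
# Toy sketch (mine, not any particular executor) of operating directly on
# compressed data: count and sum over a run-length-encoded column without
# decompressing it back into individual values.
rle_column = [(100, 500_000), (200, 300_000), (300, 200_000)]  # (value, run_length)

row_count = sum(length for _, length in rle_column)
column_sum = sum(value * length for value, length in rle_column)

print(row_count)   # 1000000
print(column_sum)  # 100*500000 + 200*300000 + 300*200000 = 170000000
```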
