Columnar database management

Analysis of products and issues in column-oriented database management systems. Related subjects include:

April 18, 2008

Kickfire kicks off

I chatted with Raj Cherabuddi and others on the Kickfire (formerly C2) team for over an hour on Monday, and now have a better sense of their story. There are some very basic questions I still don’t have answers to; I’ll fill those in when I can.

Highlights of what I have and haven’t figured out so far include:

*Somebody – perhaps adman extraordinaire Rick Bennett? — may want to check my memory on this, but I think Oracle’s famed “Gentlemen, start your snails” ad in the early 1990s was about PC World tests, not TPCs. Oracle also had an ad about WW1-style planes nosediving, but I don’t think those referenced TPCs either.

March 4, 2008

Odd article on Sybase IQ and columnar systems

Intelligent Enterprise has an article on Sybase IQ and columnar systems that leaves me shaking my head. E.g., it ends by saying Netezza has a columnar architecture (uh, no). It also quotes an IBM exec as saying only 10-20% of what matters in a data warehouse DBMS is performance (already an odd claim), and then has him saying columnar only provides a 10% performance gain (let’s be generous and hope that’s a misquote).

Also from the article — and this part seems more credible — is:

“Sybase IQ revenues were up 70% last year,” said Richard Pledereder, VP of engineering. … Sybase now claims 1,200 Sybase IQ customers. It runs large data warehouses powered by big, multiprocessor servers. Priced at $45,000 per CPU, those IQ customers now account for a significant share of Sybase’s revenues, although the company won’t break down revenues by market segment.

Read more

February 19, 2008

Mike Stonebraker may be oversimplifying data warehousing just a tad

Mike Stonebraker has now responded to the second post in my five-part database diversity series. Takeaways and rejoinders include: Read more

February 18, 2008

ParAccel technical highlights

I recently caught up with ParAccel’s CTO Barry Zane and Marketing VP Kim Stanick for a long technical discussion, which they have graciously continued by email. It would be impolitic in the extreme to comment on what led up to that. Let’s just note that many things I’ve previously written about ParAccel are now inoperative, and go straight to the highlights.

Read more

February 8, 2008

Load speeds and related issues in columnar DBMS

Please do not rely on the parts of the post below that are about ParAccel. See our February 18 post about ParAccel instead.

I’ve already posted about a chat I had with Mike Stonebraker regarding Vertica yesterday. I naturally raised the subject of load speed, unaware that Mike’s colleague Stan Zlodnik had posted at length about load speed the day before. Given that post, it seems timely to go into a bit more detail, and in particular to address three questions:

  1. Can columnar DBMS do operational BI?
  2. Can columnar DBMS do ELT (Extract-Load-Transform, as opposed to ETL)?
  3. Are columnar DBMS’ load speeds a problem other than in issues #1 and #2?

Read more

October 29, 2007

ParAccel opens the kimono slightly

Please do not rely on the parts of this post that draw a distinction between in-memory and disk-based operation. See our February 18, 2008 post about ParAccel instead. It turns out that communication with ParAccel was yet worse than I had realized.

Officially launched today at the TDWI conference, ParAccel is out to compete with Netezza. Right out of the chute, ParAccel may have surpassed Netezza in at least one area: pointlessly annoying secrecy. (In other regards I love them dearly, but that paranoia can be a real pain.) As best I can remember, here are some things about ParAccel that I both am allowed to say and find interesting:

Read more

October 28, 2007

Infobright responds

An InfoBright employee posted something quite reasonable-looking in response to my inaugaral post about BrightHouse. Even so, InfoBright asked if they could substitute something with a slightly different tone. I agreed. Here’s what they sent in.

Curt, thanks for the write-up and the opportunity to talk about our customer success stories. As you say, our customer story is definitely “more than zero.” We are addressing a number of critical customer issues with our unique approach to data warehousing.

Infobright currently has 5 customers – customers that have bucked the trend of throwing hardware at the problem. To be perfectly braggadocio about this, we have never lost a competitive proof of concept in which we’ve been engaged. This is accomplished with the horsepower of one box (though for redundancy customers may deploy multiple boxes with a load balancer). Read more

October 23, 2007

Vertica — just star and snowflake schemas?

One of the longest-running technotheological disputes I know of is the one pitting flat/normalized data warehouse architectures vs. cubes, stars, and snowflake schemas. Teradata, for example, is a flagwaver for the former camp; Microstrategy is firmly in the latter. (However, that doesn’t keep lots of retailers from running Microstrategy on Teradata boxes.) Attensity (a good Teradata partner) is in the former camp; text mining rival Clarabridge (sort of a Microstrategy spinoff) is in the latter. And so on.

Vertica is clearly in the star/snowflake camp as well. I asked them about this, and Vertica’s CTO Mike Stonebraker emailed a response. I’m reproducing it below, with light edits; the emphasis is also mine. Key points include:

Great question. This is something that we’ve thought a lot about and have done significant research on with large enterprise customers. … short answer is as follows:

Vertica supports star and snowflake schemas because that is the desired data structure for data warehousing. The overwhelming majority of the schemas we see are of this form, and we have highly optimized for this case. Read more

October 23, 2007

Vertica update

Vertica has been quietly selling product for three quarters and has about 50 customers.

Andy Ellicott of Vertica pointed me to the above Richard Hackathorn quote. Sadly, he asked me not to name and shame another analyst who foolishly said Vertica hadn’t “launched” yet.

But then, I understand. I’m also not going to identify the client who gave me fits by insisting on believing that nonsense, even in the face of the well-known facts that Vertica has shipping product, paying customers, and so on.

October 22, 2007

Infobright BrightHouse — columnar, VERY compressed, simple, and related to MySQL

To a first approximation, Infobright – maker of BrightHouse — is yet another data warehouse DBMS specialist with a columnar architecture, boasting great compression and running on commodity hardware, emphasizing easy set-up, simple administration, great price-performance, and hence generally low TCO. BrightHouse isn’t actually MPP yet, but Infobright confidently promises a generally available MPP version by the end of 2008. The company says that experience shows >10:1 compression of user data is realistic – i.e., an expansion ratio that’s fractional, and indeed better than 1/10:1. Accordingly, despite the lack of shared-nothing parallelism, Infobright claims a sweet spot of 1-10 terabyte warehouses, and makes occasional references to figures up to 30 terabytes or so of user data.

BrightHouse is essentially a MySQL storage engine, and hence gets a lot of connectivity and BI tool support features from MySQL for “free.” Beyond that, Infobright’s core technical idea is to chop columns of data into 64K chunks, called data packs, and then store concise information about what’s in the packs. The more basic information is stored in data pack nodes,* one per data pack. If you’re familiar with Netezza zone maps, data pack nodes sound like zone maps on steroids. They store maximum values, minimum values, and (where meaningful) aggregates, and also encode information as to which intervals between the min and max values do or don’t contain actual data values. Read more

← Previous PageNext Page →

Feed: DBMS (database management system), DW (data warehousing), BI (business intelligence), and analytics technology Subscribe to the Monash Research feed via RSS or email:

Login

Search our blogs and white papers

Monash Research blogs

User consulting

Building a short list? Refining your strategic plan? We can help.

Vendor advisory

We tell vendors what's happening -- and, more important, what they should do about it.

Monash Research highlights

Learn about white papers, webcasts, and blog highlights, by RSS or email.