Columnar database management

Analysis of products and issues in column-oriented database management systems. Related subjects include:

April 18, 2011

Endeca topics

I visited my then-clients at Endeca in January. We focused on underpinnings (and strategic counsel) more than on coolness in what the product actually does. But going over my notes I think there’s enough to write up now.

Before saying much else about Endeca, there’s one confusion to dispose of: What’s the relationship between Endeca’s efforts in e-commerce (helping shoppers navigate websites) and business intelligence (helping people navigate their own data)? As Endeca tells it:

Endeca’s e-commerce and business intelligence efforts are reflections of the same technical approach. Indeed, I’m pretty sure Endeca’s product lines still share much/most of the same technology.
Endeca went after e-commerce first because that’s where the provable ROI was. As I pointed out a couple of times in 2007, Endeca became a market leader in that area.
Endeca increased its BI efforts later.
Circa 2009-10, Endeca differentiated its e-commerce and BI product lines from each other.
- An e-commerce line extension called Page Builder is what really got Endeca through the recent recession.
- The BI product line Latitude was launched in the fall of 2010.

Endeca’s positioning in the business intelligence market boils down to “investigative analytics for people who aren’t hardcore analysts.” Endeca’s technological support for that stresses: Read more

Categories: Business intelligence, Columnar database management, Database compression, Endeca

11 Comments

March 4, 2011

Teradata, Aster Data, and Teradata/Aster

Teradata is acquiring Aster Data. Naturally, the deal is being presented with a Treaty of Tordesillas kind of positioning — Teradata does X, Aster Data does Y, and everybody looks forward to having X and Y in the same product portfolio. That said, my initial positioning and product strategy thoughts on the Teradata/Aster combination go something like this. Read more

Categories: Analytic technologies, Aster Data, Columnar database management, Data warehouse appliances, Data warehousing, Database compression, RDF and graphs, Specific users, Teradata

9 Comments

February 11, 2011

Comments on the 2011 Forrester Wave for Enterprise Data Warehouse Platforms

The Forrester Wave: Enterprise Data Warehouse Platforms, Q1 2011 is now out,* hot on the heels of the Gartner Magic Quadrant. Unfortunately, this particular Forrester Wave is riddled with inaccuracy. Read more

Categories: Analytic technologies, Columnar database management, Data warehousing, EMC, Exadata, Greenplum, Netezza, Oracle, Pricing, SAP AG, Sybase, Teradata, Vertica Systems

8 Comments

February 6, 2011

Columnar compression vs. column storage

I’m getting the increasing impression that certain industry observers, such as Gartner, are really confused about columnar technology. (I further suspect that certain vendors are encouraging this confusion, as vendors commonly do.) So here are some basic points.

A simple way to think about the difference between columnar storage and columnar (or any other kind of) compression is this:

Columnar storage is a reference to how data is grouped together on disk (or in solid-state memory).
(Columnar) compression is a reference to whether the actual data is on disk, or whether you save space by storing some smaller substitute for the actual data.

Specifically, if data in a relational table is grouped together according to what row it’s in, then the database manager is called “row-based” or a “row store.” If it’s grouped together according to what column it’s in, then the database management system is called “columnar” or a “column store.” Increasingly, row-based and columnar storage are being hybridized.

There are two main kinds of compression — compression of bit strings and more intelligent compression of actual data values. Compression of actual data values can reasonably be called “columnar,” in that different columns of data can be compressed in different ways, often depending only on the data in that column.* Read more

Categories: Columnar database management, Data warehousing, Database compression, Exadata, Vertica Systems

21 Comments

February 5, 2011

Comments on the Gartner 2010/2011 Data Warehouse Database Management Systems Magic Quadrant

Edit: Comments on the February, 2012 Gartner Magic Quadrant for Data Warehouse Database Management Systems — and on the companies reviewed in it — are now up.

The Gartner 2010 Data Warehouse Database Management Systems Magic Quadrant is out. I shall now comment, just as I did to varying degrees on the 2009, 2008, 2007, and 2006 Gartner Data Warehouse Database Management System Magic Quadrants.

Note: Links to Gartner Magic Quadrants tend to be unstable. Please alert me if any problems arise; I’ll edit accordingly.

In my comments on the 2008 Gartner Data Warehouse Database Management Systems Magic Quadrant, I observed that Gartner’s “completeness of vision” scores were generally pretty reasonable, but their “ability to execute” rankings were somewhat bizarre; the same remains true this year. For example, Gartner ranks Ingres higher by that metric than Vertica, Aster Data, ParAccel, or Infobright. Yet each of those companies is growing nicely and delivering products that meet serious cutting-edge analytic DBMS needs, neither of which has been true of Ingres since about 1987. Read more

Categories: 1010data, Actian and Ingres, Analytic technologies, Aster Data, Benchmarks and POCs, Columnar database management, Data warehouse appliances, Data warehousing, Database compression, EMC, Exadata, Greenplum, illuminate Solutions, Infobright, Microsoft and SQL*Server, Netezza, Open source, ParAccel, Pricing, SAND Technology, Storage, Sybase, Teradata, Vertica Systems, Workload management

23 Comments

February 2, 2011

Exadata notes

It’s been a while since I penetrated Oracle’s tight message control and actually talked with them about Exadata. But Doug Henschen wrote a good article about Exadata based on an Andy Mendelsohn webcast. I agree with almost all of it. At first I was a little surprised that Exadata’s emphasis shift from data warehousing to OLTP/generic consolidation hasn’t gone more quickly, but on the other hand:

On the data warehouse side Exadata can alleviate screaming pain points.
In OLTP consolidation, Exadata mainly can save money. (Yes, I just said a product from Oracle can save customers money, and I meant it. You may stop laughing at any time.)

Doug did overstate when he said that columnar architectures give 100X or more compression. That doesn’t happen. Yes, columnar compression can be >10X in a variety of use cases, while pre-Exadata Oracle index bloat can approach 10X at times; but even if you’re counting that way I doubt there are many instances in which it actually multiplies out to >100.

In other Exadata news, the long-standing observation that Oracle doesn’t like to do on-site Exadata POCs still holds true. A couple of existing Oracle users — one rather well-known — recently told me that Oracle won’t let them text Exadata except on Oracle premises. In one case, this is a deal-breaker keeping Exadata from being considered for a purchase, and Oracle still won’t budge.

Finally, I’m pretty sure that this “new” Softbank Teradata replacement Oracle has been touting since September as competitive evidence — which Doug’s article also references — isn’t quite what it sounds like. I believe Teradata’s version of the story, which somewhat edited goes like this: Read more

Categories: Benchmarks and POCs, Columnar database management, Data warehouse appliances, Database compression, Exadata, Oracle, Teradata

26 Comments

January 18, 2011

Architectural options for analytic database management systems

Mike Stonebraker recently kicked off some discussion about desirable architectural features of a columnar analytic DBMS. Let’s expand the conversation to cover desirable architectural characteristics of analytic DBMS in general. Read more

Categories: Analytic technologies, Aster Data, Benchmarks and POCs, Columnar database management, Data pipelining, Data warehousing, Database compression, Exadata, Michael Stonebraker, Oracle, Solid-state memory, Theory and architecture

5 Comments

January 12, 2011

Mike Stonebraker on “real column stores”

Mike Stonebraker has a post up on Vertica’s blog trying to differentiate “real” from “pretend” column stores. (Edit: That post seems to have come back down, but as of 1/19 it can be found in Google Cache.) In essence, Mike argues that the One Right Way to design a column store is Vertica’s, a position that Daniel Abadi used to share but since has retreated from.

There are some good things about that post, and some not-so-good. The worst paragraph is probably

Several row-store vendors (including Oracle, Greenplum and Aster Data) now claim to be selling a column store. Obviously, this would require a complete rewrite of a DBMS to move from Figure 1 to Figure 2. Hence, none of the “pretenders” have actually done this. Instead all have implemented some aspects of column stores, and then claim to be the real thing. This blog defines what the “real enchilada” looks like, and how to tell it from the pretenders.

which I question on two levels. Read more

Categories: Aster Data, Columnar database management, Database compression, Michael Stonebraker, Sybase, Theory and architecture, Vertica Systems

24 Comments

October 17, 2010

Where ParAccel is at

Until recently, I was extremely critical of ParAccel’s marketing. But there was an almost-clean sweep of the relevant ParAccel executives, and the specific worst practices I was calling out have for the most part been eliminated. So I was open to talking and working with ParAccel again, and that’s now happening. On my recent California trip, I chatted with three ParAccel folks for a few hours. Based on that and other conversation, here’s the current ParAccel story as I understand it.
Read more

Categories: Benchmarks and POCs, Columnar database management, Database compression, Investment research and trading, Memory-centric data management, ParAccel, Solid-state memory, Storage, Vertica Systems

10 Comments

September 15, 2010

Aster Data nCluster Version 4.6

The main thing in Aster Data nCluster Version 4.6 is Aster’s version of hybrid row-column store technology. Technical highlights include:

Aster Data is simply taking the number of storage options in nCluster up from 1 to 2 – you now can store a table either in the Aster Data nCluster row store or column store.
In fact, you can store parts of a table in the Aster Data nCluster row store and other parts in the Aster Data nCluster column store. I‘m a bit foggy on the details of that – Aster makes discussions of partitioning more complicated than they need to be — but it definitely sounds pretty flexible. Edit: See comment thread below.
Anything you can do with the Aster Data nCluster row store you can also do with the Aster Data nCluster column store. In particular, that includes all of Aster Data’s analytic functionality.
The same is true vice-versa. There is no columnar-oriented kind of compression in Aster Data nCluster at this time.

So Aster Data has now joined Greenplum/EMC among row-based analytic DBMS vendors with hybrid row-column stores. Oracle will join them some day, and the same probably applies to other row-based vendors as well. Similarly, Aster Data will probably join Oracle some day in having columnar compression. And so this all fits the model:

Aster Data has an impressively competitive analytic relational DBMS, considering the youth and size of the company.
Aster Data is a leader in extending its analytic relational DBMS by integrating in other analytic processing capabilities.

Categories: Analytic technologies, Aster Data, Columnar database management, Data warehousing, Database compression

4 Comments

← Previous Page — Next Page →

Search our blogs and white papers

Monash Research blogs

DBMS 2 covers database management, analytics, and related technologies.
Text Technologies covers text mining, search, and social software.
Strategic Messaging analyzes marketing and messaging strategy.
The Monash Report examines technology and public policy issues.
Software Memories recounts the history of the software industry.

User consulting

Building a short list? Refining your strategic plan? We can help.

Vendor advisory

We tell vendors what's happening -- and, more important, what they should do about it.

Monash Research highlights

Learn about white papers, webcasts, and blog highlights, by RSS or email.

Links
- Monash Research
- White Papers
Admin
- Log in