Database compression

Analysis of technology that compresses data within a database management system. Related subjects include:

October 15, 2010

Notes on data warehouse appliance prices

I’m not terribly motivated to do a detailed analysis of data warehouse appliance list prices, in part because:

That said, here are some notes on data warehouse appliance prices. Read more

September 15, 2010

Aster Data nCluster Version 4.6

The main thing in Aster Data nCluster Version 4.6 is Aster’s version of hybrid row-column store technology. Technical highlights include:

So Aster Data has now joined Greenplum/EMC among row-based analytic DBMS vendors with hybrid row-column stores. Oracle will join them some day, and the same probably applies to other row-based vendors as well. Similarly, Aster Data will probably join Oracle some day in having columnar compression. And so this all fits the model:

Read more

August 18, 2010

More on temp space, compression, and “random” I/O

My PhD was in a probability-related area of mathematics (game theory), so I tend to squirm when something is described as “random” that clearly is not. That said, a comment by Shilpa Lawande on our recent flash/temp space discussion suggests the following way of framing a key point:

If everybody else is cool with it too, I can live with that. 🙂

Meanwhile, I talked again with Tim Vincent of IBM this afternoon. Tim endorsed the temp space/Flash fit, but with a different emphasis, which upon review I find I don’t really understand. The idea is:

My problem with that is: Flash typically has lower write than read IOPS (I/O operations per second), so being (relatively) write-intensive would, to a first approximation, seem if anything to disfavor a workload for flash.

On the plus side, I was reminded of something I should have noted when I wrote about DB2 compression before:

Much like Vertica, DB2 operates on compressed data all the way through, including in temp space.
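To make "operates on compressed data" concrete, here is a minimal sketch of answering a simple equality count directly against an encoded column, without expanding it back into rows. It assumes a toy dictionary-plus-run-length encoding; the function names and the encoding itself are illustrative assumptions, not a description of how Vertica or DB2 actually store or process data.

```python
from itertools import groupby

def encode_column(values):
    """Toy encoding (an assumption for illustration): dictionary codes, then run lengths."""
    dictionary = sorted(set(values))
    code_of = {v: i for i, v in enumerate(dictionary)}
    runs = [(code, len(list(group)))
            for code, group in groupby(code_of[v] for v in values)]
    return dictionary, runs

def count_equal(dictionary, runs, target):
    """Count rows equal to target using only the encoded form.

    The column is never decompressed: one dictionary lookup finds the code,
    and the answer is the sum of the matching run lengths.
    """
    if target not in dictionary:
        return 0
    code = dictionary.index(target)
    return sum(length for c, length in runs if c == code)

# Hypothetical column of order statuses.
statuses = ["open", "open", "closed", "open", "open", "open"]
dictionary, runs = encode_column(statuses)
open_count = count_equal(dictionary, runs, "open")
```

The work done is proportional to the number of runs rather than the number of rows, which is the general reason keeping intermediate results compressed (in temp space or elsewhere) can cut I/O.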

August 16, 2010

Vertica’s innovative architecture for flash, plus more about temp space than you perhaps wanted to know

Vertica is announcing:

In other words, Vertica has succumbed to the common delusion that it’s a good idea to put out half-baked press releases the week of TDWI conferences. But if we look past that kind of all-too-common nonsense, Vertica is highlighting an interesting technical story, about how the analytic DBMS industry can exploit solid-state memory technology.

* Upgrades to Vertica FlexStore to handle flash memory, actually released as part of Vertica 4.0

** With Fusion I/O

To set the context, let’s recall a few points I’ve noted in the past:

Taken together, those points tell us:

For optimal price/performance, analytic DBMS should support databases that run part on flash, part on disk.

While all this is a future for some other analytic DBMS vendors, Vertica is shipping it today.* What’s more, three aspects of Vertica’s architecture make it particularly well-suited for hybrid flash/disk storage, in each case for a similar reason – you can get most of the performance benefit of all-flash for a relatively low actual investment in flash chips:  Read more

June 21, 2010

The Netezza and IBM DB2 approaches to compression

Thursday, I spent 3 ½ hours talking with 10 of Netezza’s more senior engineers. Friday, I talked for 1 ½ hours with IBM Fellow and DB2 Chief Architect Tim Vincent, and we agreed we needed at least 2 hours more. In both cases, the compression part of the discussion seems like a good candidate to split out into a separate post. So here goes.

When you sell a row-based DBMS, as Netezza and IBM do, there are a couple of approaches you can take to compression. First, you can compress the blocks of rows that your DBMS naturally stores. Second, you can compress the data in a column-aware way. Both Netezza and IBM have chosen completely column-oriented compression, with no block-based techniques entering the picture to my knowledge. But that’s about as far as the similarity between Netezza and IBM compression goes.  Read more
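The two approaches can be contrasted with a rough sketch. The first function compresses a block of rows generically, exactly as the rows are laid out; the second pair encodes each column separately, using its own dictionary and run lengths. The table, the function names, and the specific encodings are all assumptions chosen for illustration, and neither is meant to represent what Netezza or IBM actually does.

```python
import zlib
from itertools import groupby

# Hypothetical toy table of (region, status, amount) rows.
ROWS = [("east", "open", 10), ("east", "open", 12),
        ("west", "closed", 10), ("west", "closed", 11)] * 50

def compress_block(rows):
    """Approach 1: serialize a block of rows as stored, then compress the bytes generically."""
    raw = "\n".join(",".join(str(v) for v in row) for row in rows).encode()
    return zlib.compress(raw)

def decompress_block(blob):
    lines = zlib.decompress(blob).decode().split("\n")
    return [(r, s, int(a)) for r, s, a in (line.split(",") for line in lines)]

def encode_column(values):
    """Approach 2, per column: dictionary-encode the values, then run-length-encode the codes."""
    dictionary = sorted(set(values))
    code_of = {v: i for i, v in enumerate(dictionary)}
    runs = [(code, len(list(group)))
            for code, group in groupby(code_of[v] for v in values)]
    return dictionary, runs

def decode_column(dictionary, runs):
    return [dictionary[code] for code, length in runs for _ in range(length)]

def compress_columns(rows):
    """Pivot the rows into columns and encode each column independently."""
    return [encode_column(list(col)) for col in zip(*rows)]

def decompress_columns(encoded):
    cols = [decode_column(d, r) for d, r in encoded]
    return list(zip(*cols))
```

The practical difference is that the column-aware form knows each column's type and value distribution, so it can pick an encoding per column, while the block form treats the rows as opaque bytes.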

June 21, 2010

Netezza’s silicon balance

As I’ve mentioned in a couple of other posts, Netezza is stressing that the most recent wave of its technology is software-only, with no hardware upgrades made or needed. In other words, Netezza boxes already have all the silicon they need. But of course, there are really at least three major aspects to the Netezza silicon story – FPGA (Field-Programmable Gate Array), CPU, and RAM.

The major parts of Netezza’s FPGA software are:

If I understood correctly, each Netezza FPGA runs two of each of these engines in parallel.

Related link

June 12, 2010

The underlying technology of QlikView

QlikTech* finally decided both to become a client and, surely not coincidentally, to give me more technical detail about QlikView than it had when last we talked a couple of years ago. Indeed, I got to spend a couple of hours on the phone not just with Anthony Deighton, but also with QlikTech’s Hakan Wolge, who wrote 70-80% of the code in QlikView 1.0, and remains in effect QlikTech’s chief architect to this day.

*Or, as it now appears to be called, Qlik Technologies.

Let’s start with some quick reminders:

Let’s also dispose of one confusion right up front, namely QlikTech’s use of the word associative:  Read more

June 11, 2010

Ingres VectorWise technical highlights

After working through problems with travel, cell phones, and so on, Peter Boncz of VectorWise finally caught up with me for a regrettably brief call. Peter gave me the strong impression that what I’d written in the past about VectorWise had been and remained accurate, so I focused on filling in the gaps. Highlights included:  Read more

June 5, 2010

Algebraix

I talked Friday with Chris Piedemonte and Gary Sherman, respectively the Cofounder/CTO and Chief Mathematician of Algebraix, who hooked up together for this project back in 2003 or 2004. (Algebraix is the company formerly known as XSPRADA.) Algebraix makes an analytic DBMS, somewhat based on the ideas of extended set theory, that runs on SMP (Symmetric MultiProcessing) boxes. Like all analytic DBMS vendors, Algebraix has on some occasions run some queries orders of magnitude faster than they ran on the systems users were looking to replace.

Algebraix’s secret sauce is that the DBMS keeps reorganizing and recopying the data on disk, to optimize performance in response to expected query patterns (automatically inferred from queries it’s seen so far). This sounds a lot like the Infobright story, with some of the more obvious differences being:  Read more

May 23, 2010

More on Sybase IQ, including Version 15.2

Back in March, Sybase was kind enough to give me permission to post a slide deck about Sybase IQ. Well, I’m finally getting around to doing so. Highlights include but are not limited to:

Sybase IQ may have a bit of a funky architecture (e.g., no MPP), but the age of the product and the substantial revenue it generates have allowed Sybase to put in a bunch of product features that newer vendors haven’t gotten around to yet.

More recently, Sybase volunteered permission for me to preannounce Sybase IQ Version 15.2 by a few days (it’s scheduled to come out this week). Read more
