October 17, 2010

Where ParAccel is at

Until recently, I was extremely critical of ParAccel’s marketing. But there was an almost-clean sweep of the relevant ParAccel executives, and the specific worst practices I was calling out have for the most part been eliminated. So I was open to talking and working with ParAccel again, and that’s now happening. On my recent California trip, I chatted with three ParAccel folks for a few hours. Based on that and other conversations, here’s the current ParAccel story as I understand it.
Read more

June 14, 2010

Best practices for analytic DBMS POCs

When you are selecting an analytic DBMS or appliance, most of the evaluation boils down to two questions:

And so, in undertaking such a selection, you need to start by addressing three issues:

Read more

June 11, 2010

Ingres VectorWise technical highlights

After working through problems with travel, cell phones, and so on, Peter Boncz of VectorWise finally caught up with me for a regrettably brief call. Peter gave me the strong impression that what I’d written in the past about VectorWise had been and remained accurate, so I focused on filling in the gaps. Highlights included: Read more

April 16, 2010

Story of an analytic DBMS evaluation

One of our readers was kind enough to walk me through his analytic DBMS evaluation process. The story is:

Notes on the Vertica vs. ParAccel selection include: Read more

April 12, 2010

Greenplum Chorus and Greenplum 4.0

Greenplum is making two product announcements this morning. Greenplum 4.0 is a revision of the core Greenplum database technology. In addition, Greenplum is announcing Greenplum Chorus, which is the first product release instantiating last year’s EDC (Enterprise Data Cloud) vision statement and marketing campaign.

Greenplum 4.0 highlights and related observations include: Read more

March 18, 2010

XtremeData update

I talked with Geno Valente of XtremeData tonight. Highlights included:

Naming aside, Read more

October 30, 2009

A question on MDX performance

An enterprise user wrote in with a question that boils down to:

What are reasonable MDX performance expectations?

MDX doesn’t come up in my life very much, and I don’t have much intuition about it. E.g., I don’t know whether one can slap an MDX-to-SQL converter on top of a fast analytic RDBMS and go to town. What’s more, I’m heading off on vacation and don’t feel like researching the matter myself in the immediate future. 🙂

So here’s the long form of the question. Any thoughts?

I have a general question on assessing the performance of an OLAP technology using a set of MDX queries. I would be interested to know if there are any benchmark MDX performance tests/results comparing different OLAP technologies (which may be based on different underlying DBMS’s if appropriate) on similar hardware setup, or even comparisons of complete appliance solutions. More generally, I want to determine what performance limits I could reasonably expect on what I think are fairly standard servers.

In my own work, I have set up a star schema model centered on a Fact table of 100 million rows (approx 60 columns), with dimensions ranging in cardinality from 5 to 10,000. In ad hoc analytics, is it expected that any query against such a dataset should return a result within a minute or two (i.e. before a user gets impatient), regardless of whether that query returns 100 cells or 50,000 cells (without relying on any aggregate table or caching mechanism)? Or is that level of performance only expected with a high end massively parallel software/hardware solution? The server specs I’m testing with are: 32-bit 4 core, 4GB RAM, 7.2k RPM SATA drive, running Windows Server 2003; 64-bit 8 core, 32GB RAM, 3 Gb/s SAS drive, running Windows Server 2003 (x64).

I realise that caching of query results and pre-aggregation mechanisms can significantly improve performance, but I’m coming from the viewpoint that in purely exploratory analytics, it is not possible to have all combinations of dimensions calculated in advance, in addition to being maintained.
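One rough way to frame the question is as back-of-envelope scan arithmetic. The sketch below is not a benchmark and not an MDX engine model. The row and column counts come from the question; the 8 bytes per value, the 80 MB/s sequential read rate for the SATA drive, and the 5 columns touched by a typical query are all assumptions chosen purely for illustration, and `full_scan_seconds` is a hypothetical helper.

```python
def full_scan_seconds(rows, columns_read, bytes_per_value, mb_per_second):
    """Estimate seconds to read the needed columns of a fact table once
    from disk, with no caching, compression, or aggregate tables."""
    total_bytes = rows * columns_read * bytes_per_value
    return total_bytes / (mb_per_second * 1_000_000)


ROWS = 100_000_000      # from the question
COLUMNS = 60            # from the question
BYTES_PER_VALUE = 8     # assumption: average uncompressed width per value
SATA_MB_PER_SEC = 80    # assumption: sequential read rate of one 7.2k RPM drive
COLUMNS_TOUCHED = 5     # assumption: columns a typical ad hoc query reads

# A row-oriented scan has to read every column of every row.
row_scan = full_scan_seconds(ROWS, COLUMNS, BYTES_PER_VALUE, SATA_MB_PER_SEC)

# A column-oriented scan reads only the columns the query touches.
column_scan = full_scan_seconds(ROWS, COLUMNS_TOUCHED, BYTES_PER_VALUE, SATA_MB_PER_SEC)

print(f"row-oriented full scan:    {row_scan:.0f} s")
print(f"column-oriented scan:      {column_scan:.0f} s")
```

Under those assumed numbers, a single-drive scan of all 60 columns lands well beyond the "minute or two" target, while reading only a handful of columns lands inside it. Real results would depend heavily on actual value widths, compression, memory, and the engine itself.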

September 30, 2009

Facts and rumors

July 27, 2009

XtremeData announces its DBx data warehouse appliance

XtremeData is announcing its DBx data warehouse appliance today. Highlights include: Read more

July 8, 2009

While I’m venting about benchmarks

Late last year, Vertica made a hoo-hah about what it called a world-record data warehouse load speed benchmark. I wrote at the time that this showed Vertica wasn’t painfully slow at loading, always a concern with column stores. But otherwise I mocked the idea that there was something useful to be learned from the whole exercise.

Well, guess what? In a throwaway line in a comment on Daniel Abadi’s blog, Barry Zane of ParAccel pointed out:

we posted a load rate of almost 9TB/hour, which is, of course record breaking on its own

Quite right.

I hope the nonsense stops there, but I’m not optimistic …
