October 17, 2010

Where ParAccel is at

Until recently, I was extremely critical of ParAccel’s marketing. But there was an almost-clean sweep of the relevant ParAccel executives, and the specific worst practices I was calling out have for the most part been eliminated. So I was open to talking and working with ParAccel again, and that’s now happening. On my recent California trip, I chatted with three ParAccel folks for a few hours. Based on that and other conversations, here’s the current ParAccel story as I understand it.

Categories: Benchmarks and POCs, Columnar database management, Database compression, Investment research and trading, Memory-centric data management, ParAccel, Solid-state memory, Storage, Vertica Systems

10 Comments

June 14, 2010

Best practices for analytic DBMS POCs

When you are selecting an analytic DBMS or appliance, most of the evaluation boils down to two questions:

How quickly and cost-effectively does it execute SQL?
What analytic functionality, SQL or otherwise, does it do a good job of executing?

And so, in undertaking such a selection, you need to start by addressing three issues:

What does “speed” mean to you?
What does “cost” mean to you?
What analytic functionality do you need anyway?

Categories: Benchmarks and POCs, Data warehousing, Exadata, Netezza, ParAccel, Teradata

7 Comments

June 11, 2010

Ingres VectorWise technical highlights

After working through problems with travel, cell phones, and so on, Peter Boncz of VectorWise finally caught up with me for a regrettably brief call. Peter gave me the strong impression that what I’d written in the past about VectorWise had been and remained accurate, so I focused on filling in the gaps. Highlights included:

Categories: Actian and Ingres, Analytic technologies, Benchmarks and POCs, Columnar database management, Data warehousing, Database compression, Open source, VectorWise

2 Comments

April 16, 2010

Story of an analytic DBMS evaluation

One of our readers was kind enough to walk me through his analytic DBMS evaluation process. The story is:

The X Company (XCo) has a <1 TB database.
100s of XCo’s customers log in at once to run reports. 50-200 concurrent queries is a good target number.
XCo had been “suffering” with Oracle and wanted to upgrade.
XCo didn’t have a lot of money to spend. Netezza pulled out of the sales cycle early due to budget (and this was recently enough that Netezza Skimmer could have been bid).
Greenplum didn’t offer any references that approached the desired number of concurrent users.
Ultimately the evaluation came down to Vertica and ParAccel.
Vertica won.

Notes on the Vertica vs. ParAccel selection include:

Categories: Analytic technologies, Benchmarks and POCs, Buying processes, Data warehousing, Greenplum, Netezza, Oracle, ParAccel, Vertica Systems

7 Comments

April 12, 2010

Greenplum Chorus and Greenplum 4.0

Greenplum is making two product announcements this morning. Greenplum 4.0 is a revision of the core Greenplum database technology. In addition, Greenplum is announcing Greenplum Chorus, which is the first product release instantiating last year’s EDC (Enterprise Data Cloud) vision statement and marketing campaign.

Greenplum 4.0 highlights and related observations include:

Categories: Analytic technologies, Benchmarks and POCs, Data integration and middleware, Data warehousing, EAI, EII, ETL, ELT, ETLT, Greenplum, Market share and customer counts, Petabyte-scale data management, Specific users, Telecommunications, Theory and architecture

5 Comments

March 18, 2010

XtremeData update

I talked with Geno Valente of XtremeData tonight. Highlights included:

XtremeData still hasn’t sold any dbX stuff (they’ve had a side business in generic FPGA-based boards paying the bills for years). Well, there may have been some paid POCs (proofs of concept) or something, but real sales haven’t come through yet.
XtremeData does have three prospects who have said “Yes”, and expects one order to come through this month.
XtremeData continues to believe it shines when:
- Data models are complex
- In particular, there are complex joins
- In particular, two large tables have to be joined with each other, under circumstances where no product can avoid doing vast data redistribution
XtremeData insists that none of the nice things Bill Inmon has said about it, including in webinars, were for pay or other similar business compensation. That’s quite unusual.
XtremeData is coming out with a new product, codenamed the Personal Data Warehouse (PDW), which:
- Is ready to go into beta test
- Should be launched in a month and a half or so
- Will have a different name when it is launched

Naming aside,

Categories: Analytic technologies, Benchmarks and POCs, Data warehouse appliances, Data warehousing, Database compression, Kickfire, Market share and customer counts, Netezza, Pricing, XtremeData

5 Comments

October 30, 2009

A question on MDX performance

An enterprise user wrote in with a question that boils down to:

What are reasonable MDX performance expectations?

MDX doesn’t come up in my life very much, and I don’t have much intuition about it. E.g., I don’t know whether one can slap an MDX-to-SQL converter on top of a fast analytic RDBMS and go to town. What’s more, I’m heading off on vacation and don’t feel like researching the matter myself in the immediate future. 🙂

So here’s the long form of the question. Any thoughts?

I have a general question on assessing the performance of an OLAP technology using a set of MDX queries. I would be interested to know if there are any benchmark MDX performance tests/results comparing different OLAP technologies (which may be based on different underlying DBMS’s if appropriate) on similar hardware setup, or even comparisons of complete appliance solutions. More generally, I want to determine what performance limits I could reasonably expect on what I think are fairly standard servers.

In my own work, I have set up a star schema model centered on a Fact table of 100 million rows (approx 60 columns), with dimensions ranging in cardinality from 5 to 10,000. In ad hoc analytics, is it expected that any query against such a dataset should return a result within a minute or two (i.e. before a user gets impatient), regardless of whether that query returns 100 cells or 50,000 cells (without relying on any aggregate table or caching mechanism)? Or is that level of performance only expected with a high end massively parallel software/hardware solution? The server specs I’m testing with are: 32-bit 4 core, 4GB RAM, 7.2k RPM SATA drive, running Windows Server 2003; 64-bit 8 core, 32GB RAM, 3 Gb/s SAS drive, running Windows Server 2003 (x64).

I realise that caching of query results and pre-aggregation mechanisms can significantly improve performance, but I’m coming from the viewpoint that in purely exploratory analytics, it is not possible to have all combinations of dimensions calculated in advance, in addition to being maintained.
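One way to frame the reader's question is a back-of-envelope scan estimate. The sketch below takes the row and column counts from the question, but the average column width and the disk throughput are purely illustrative assumptions of mine, not measurements, and the function names are hypothetical. It ignores compression, indexes, caching, and CPU cost, so treat it as a rough bound on I/O time rather than a prediction.

```python
# Back-of-envelope estimate of how long a full fact-table scan might take.
# ROWS and COLUMNS come from the reader's question. Everything else is an
# ASSUMPTION chosen only for illustration.

ROWS = 100_000_000          # from the question
COLUMNS = 60                # from the question (approximate)
AVG_BYTES_PER_COLUMN = 8    # assumption: average uncompressed column width
DISK_MB_PER_SEC = 100       # assumption: sequential rate of one 7.2k RPM SATA drive


def table_size_bytes(rows, columns, avg_bytes_per_column):
    """Uncompressed size of the data that has to be read."""
    return rows * columns * avg_bytes_per_column


def scan_seconds(size_bytes, mb_per_sec):
    """Time to read size_bytes sequentially at mb_per_sec (decimal MB)."""
    return size_bytes / (mb_per_sec * 1_000_000)


def estimate(columns_read):
    """Return (bytes read, seconds) when a query touches columns_read columns."""
    size = table_size_bytes(ROWS, columns_read, AVG_BYTES_PER_COLUMN)
    return size, scan_seconds(size, DISK_MB_PER_SEC)


if __name__ == "__main__":
    cases = (
        ("row store, all 60 columns", COLUMNS),
        ("column store, 5 columns touched", 5),
    )
    for label, cols in cases:
        size, secs = estimate(cols)
        print(f"{label}: {size / 1e9:.1f} GB, about {secs / 60:.1f} minutes")
```

Under those assumptions, reading every column of every row lands well outside the "minute or two" window on a single drive, while reading only a handful of columns lands inside it. Change the two assumed constants to match real hardware and data before drawing any conclusion.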

Categories: Analytic technologies, Benchmarks and POCs, Data warehousing, MOLAP

16 Comments

September 30, 2009

Facts and rumors

Vertica is putting out a press release today touting its 100th customer, and talking of triple-digit growth last year.
Multiple sources have told me that the DATAllegro system is being thrown out of Dell, so evidently Dell is telling this to one and all. If that goes through, this would presumably leave TEOCO as DATAllegro’s single happy customer. (I haven’t checked with Microsoft for its view.)
A rumor has it that InfiniBand technology vendor Voltaire, Ltd. privately claims triple-digit sales of switches for Exadata 1 (I think that would be one switch per Exadata installation, not per rack). Based just on a quick glance, this is far from confirmed by Voltaire’s earnings conference call transcripts or SEC filings. However, the most recent transcript does seem to indicate Voltaire got multiple Exadata deals in the telecommunications sector, and suggests some Exadata penetration in other sectors as well.
I was told of a classified-agency user that has >1 petabyte of data on Exadata 1 and 600 terabytes or so on Netezza. My not-obviously-biased source says the agency is distinctly happier with Netezza than Exadata.
Like ParAccel, Oracle just got dinged for TPC-related misbehavior.
Rumor has it that Sun has no intention of helping ParAccel rerun its withdrawn TPC-H benchmark.
ParAccel has withdrawn from its home page the claim to be the “CERTIFIED” price-performance leader. This seems to confirm that the claim was a reference to the TPC-H. In my opinion, that was a gross misrepresentation of what the TPC-H shows.

Categories: Benchmarks and POCs, Data warehouse appliances, Data warehousing, DATAllegro, Exadata, Market share and customer counts, Microsoft and SQL*Server, Netezza, Oracle, ParAccel, Petabyte-scale data management, Specific users, Telecommunications

3 Comments

July 27, 2009

XtremeData announces its DBx data warehouse appliance

XtremeData is announcing its DBx data warehouse appliance today. Highlights include:

Categories: Benchmarks and POCs, Data warehouse appliances, Data warehousing, Pricing, XtremeData

34 Comments

July 8, 2009

While I’m venting about benchmarks

Late last year, Vertica made hoo-hah about what it called a world-record data warehouse load speed benchmark. I wrote at the time that this showed Vertica wasn’t painfully slow at loading, always a concern with column stores. But otherwise I mocked the idea that there was something useful to be learned from the whole exercise.

Well, guess what? In a throwaway line in a comment on Daniel Abadi’s blog, Barry Zane of ParAccel pointed out

we posted a load rate of almost 9TB/hour, which is, of course record breaking on its own

Quite right.

I hope the nonsense stops there, but I’m not optimistic …

Categories: Benchmarks and POCs, Columnar database management, Data integration and middleware, EAI, EII, ETL, ELT, ETLT, Vertica Systems
