Aster Data

Analysis of data warehouse DBMS vendor Aster Data. Related subjects include:

January 12, 2009

Gartner’s 2008 data warehouse database management system Magic Quadrant is out

February, 2011 edit: I’ve now commented on Gartner’s 2010 Data Warehouse Database Management System Magic Quadrant as well.

Gartner’s annual Magic Quadrant for data warehouse DBMS is out. Thankfully, vendors don’t seem to be taking it as seriously as usual, so I didn’t immediately hear about it. (I finally noticed it in a Greenplum pay-per-click ad.) Links to Gartner MQs tend to come and go, but as of now here are two working links to the 2008 Gartner Data Warehouse Database Management System MQ. My posts on the 2007 and 2006 MQs have also been updated with working links. Read more

Categories: 1010data, Actian and Ingres, Aster Data, Data warehousing, Exadata, Greenplum, HP and Neoview, IBM and DB2, illuminate Solutions, Microsoft and SQL*Server, MySQL, Netezza, Oracle, SAND Technology, Sybase, Teradata, Vertica Systems

12 Comments

November 15, 2008

High-performance analytics

For the past few months, I’ve collected a lot of data points to the effect that high-performance analytics – i.e., beyond straightforward query — is becoming increasingly important. And I’ve written about some of them at length. For example:

MapReduce – controversial or in some cases even disappointing though it may be – has a lot of use cases.
It’s early days, but Netezza and Teradata (and others) are beefing up their geospatial analytic capabilities.
Memory-centric analytics is in the spotlight.

Ack. I can’t decide whether “analytics” should be a singular or plural noun. Thoughts?

Another area that’s come up which I haven‘t blogged about so much is data mining in the database. Data mining accounts for a large part of data warehouse use. The traditional way to do data mining is to extract data from the database and dump it into SAS. But there are problems with this scenario, including: Read more

Categories: Aster Data, Data warehousing, EAI, EII, ETL, ELT, ETLT, Greenplum, MapReduce, Netezza, Oracle, Parallelization, SAS Institute, Teradata

6 Comments

November 7, 2008

Big scientific databases need to be stored somehow

A year ago, Mike Stonebraker observed that conventional DBMS don’t necessarily do a great job on scientific data, and further pointed out that different kinds of science might call for different data access methods. Even so, some of the largest databases around are scientific ones, and they have to be managed somehow. For example:

Microsoft just put out an overwrought press release. The substance seems to be that Pan-STARRS — a Jim Gray legacy also discussed in an August, 2008 Computerworld article — is adding 1.4 terabytes of image data per night, and one not so new database adds 15 terabytes per year of some kind of computer simulation output used to analyze protein folding. Both run on SQL Server, of course.
Kognitio has an astronomical database too, at Cambridge University, adding 1/2 a terabyte of data per night.
Oracle is used for a McGill University proteonomics database called CellMapBase. A figure of 50 terabytes of “mass storage” is included, which doesn’t include tape backup and so on.
The Large Hadron Collider, once it actually starts functioning, is projected to generate 15 petabytes of data annually, which will be initially stored on tape and then distributed to various computing centers around the world.
Netezza is proud of its ability to serve images and the like quickly, although off the top of my head I’m not thinking of a major customer it has in that area. (But then, if you just sell software, your academic discount can approach 100%; but if like Netezza you have an actual cost of goods sold, that’s not as appealing an option.)

Long-term, I imagine that the most suitable DBMS for these purposes will be MPP systems with strong datatype extensibility — e.g., DB2, PostgreSQL-based Greenplum, PostgreSQL-based Aster nCluster, or maybe Oracle.

Categories: Aster Data, Data types, Greenplum, IBM and DB2, Kognitio, Microsoft and SQL*Server, Netezza, Oracle, Parallelization, PostgreSQL, Scientific research

1 Comment

October 22, 2008

Update on Aster Data Systems and nCluster

I spent a few hours at Aster Data on my West Coast swing last week, which has now officially put out Version 3 of nCluster. Highlights included: Read more

Categories: Application areas, Aster Data, Data warehousing, Database compression, MapReduce, Market share and customer counts, Parallelization, Specific users, Theory and architecture, Web analytics

3 Comments

October 9, 2008

Aster Data on online marketing data warehousing

Aster Data’s blog is getting to be like Vertica’s, in that I find myself recommending a large fraction of its posts.

The virtue of the latest one is that it strings together several customer examples in related areas of online marketing (which is pretty much the only sector Aster has so far sold into). I’ve tended to overgeneralize a bit, and use terms like “web analytics” or “clickstream analysis” even when they don’t wholly apply. The Aster post is a good antidote to that.

Categories: Application areas, Aster Data, Data warehousing, Web analytics

1 Comment

October 7, 2008

Aster Data has a new release

Aster and I got our scheduling signals crossed, and I haven’t been briefed in detail yet. But Aster Data has a new release, and as usual is doing a great job telling their story in their own blog. The post summarizing nCluster 3.0 is here.

Categories: Aster Data

Web analytics — clickstream and network event data

It should surprise nobody that web analytics – and specifically clickstream data — is one of the biggest areas for high-end data warehousing. For example:

I believe that both of the previously mentioned petabyte+ databases on Greenplum will feature clickstream data.
Aster Data’s largest disclosed database, by almost two orders of magnitude, is at MySpace.
Clickstream analytics is a big application area for Vertica Systems.
Clickstream analytics is a big application area for Netezza.
Infobright’s customer success stories appear to be concentrated in clickstream analytics.
Coral8 tells me that CEP is also being used for clickstream data, although I suspect that a lot of Coral8’s evidence in that regard comes from a single flagship account. Edit: Actually, Coral8 has a bunch of clickstream customers.

Categories: Aleri and Coral8, Aster Data, Greenplum, Infobright, Netezza, Streaming and complex event processing (CEP), Vertica Systems, Web analytics

2 Comments

September 5, 2008

Dividing the data warehousing work among MPP nodes

I talk with lots of vendors of MPP data warehouse DBMS. I’ve now heard enough different approaches to MPP architecture that I think it might be interesting to contrast some of the alternatives.

Categories: Aster Data, Calpont, Exasol, Greenplum, Parallelization, Theory and architecture, Vertica Systems

22 Comments

September 5, 2008

Three different implementations of MapReduce

So far as I can see, there are three implementations of MapReduce that matter for enterprise analytic use – Hadoop, Greenplum’s, and Aster Data’s.* Hadoop has of course been available for a while, and used for a number of different things, while Greenplum’s and Aster Data’s versions of MapReduce – both in late-stage beta – have far fewer users.

*Perhaps Nokia’s Disco or another implementation will at some point join the list.

Earlier this evening I posted some Mike Stonebraker criticisms of MapReduce. It turns out that they aren’t all accurate across all MapReduce implementations. So this seems like a good time for me to stop stalling and put up a few notes about specific features of different MapReduce implementations. Here goes. Read more

Categories: Aster Data, Greenplum, MapReduce

3 Comments

September 2, 2008

Introduction to Aster Data and nCluster

I’ve been writing a lot about Greenplum since a recent visit. But on the same trip I met with Aster Data, and have talked with them further since. Let me now redress the balance and outline some highlights of the Aster Data story.

Categories: Analytic technologies, Aster Data, Data warehousing, Parallelization, Specific users

4 Comments

← Previous Page — Next Page →

Search our blogs and white papers

Monash Research blogs

DBMS 2 covers database management, analytics, and related technologies.
Text Technologies covers text mining, search, and social software.
Strategic Messaging analyzes marketing and messaging strategy.
The Monash Report examines technology and public policy issues.
Software Memories recounts the history of the software industry.

User consulting

Building a short list? Refining your strategic plan? We can help.

Vendor advisory

We tell vendors what's happening -- and, more important, what they should do about it.

Monash Research highlights

Learn about white papers, webcasts, and blog highlights, by RSS or email.

Links
- Monash Research
- White Papers
Admin
- Log in