Aster Data

Analysis of data warehouse DBMS vendor Aster Data. Related subjects include:

February 5, 2011

Comments on the Gartner 2010/2011 Data Warehouse Database Management Systems Magic Quadrant

Edit: Comments on the February, 2012 Gartner Magic Quadrant for Data Warehouse Database Management Systems — and on the companies reviewed in it — are now up.

The Gartner 2010 Data Warehouse Database Management Systems Magic Quadrant is out. I shall now comment, just as I did to varying degrees on the 2009, 2008, 2007, and 2006 Gartner Data Warehouse Database Management System Magic Quadrants.

Note: Links to Gartner Magic Quadrants tend to be unstable. Please alert me if any problems arise; I’ll edit accordingly.

In my comments on the 2008 Gartner Data Warehouse Database Management Systems Magic Quadrant, I observed that Gartner’s “completeness of vision” scores were generally pretty reasonable, but their “ability to execute” rankings were somewhat bizarre; the same remains true this year. For example, Gartner ranks Ingres higher by that metric than Vertica, Aster Data, ParAccel, or Infobright. Yet each of those companies is growing nicely and delivering products that meet serious cutting-edge analytic DBMS needs, neither of which has been true of Ingres since about 1987.

Categories: 1010data, Actian and Ingres, Analytic technologies, Aster Data, Benchmarks and POCs, Columnar database management, Data warehouse appliances, Data warehousing, Database compression, EMC, Exadata, Greenplum, illuminate Solutions, Infobright, Microsoft and SQL*Server, Netezza, Open source, ParAccel, Pricing, SAND Technology, Storage, Sybase, Teradata, Vertica Systems, Workload management

23 Comments

January 24, 2011

Choices in analytic computing system design

When I posted a long list of architectural options for analytic DBMS, I left a couple of IOUs in for missing parts. One was in the area of what is sometimes called advanced-analytics functionality, which roughly speaking means aspects of analytic database management systems that are not directly related to conventional* SQL queries.

*Main examples of “conventional” = filtering, simple aggregations.

The point of such functionality is generally twofold. First, it helps you execute analytic algorithms with high performance, by reducing data movement and/or executing the analytics in parallel. Second, it helps you create and execute sophisticated analytic processes with (relatively) little effort.
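To make the performance point concrete, here is a minimal sketch, not any vendor's implementation. It assumes a toy "cluster" in which each inner list holds the rows stored on one node; the names `PARTITIONS`, `partial_stats`, and `parallel_mean` are hypothetical. Each node reduces its own rows to a small (sum, count) pair in parallel, so only those pairs, rather than the raw rows, need to move to the coordinator.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: each inner list stands for the rows stored on one node.
PARTITIONS = [
    [3.0, 5.0, 7.0],
    [2.0, 4.0],
    [10.0, 1.0, 1.0, 3.0],
]


def partial_stats(rows):
    """Runs "on the node": reduce the local rows to a tiny (sum, count) pair."""
    return (sum(rows), len(rows))


def parallel_mean(partitions):
    """Combine per-node partial results into a global mean."""
    # Threads merely simulate nodes working at the same time.
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        partials = list(pool.map(partial_stats, partitions))
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count, partials


mean, partials = parallel_mean(PARTITIONS)
print(mean, partials)
```

Only three small tuples cross the simulated network here, however many rows each node holds. That is the data-movement saving in miniature.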

For now, I’m going to refer to an analytic RDBMS that has been extended by advanced-analytics functionality as an analytic computing system, rather than as some kind of “platform,” although I suspect the latter term is more likely to wind up winning. So far, there have been five major categories of subsystem or add-on module that contribute to making an analytic DBMS a more fully-fledged analytic computing system:

SQL extensions. Examples include SQL-2003 analytics (notably windowing), or vendor-specific temporal functionality.
A framework for UDFs (User-Defined Functions) to further extend SQL. At its core, a relational DBMS is a big SQL interpreter. SQL, while powerful, only does a limited number of things. User-Defined Functions are new predicates in the SQL language that do additional things.
An execution engine for analytic processes that is less coupled to the SQL engine than a pure UDF framework might be. The two main approaches are MapReduce (e.g. Aster Data) and general C++ libraries (Netezza, ParAccel).
Libraries of pre-built analytic processes. Commonly included are statistics, (other) machine learning, general linear algebra, and Monte Carlo analysis. Some of these functions are fully parallelized (perhaps tens per vendor). Others just play nicely with the vendor’s execution framework, in that a separate copy can be run on each node (up to thousands per vendor, for those who bring in open source statistics libraries).
Development tools such as integrated development environments (IDEs). Aster keeps trying to convince me that having built a nice Eclipse IDE is a major competitive differentiation.
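As an illustration of the UDF category in the list above, here is a minimal sketch that uses Python's built-in `sqlite3` module as a stand-in for an analytic DBMS. SQLite is a single-node engine, so this shows only the extension mechanism, not parallel execution or any vendor's framework. The function `log_odds`, the aggregate `geo_mean`, and the `scores` table are hypothetical names chosen for the example.

```python
import math
import sqlite3


def log_odds(p):
    """Scalar UDF: logit of a probability; NULL for out-of-range input."""
    if p is None or p <= 0.0 or p >= 1.0:
        return None
    return math.log(p / (1.0 - p))


class GeometricMean:
    """Aggregate UDF: geometric mean of the positive values in a group."""

    def __init__(self):
        self.log_sum = 0.0
        self.n = 0

    def step(self, value):
        # Called once per row; non-positive and NULL values are skipped.
        if value is not None and value > 0:
            self.log_sum += math.log(value)
            self.n += 1

    def finalize(self):
        return math.exp(self.log_sum / self.n) if self.n else None


conn = sqlite3.connect(":memory:")
# Register the Python callables so that SQL statements can call them by name.
conn.create_function("log_odds", 1, log_odds)
conn.create_aggregate("geo_mean", 1, GeometricMean)

conn.execute("CREATE TABLE scores (segment TEXT, p REAL)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("a", 0.5), ("a", 0.8), ("b", 0.2), ("b", 0.2)],
)

rows = conn.execute(
    "SELECT segment, geo_mean(p) FROM scores "
    "WHERE log_odds(p) IS NOT NULL "
    "GROUP BY segment ORDER BY segment"
).fetchall()
print(rows)
```

Once registered, the functions are called from SQL like built-ins, which is the sense in which a UDF framework extends the language. A less tightly coupled execution engine, such as a MapReduce one, would instead run user code outside the SQL statement.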

Categories: Aster Data, MapReduce, Netezza, ParAccel, Parallelization, Predictive modeling and advanced analytics, Workload management

8 Comments

January 19, 2011

Sound bites on HP/Microsoft and Neoview

HP and Microsoft put out a press release. Three new appliances are being announced, and we’re being reminded of at least one past announcement. I wasn’t briefed, and wouldn’t want to comment on, say, price/performance or feature particulars. That said:

HP Neoview seems pretty dead.
I haven’t heard a single favorable reference to HP Neoview since I remarked in March, 2010 that “HP Neoview is reeling.”
A reporter asked me “What went wrong?” Well, almost any new analytic DBMS/appliance product will compete mainly on two things in its early days — price/performance (or absolute performance), and just how (im)mature it initially is. (Aster Data may be the only prominent exception to that rule.) Presumably, HP Neoview did badly by those metrics.
HP Neoview was widely conjectured to be a pet project of ousted former HP CEO Mark Hurd.
Nobody tells me about competing with Microsoft SQL Server 2008 Parallel Data Warehouse (i.e. Madison/DATAllegro) either. Thus, in particular, I haven’t heard any reason to believe there’s anything good about the technology, especially now that the ever-upbeat Stuart Frost has left Microsoft. I’m conjecturing that Parallel Data Warehouse is focused heavily on the existing Microsoft installed base.
Speaking of Aster — even under NDA, they won’t tell me or give me any useful hints as to who their undisclosed strategic investor is. Well, HP has a long history of investing in sometimes-competing DBMS vendors (back to Oracle and Informix), and a good reason to keep quiet (reluctance to admit the end of Neoview). Hmm …
The consolidation appliance in the HP/Microsoft announcement is a clear response to Oracle’s Exadata strategy, or (which is probably more accurate) to the same market opportunity Oracle identified.
I couldn’t quite figure out whether the cheap data warehouse appliance includes Microsoft PowerPivot support, but it would make sense if it did.

Categories: Aster Data, Data warehouse appliances, Data warehousing, HP and Neoview, Microsoft and SQL*Server

3 Comments

January 18, 2011

Architectural options for analytic database management systems

Mike Stonebraker recently kicked off some discussion about desirable architectural features of a columnar analytic DBMS. Let’s expand the conversation to cover desirable architectural characteristics of analytic DBMS in general.

Categories: Analytic technologies, Aster Data, Benchmarks and POCs, Columnar database management, Data pipelining, Data warehousing, Database compression, Exadata, Michael Stonebraker, Oracle, Solid-state memory, Theory and architecture

5 Comments

January 12, 2011

Mike Stonebraker on “real column stores”

Mike Stonebraker has a post up on Vertica’s blog trying to differentiate “real” from “pretend” column stores. (Edit: That post seems to have come back down, but as of 1/19 it can be found in Google Cache.) In essence, Mike argues that the One Right Way to design a column store is Vertica’s, a position that Daniel Abadi used to share but has since retreated from.

There are some good things about that post, and some not-so-good. The worst paragraph is probably

Several row-store vendors (including Oracle, Greenplum and Aster Data) now claim to be selling a column store. Obviously, this would require a complete rewrite of a DBMS to move from Figure 1 to Figure 2. Hence, none of the “pretenders” have actually done this. Instead all have implemented some aspects of column stores, and then claim to be the real thing. This blog defines what the “real enchilada” looks like, and how to tell it from the pretenders.

which I question on two levels.

Categories: Aster Data, Columnar database management, Database compression, Michael Stonebraker, Sybase, Theory and architecture, Vertica Systems

24 Comments

October 22, 2010

Notes and links October 22, 2010

A number of recent posts have had good comments. This time, I won’t call them out individually.

Evidently Mike Olson of Cloudera is still telling the machine-generated data story, exactly as he should be. The Information Arbitrage/IA Ventures folks said something similar, focusing specifically on “sensor data” …

… and, even better, went on to say: Read more

Categories: Analytic technologies, Aster Data, Cloudera, eBay, Greenplum, Hadoop, IBM and DB2, In-memory DBMS, Market share and customer counts, Netezza, Open source, Oracle, ParAccel, Petabyte-scale data management, SAS Institute, Surveillance and privacy, Teradata, VoltDB and H-Store

1 Comment

October 18, 2010

More notes on Membase and memcached

As a companion to my post about Membase last week, the company has graciously allowed me to post a rather detailed Membase slide deck. (It even has pricing.) Also, I left one point out.

Membase announced a Cloudera partnership. I couldn’t detect anything technically exciting about that, but it serves to highlight what I do find to be an interesting usage trend. A couple of big Web players (AOL and ShareThis) are using Hadoop to crunch data and derive customer profile data, then feed that back into Membase. Why Membase? Because it can serve up the profile in a millisecond, as part of a bigger 40-millisecond-latency request.

And why Hadoop, rather than Aster Data nCluster, which ShareThis also uses? Umm, I didn’t ask.

When I mentioned this to Colin Mahony, he said Vertica had similar stories. However, I don’t recall whether they were about Membase or just memcached, and he hasn’t had a chance to get back to me with clarification. (Edit: As per Colin’s comment below, it’s both.)

Categories: Aster Data, Cache, Cloudera, Couchbase, Hadoop, memcached, Memory-centric data management, NoSQL, Pricing, Specific users, Vertica Systems, Web analytics

7 Comments

October 10, 2010

Partnering with Cloudera

After I criticized the marketing of the Aster/Cloudera partnership, my clients at Aster Data and Cloudera ganged up on me and tried to persuade me I was wrong. Be that as it may, that conversation and others were helpful to me in understanding the core thesis: Read more