Data warehouse appliances

Analysis of data warehouse appliances – i.e., of hardware/software bundles optimized for fast query and analysis of large volumes of (usually) relational data. Related subjects include:

Data warehousing
Parallelization
Netezza
DATAllegro
Teradata
Kickfire
(in The Monash Report) Computing appliances in multiple domains

June 24, 2011

Forthcoming Oracle appliances

Edit: I checked with Oracle, and it’s indeed TimesTen that’s supposed to be the basis of this new appliance, as per a comment below. That would be less cool, alas.

Oracle seems to have said on yesterday’s conference call that Oracle OpenWorld (first week in October) will feature appliances based on Tangosol and Hadoop. As I post this, the Seeking Alpha transcript of Oracle’s call is riddled with typos. Bolded comments below are by me. Read more

Categories: Data warehouse appliances, Hadoop, In-memory DBMS, MapReduce, Memory-centric data management, Object, Oracle

8 Comments

June 2, 2011

Why you would want an appliance — and when you wouldn’t

Data warehouse appliances are booming. But Hadoop appliances are a non-starter.

Data warehouse and other data management appliances are on the upswing. Oracle is pushing Exadata. Teradata* is going strong, and also recently bought Aster Data. IBM bought Netezza. Greenplum and Vertica were bought by EMC and HP respectively. All those moves are favorable for appliances.

*As far as I’m concerned, all Teradata hardware-included systems are appliances.

In essence, there are two kinds of reasons to prefer appliances over software-only offerings: Read more

Categories: Data warehouse appliances, Hadoop, Open source

10 Comments

May 14, 2011

Alternatives for Hadoop/MapReduce data storage and management

There’s been a flurry of announcements recently in the Hadoop world. Much of it has been concentrated on Hadoop data storage and management. This is understandable, since HDFS (Hadoop Distributed File System) is quite a young (i.e. immature) system, with much strengthening and Bottleneck Whack-A-Mole remaining in its future.

Known HDFS and Hadoop data storage and management issues include but are not limited to:

Hadoop is run by a master node, and specifically a namenode, that’s a single point of failure.
HDFS compression could be better.
HDFS likes to store three copies of everything, whereas many DBMS and file systems are satisfied with two.
Hive (the canonical way to do SQL joins and so on in Hadoop) is slow.
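To make the replication point above concrete, here is a minimal back-of-the-envelope sketch of raw storage needs at three copies versus two. The 100 TB data size and the function name are hypothetical, chosen only for illustration, and the calculation deliberately ignores compression, temp space, and other overhead.

```python
def raw_storage_tb(logical_tb: float, replication_factor: int) -> float:
    """Raw disk (in TB) needed to hold logical_tb of data when every
    block is stored replication_factor times. Ignores compression,
    temp space, and file system overhead."""
    if logical_tb < 0 or replication_factor < 1:
        raise ValueError("need logical_tb >= 0 and replication_factor >= 1")
    return logical_tb * replication_factor


# Hypothetical data set size, for illustration only.
logical_tb = 100.0

three_copies = raw_storage_tb(logical_tb, 3)  # HDFS-style three copies
two_copies = raw_storage_tb(logical_tb, 2)    # two-copy alternative

# Relative extra raw disk required by three copies versus two.
extra_fraction = (three_copies - two_copies) / two_copies

print(f"3 copies: {three_copies:.0f} TB raw")
print(f"2 copies: {two_copies:.0f} TB raw")
print(f"Extra raw disk for the third copy: {extra_fraction:.0%}")
```

Under these assumptions, the third copy costs half again as much raw disk as a two-copy scheme, before any difference in compression is taken into account.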

Different entities have different ideas about how such deficiencies should be addressed. Read more

Categories: Aster Data, Cassandra, Cloudera, Data warehouse appliances, DataStax, EMC, Greenplum, Hadapt, Hadoop, IBM and DB2, MapReduce, MongoDB, Netezza, Parallelization

22 Comments

May 3, 2011

Oracle and Exadata: Business and technical notes

Last Friday I stopped by Oracle for my first conversation since January, 2010, in this case for a chat with Andy Mendelsohn, Mark Townsend, Tim Shetler, and George Lumpkin, covering Exadata and the Oracle DBMS. Key points included: Read more

Categories: Analytic technologies, Cache, Clustering, Data warehouse appliances, Data warehousing, Emulation, transparency, portability, Exadata, MapReduce, Market share and customer counts, OLTP, Oracle, Parallelization, Predictive modeling and advanced analytics, Solid-state memory

9 Comments

April 21, 2011

In-memory, parallel, not-in-database SAS HPA does make sense after all

I talked with SAS about its new approach to parallel modeling. The two key points are:

SAS no longer plans to go as far with in-database modeling as it previously intended.
Rather, SAS plans to run in RAM on MPP DBMS appliances, exploiting MPI (Message Passing Interface).

The whole thing is called SAS HPA (High-Performance Analytics), in an obvious reference to HPC (High-Performance Computing). It will run initially on RAM-heavy appliances from Teradata and EMC Greenplum.

A lot of what’s going on here is that SAS found it annoyingly difficult to parallelize modeling within the framework of a massively parallel DBMS such as Teradata. Notes on that aspect include:

SAS wasn’t exploiting the capabilities of individual DBMS to their fullest; rather, it was looking for an approach that would work across multiple brands of DBMS. Thus, for example, the fact that Aster’s analytic platform architecture is more flexible or powerful than Teradata’s didn’t help much with making SAS run within the Aster nCluster database.
Notwithstanding everything else, SAS did make a certain set of modeling procedures run in-database.
SAS’ previous plans to run in-database modeling in Aster and/or Netezza DBMS may never come to fruition.

Categories: Aster Data, Data warehouse appliances, Data warehousing, EMC, Greenplum, Memory-centric data management, Netezza, Parallelization, Predictive modeling and advanced analytics, SAS Institute, Teradata, Workload management

7 Comments

April 17, 2011

Netezza TwinFin i-Class overview

I have long complained about difficulties in discussing Netezza’s TwinFin i-Class analytic platform. But I’m ready now, and in the grand sweep of the product’s history I’m not even all that late. The Netezza i-Class timing story goes something like this:

Netezza i-Class was first foreshadowed in February, 2010.
Netezza i-Class customer testing started in October, 2010 or so. Netezza i-Class evidently has been shipped to 4-5 partners and a single-digit number of end-user organizations, spread across some usual-suspect industries (financial services, telecom, and so on).
Netezza i-Class 1.0 general availability is still in the (near) future.

My advice to Netezza as to how it should describe TwinFin i-Class boils down to: Read more

Categories: Cloudera, Data warehouse appliances, Data warehousing, GIS and geospatial, Hadoop, IBM and DB2, MapReduce, Netezza, Parallelization, Predictive modeling and advanced analytics

5 Comments

April 16, 2011

Unpacking the EMC Greenplum Q1 sales disaster rumors

A well-connected tipster believes:

EMC Greenplum’s* revenue target for Q1 had been $35 million.
Actual EMC Greenplum revenue for Q1 was $3 million, or maybe it was $8 million.
EMC Greenplum had 75 sales teams trying to generate this revenue.

In the past I might have called Greenplum for clarification, but they’re not knocking themselves out to inform me these days, nor to inspire me with confidence in what they say. Read more

Categories: Data warehouse appliances, EMC, Greenplum

3 Comments

April 10, 2011

Teradata integrates in solid-state storage

For once, I think Teradata’s annual hardware refresh is pretty interesting, because of the integration of flash storage into its high-end “active enterprise data warehouse” product line. The essence of the announcement is:

Teradata is rolling out a new appliance,* the 6680, which combines hard-disk and solid-state drives, relying on Teradata Virtual Storage.
Teradata is also rolling out a hard-disk-based appliance,* the 6650, in a more routine annual refresh.

Categories: Data warehouse appliances, Pricing, Solid-state memory, Teradata

3 Comments

April 5, 2011

Comments on EMC Greenplum

I am annoyed with my former friends at Greenplum, who took umbrage at a brief sentence I wrote in October, namely “eBay has thrown out Greenplum”. Their reaction included:

EMC Greenplum no longer uses my services.
EMC Greenplum no longer briefs me.
EMC Greenplum reneged on a commitment to fund an effort in the area of privacy.

The last one really hurt, because in trusting them, I put in quite a bit of effort, and discussed their promise with quite a few other people.

8 Comments

March 4, 2011

Teradata, Aster Data, and Teradata/Aster

Teradata is acquiring Aster Data. Naturally, the deal is being presented with a Treaty of Tordesillas kind of positioning — Teradata does X, Aster Data does Y, and everybody looks forward to having X and Y in the same product portfolio. That said, my initial positioning and product strategy thoughts on the Teradata/Aster combination go something like this. Read more

Categories: Analytic technologies, Aster Data, Columnar database management, Data warehouse appliances, Data warehousing, Database compression, RDF and graphs, Specific users, Teradata

9 Comments

