Storage
Analysis of storage technologies, especially in the context of database management.
Teradata’s future product strategy
I think Teradata’s future product strategy is coming into focus. I’ll start by outlining some particular aspects, and then show how I think it all ties together.
Read more
Categories: Business intelligence, Data warehouse appliances, Data warehousing, Kickfire, MicroStrategy, Solid-state memory, Storage, Teradata | 5 Comments |
Teradata, Xkoto Gridscale (RIP), and active-active clustering
Having gotten a number of questions about Teradata’s acquisition of Xkoto, I leaned on Teradata for an update, and eventually connected with Scott Gnau. Takeaways included:
- Teradata is discontinuing Xkoto’s existing product Gridscale, which Scott characterized as being too OLTP-focused to be a good fit for Teradata. Teradata hopes and expects that existing Xkoto Gridscale customers won’t renew maintenance. (I’m not sure that they’ll even get the option to do so.)
- The point of Teradata’s technology + engineers acquisition of Xkoto is to enhance Teradata’s active-active or multi-active data warehousing capabilities, which it has had in some form for several years.
- In particular, Teradata wants to tie together different products in the Teradata product line. (Note: Those typically all run pretty much the same Teradata database management software, except insofar as they might be on different releases.)
- Scott rattled off all the plausible areas of enhancement, with multiple phrasings – performance, manageability, ease of use, tools, features, etc.
- Teradata plans to have one or two releases based on Xkoto technology in 2011.
Frankly, I’m disappointed at the struggles of clustering efforts such as Xkoto Gridscale or Continuent’s pre-Tungsten products, but if the DBMS vendors meet the same needs themselves, that’s OK too.
The logic behind active-active database implementations actually seems pretty compelling: Read more
Categories: Clustering, Continuent, Data warehousing, Solid-state memory, Teradata, Theory and architecture, Xkoto | 9 Comments |
Why analytic DBMS increasingly need to be storage-aware
In my quick reactions to the EMC/Greenplum announcement, I opined
I think that even software-only analytic DBMS vendors should design their systems in an increasingly storage-aware manner
promising to explain what I meant later on. So here goes. Read more
Categories: Data warehouse appliances, Data warehousing, Solid-state memory, Storage, Theory and architecture | 6 Comments |
EMC is buying Greenplum
EMC is buying Greenplum. Most of the press release is a general recapitulation of Greenplum’s marketing messages, the main exceptions being (emphasis mine):
The acquisition of Greenplum will be an all-cash transaction and is expected to be completed in the third quarter of 2010, subject to customary closing conditions and regulatory approvals. The acquisition is not expected to have a material impact to EMC GAAP and non-GAAP EPS for the full 2010 fiscal year. Upon close, Bill Cook will lead the new data computing product division and report to Pat Gelsinger. EMC will continue to offer Greenplum’s full product portfolio to customers and plans to deliver new EMC Proven reference architectures as well as an integrated hardware and software offering designed to improve performance and drive down implementation costs.
Greenplum is one of my biggest vendor clients, and EMC is just becoming one, but of course neither side gave me a heads-up before the deal happened, nor have I yet been briefed subsequently. With those disclaimers out of the way, some of my early thoughts include:
- I wish my clients would never buy each other, but it’s inevitable.
- I don’t think anybody evaluating Greenplum should be much influenced by this deal one way or the other. (Whether they will be is of course a different matter.)
- EMC tends to run its bigger software acquisitions in a fairly hands-off manner. There’s no particular FUD (Fear/Uncertainty/Doubt) reason why this deal should stop anybody from buying Greenplum software.
- I also don’t think adding a rich parent adds much of a reason to buy from Greenplum. But if you’re the type who’s nervous about smaller vendors — well, Greenplum now isn’t so small.
- Greenplum Chorus could, in principle, work with non-Greenplum DBMS. That possibility suddenly looks a lot more realistic.
- The list of analytic DBMS vendors with an appliance orientation is pretty impressive, including:
- Oracle, with Exadata
- Microsoft, partially
- Teradata
- Netezza
- Now EMC/Greenplum, at least partially
- Weaker players such as:
- The ailing Kickfire, which a client (not Kickfire itself) tells me is being shopped around
- The reeling HP Neoview
- XtremeData, but I’m still waiting to hear of XtremeData’s first real sale
- Greenplum is something of a specialist in large databases. EMC has to love that.
- Greenplum’s weakness is concurrency.
- Greenplum’s “polymorphic storage” is a good fit for a storage vendor with appliance-y ideas.
- And finally — I think that even software-only analytic DBMS vendors should design their systems in an increasingly storage-aware manner, and have been advising my vendor clients of same. I’ll blog that line of reasoning separately when I get a chance, and edit in a link here after I do.
Related links
- Here’s the promised post as to why analytic DBMS need to be ever more storage-aware.
- Dave Kellogg crunched the EMC/Greenplum numbers, coming up with an estimated valuation range of $300-400 million, the high end of which is rumored to be correct.
- Merv Adrian suggests the big EMC/Greenplum loser is ParAccel, a viewpoint which presumably presupposes that the EMC/ParAccel partnership was significant in the first place.
- I talked with Ben Werther and posted more about Greenplum and EMC.
Categories: Data warehouse appliances, EMC, Greenplum, Storage | 13 Comments |
Flash is coming, well …
I really, really wanted to title this post “Flash is coming in a flash.” That seems a little exaggerated — but only a little.
- Netezza now intends to come out with a flash-based appliance earlier than it originally expected.
- Indeed, Netezza has suspended — by which I mean “scrapped” — prior plans for a RAM-heavy disk-based appliance. It will use a RAM/flash combo instead.*
- Tim Vincent of IBM told me that customers seem ready to adopt solid-state memory. One interesting comment he made is that Flash isn’t really all that much more expensive than high-end storage area networks.
Uptake of solid-state memory (i.e. flash) for analytic database processing will probably stay pretty low in 2010, but in 2011 it should be a notable (b)leading-edge technology, and it should get mainstreamed pretty quickly after that. Read more
Categories: Data integration and middleware, Data warehousing, IBM and DB2, Memory-centric data management, Netezza, Solid-state memory, Theory and architecture | 4 Comments |
VoltDB finally launches
VoltDB is finally launching today. As is common for companies in sectors I write about, VoltDB — or just “Volt” — has discovered the virtues of embargoes that end at 12:01 am. Let’s go straight to the technical highlights:
- VoltDB is based on the H-Store technology, which I wrote about in February, 2009. Most of what I said about H-Store then applies to VoltDB today.
- VoltDB is a no-apologies ACID relational DBMS, which runs entirely in RAM.
- VoltDB has rather limited SQL. (One example: VoltDB can’t do SUMs in SQL.) However, VoltDB guy Tim Callaghan (Mark Callaghan’s lesser-known but nonetheless smart brother) asserts that if you code up the missing functionality, it’s almost as fast as if it were present in the DBMS to begin with, because there’s no added I/O from the handoff between the DBMS and the procedural code. (The data’s in RAM one way or the other.)
- VoltDB’s Big Conceptual Performance Story is that it does away with most locks, latches, logs, etc., and also most context switching.
- In particular, you’re supposed to partition your data and architect your application so that most transactions execute on a single core. When you can do that, you get VoltDB’s performance benefits. To the extent you can’t, you’re in two-phase-commit performance land. (More precisely, you’re doing 2PC for multi-core writes, which is surely a major reason that multi-core reads are a lot faster in VoltDB than multi-core writes.)
- VoltDB has a little less than one DBMS thread per core. When the data partitioning works as it should, you execute a complete transaction in that single thread. Poof. No context switching.
- A transaction in VoltDB is a Java stored procedure. (The early idea of Ruby on Rails in lieu of the Java/SQL combo didn’t hold up performance-wise.)
- Solid-state memory is not a viable alternative to RAM for VoltDB. Too slow.
- Instead, VoltDB lets you snapshot data to disk at tunable intervals. “Continuous” is one of the options, wherein a new snapshot starts being made as soon as the last one completes.
- In addition, VoltDB will also spool a kind of transaction log to the target of your choice. (Obvious choice: An analytic DBMS such as Vertica, but there’s no such connectivity partnership actually in place at this time.)
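The single-partition fast path described in the bullets above can be sketched in a few lines. This is a toy illustration of the general idea — one worker owns each partition, so a transaction that touches only one partition runs serially with no locks or latches — and not VoltDB’s actual API; the `Partition` and `Router` names and the hash routing are mine.

```python
# Illustrative sketch of single-partition transaction routing, in the
# spirit of VoltDB's design (not VoltDB's actual API). Each partition is
# owned by exactly one worker, so a transaction confined to that
# partition executes serially -- no locks, latches, or context switches.

class Partition:
    def __init__(self):
        self.rows = {}  # key -> value; private to this partition's worker

    def execute(self, txn):
        # Serial execution: nothing else ever touches self.rows.
        return txn(self.rows)

class Router:
    def __init__(self, n_partitions):
        self.partitions = [Partition() for _ in range(n_partitions)]

    def run_single_partition(self, key, txn):
        # Route by hash of the partitioning key; the whole transaction
        # runs inside one partition. (Python's string hash is stable
        # within a process, which is all this sketch needs.)
        p = self.partitions[hash(key) % len(self.partitions)]
        return p.execute(txn)

router = Router(4)
router.run_single_partition("acct:1", lambda rows: rows.update({"acct:1": 100}))
balance = router.run_single_partition("acct:1", lambda rows: rows["acct:1"])
```

A transaction that spans partitions can’t take this path, which is where the two-phase-commit penalty in the bullets above comes from.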
The Clustrix story
After my recent post, the Clustrix guys raised their hands and briefed me. Takeaways included: Read more
Categories: Application areas, Clustrix, Emulation, transparency, portability, Games and virtual worlds, MySQL, NoSQL, OLTP, Parallelization, Solid-state memory | 8 Comments |
Revisiting disk vibration as a data warehouse performance problem
Last April, I wrote about the problems disk vibration can cause for data warehouse performance. Possible performance hits exceeded 10X, wild as that sounds.
Now Slashdot and ZDnet have weighed in, although for the most part they suggest only 50-100% performance hits. Read more
Categories: Data warehousing, Storage | 3 Comments |
Clustrix may be doing something interesting
Clustrix launched without briefing me or, at least so far as I can tell, anybody else who knows much about database technology. But Clustrix did post a somewhat crunchy, no-registration-required, white paper. Based on that, I get the impression:
- Clustrix is making an OLTP DBMS.
- The core problem Clustrix tries to solve is scale-out, without necessarily giving up SQL. (I couldn’t immediately tell whether Clustrix supports NoSQL-style key-value interfaces enthusiastically, grudgingly, or not at all.)
- Unlike Akiban or VoltDB, Clustrix makes database appliances. The Clustrix software seems to assume a Clustrix appliance.
- A key feature of Clustrix’s database appliances is that they rely on solid-state memory. I’m guessing that Clustrix appliances don’t even have disks, or that if they do the disks store some software or something, not actual data. (As previously noted, I agree with Oracle in thinking that much of the progress in database technology this decade will come from proper design for solid-state memory.)
- Clustrix talks of things that sound like compiled queries and attempts to avoid locks. However, it doesn’t sound as extreme in these regards as VoltDB.
- Clustrix also talks of things that sound like consistent hashing.
- The brand name “Sierra” also shows up along with the brand name “Clustrix.”
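For readers unfamiliar with the term in that bullet list: consistent hashing places keys and nodes on a shared hash ring, so that adding or removing a node remaps only a fraction of the keys. A minimal sketch, assuming virtual nodes for balance (this is the generic technique, not anything from Clustrix’s white paper; all names are illustrative):

```python
# Generic consistent-hashing sketch (illustrative; not Clustrix's actual
# implementation). Keys and nodes hash to points on a ring; each key is
# owned by the next node point clockwise. Virtual nodes smooth out the
# distribution across physical nodes.

import bisect
import hashlib

def _point(s):
    # Stable hash to a point on the ring.
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

class HashRing:
    def __init__(self, nodes, vnodes=64):
        self.ring = []  # sorted list of (point, node)
        for node in nodes:
            for i in range(vnodes):
                self.ring.append((_point(f"{node}#{i}"), node))
        self.ring.sort()

    def owner(self, key):
        # First node point at or after the key's point, wrapping around.
        i = bisect.bisect(self.ring, (_point(key),)) % len(self.ring)
        return self.ring[i][1]

ring = HashRing(["node-a", "node-b", "node-c"])
owner = ring.owner("row:42")
```

The payoff is incremental rebalancing: growing the ring from three nodes to four moves only the keys whose ring segments the new node’s points intercept, rather than rehashing everything.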
Categories: Clustrix, Data warehouse appliances, DBMS product categories, NoSQL, Parallelization, Solid-state memory, Storage, Theory and architecture | 2 Comments |
Thoughts on IBM’s anti-Oracle announcements
IBM is putting out a couple of press releases today that are obviously directed competitively at Oracle/Sun, and more specifically at Oracle’s Exadata-centric strategy. I haven’t been briefed, so I just have those to go on.
On the whole, the releases look pretty lame. Highlights seem to include:
- Maybe a claim of enhanced data compression.
- Otherwise, no obvious new technology except product packaging and bundling.
- Aggressive plans to throw capital at the Sun channel to convert it to selling IBM gear. (A figure of $1/2 billion is mentioned for financing.)
Disappointingly, IBM shows a lot of confusion between:
- Text data
- Machine-generated data such as that from sensors
While both are highly important, those are very different things. IBM has not in the past shown much impressive technology in either of those two areas, and based on these releases, I presume that trend is continuing.
Edits:
I see from press coverage that at least one new IBM model has some Fusion-io solid-state memory boards in it. Makes sense.
A Twitter hashtag has a number of observations from the event. Not much substance that I could detect, except various kinds of Oracle bashing.
Categories: Database compression, Exadata, IBM and DB2, Oracle, Solid-state memory | 14 Comments |