Solid-state memory
Discussion of how developments in solid-state memory will affect database management. Related subjects include:
- Data warehouse appliances
- (in The Monash Report) The future of diskless PCs
Teradata’s future product strategy
I think Teradata’s future product strategy is coming into focus. I’ll start by outlining some particular aspects, and then show how I think it all ties together.
Read more
Categories: Business intelligence, Data warehouse appliances, Data warehousing, Kickfire, MicroStrategy, Solid-state memory, Storage, Teradata | 5 Comments |
Teradata, Xkoto Gridscale (RIP), and active-active clustering
Having gotten a number of questions about Teradata’s acquisition of Xkoto, I leaned on Teradata for an update, and eventually connected with Scott Gnau. Takeaways included:
- Teradata is discontinuing Xkoto’s existing product Gridscale, which Scott characterized as being too OLTP-focused to be a good fit for Teradata. Teradata hopes and expects that existing Xkoto Gridscale customers won’t renew maintenance. (I’m not sure that they’ll even get the option to do so.)
- The point of Teradata’s technology + engineers acquisition of Xkoto is to enhance Teradata’s active-active or multi-active data warehousing capabilities, which it has had in some form for several years.
- In particular, Teradata wants to tie together different products in the Teradata product line. (Note: Those typically all run pretty much the same Teradata database management software, except insofar as they might be on different releases.)
- Scott rattled off all the plausible areas of enhancement, with multiple phrasings – performance, manageability, ease of use, tools, features, etc.
- Teradata plans to have one or two releases based on Xkoto technology in 2011.
Frankly, I’m disappointed at the struggles of clustering efforts such as Xkoto Gridscale or Continuent’s pre-Tungsten products, but if the DBMS vendors meet the same needs themselves, that’s OK too.
The logic behind active-active database implementations actually seems pretty compelling: Read more
Categories: Clustering, Continuent, Data warehousing, Solid-state memory, Teradata, Theory and architecture, Xkoto | 9 Comments |
Why analytic DBMS increasingly need to be storage-aware
In my quick reactions to the EMC/Greenplum announcement, I opined
I think that even software-only analytic DBMS vendors should design their systems in an increasingly storage-aware manner
promising to explain what I meant later on. So here goes. Read more
Categories: Data warehouse appliances, Data warehousing, Solid-state memory, Storage, Theory and architecture | 6 Comments |
Flash is coming, well …
I really, really wanted to title this post “Flash is coming in a flash.” That seems a little exaggerated — but only a little.
- Netezza now intends to come out with a flash-based appliance earlier than it originally expected.
- Indeed, Netezza has suspended — by which I mean “scrapped” — prior plans for a RAM-heavy disk-based appliance. It will use a RAM/flash combo instead.*
- Tim Vincent of IBM told me that customers seem ready to adopt solid-state memory. One interesting comment he made is that Flash isn’t really all that much more expensive than high-end storage area networks.
Uptake of solid-state memory (i.e. flash) for analytic database processing will probably stay pretty low in 2010, but in 2011 it should be a notable (b)leading-edge technology, and it should get mainstreamed pretty quickly after that. Read more
Categories: Data integration and middleware, Data warehousing, IBM and DB2, Memory-centric data management, Netezza, Solid-state memory, Theory and architecture | 4 Comments |
VoltDB finally launches
VoltDB is finally launching today. As is common for companies in sectors I write about, VoltDB — or just “Volt” — has discovered the virtues of embargoes that end 12:01 am. Let’s go straight to the technical highlights:
- VoltDB is based on the H-Store technology, which I wrote about in February, 2009. Most of what I said about H-Store then applies to VoltDB today.
- VoltDB is a no-apologies ACID relational DBMS, which runs entirely in RAM.
- VoltDB has rather limited SQL. (One example: VoltDB can’t do SUMs in SQL.) However, VoltDB guy Tim Callaghan (Mark Callaghan’s lesser-known but nonetheless smart brother) asserts that if you code up the missing functionality, it’s almost as fast as if it were present in the DBMS to begin with, because there’s no added I/O from the handoff between the DBMS and the procedural code. (The data’s in RAM one way or the other.)
- VoltDB’s Big Conceptual Performance Story is that it does away with most locks, latches, logs, etc., and also most context switching.
- In particular, you’re supposed to partition your data and architect your application so that most transactions execute on a single core. When you can do that, you get VoltDB’s performance benefits. To the extent you can’t, you’re in two-phase-commit performance land. (More precisely, you’re doing 2PC for multi-core writes, which is surely a major reason that multi-core reads are a lot faster in VoltDB than multi-core writes.)
- VoltDB has a little less than one DBMS thread per core. When the data partitioning works as it should, you execute a complete transaction in that single thread. Poof. No context switching.
- A transaction in VoltDB is a Java stored procedure. (The early idea of Ruby on Rails in lieu of the Java/SQL combo didn’t hold up performance-wise.)
- Solid-state memory is not a viable alternative to RAM for VoltDB. Too slow.
- Instead, VoltDB lets you snapshot data to disk at tunable intervals. “Continuous” is one of the options, wherein a new snapshot starts being made as soon as the last one completes.
- In addition, VoltDB will also spool a kind of transaction log to the target of your choice. (Obvious choice: An analytic DBMS such as Vertica, but there’s no such connectivity partnership actually in place at this time.)
The Clustrix story
After my recent post, the Clustrix guys raised their hands and briefed me. Takeaways included: Read more
Categories: Application areas, Clustrix, Emulation, transparency, portability, Games and virtual worlds, MySQL, NoSQL, OLTP, Parallelization, Solid-state memory | 8 Comments |
Clustrix may be doing something interesting
Clustrix launched without briefing me or, at least so far as I can tell, anybody else who knows much about database technology. But Clustrix did post a somewhat crunchy, no-registration-required, white paper. Based on that, I get the impression:
- Clustrix is making OLTP DBMS.
- The core problem Clustrix tries to solve is scale-out, without necessarily giving up SQL. (I couldn’t immediately tell whether Clustrix supports NoSQL-style key-value interfaces enthusiastically, grudgingly, or not at all.)
- Unlike Akiban or VoltDB, Clustrix makes database appliances. The Clustrix software seems to assume a Clustrix appliance.
- A key feature of Clustrix’s database appliances is that they rely on solid-state memory. I’m guessing that Clustrix appliances don’t even have disks, or that if they do the disks store some software or something, not actual data. (As previously noted, I agree with Oracle in thinking that much of the progress in database technology this decade will come from proper design for solid-state memory.)
- Clustrix talks of things that sound like compiled queries and attempts to avoid locks. However, it doesn’t sound as extreme in these regards as VoltDB.
- Clustrix also talks of things that sound like consistent hashing.
- The brand name “Sierra” also shows up along with the brand name “Clustrix.”
Categories: Clustrix, Data warehouse appliances, DBMS product categories, NoSQL, Parallelization, Solid-state memory, Storage, Theory and architecture | 2 Comments |
Thoughts on IBM’s anti-Oracle announcements
IBM is putting out a couple of press releases today that are obviously directed competitively at Oracle/Sun, and more specifically at Oracle’s Exadata-centric strategy. I haven’t been briefed, so I just have those to go on.
On the whole, the releases look pretty lame. Highlights seem to include:
- Maybe a claim of enhanced data compression.
- Otherwise, no obvious new technology except product packaging and bundling.
- Aggressive plans to throw capital at the Sun channel to convert it to selling IBM gear. (A figure of $1/2 billion is mentioned, for financing.
Disappointingly, IBM shows a lot of confusion between:
- Text data
- Machine-generated data such as that from sensors
While both highly important, those are very different things. IBM has not in the past shown much impressive technology in either of those two areas, and based on these releases, I presume that trend is continuing.
Edits:
I see from press coverage that at least one new IBM model has some Fusion I/O solid-state memory boards in it. Makes sense.
A Twitter hashtag has a number of observations from the event. Not much substance I could detect except various kind of Oracle bashing.
Categories: Database compression, Exadata, IBM and DB2, Oracle, Solid-state memory | 14 Comments |
Notes on the evolution of OLTP database management systems
The past few years have seen a spate of startups in the analytic DBMS business. Netezza, Vertica, Greenplum, Aster Data and others are all reasonably prosperous, alongside older specialty product vendors Teradata and Sybase (the Sybase IQ part). OLTP (OnLine Transaction Processing) and general purpose DBMS startups, however, have not yet done as well, with such success as there has been (MySQL, Intersystems Cache’, solidDB’s exit, etc.) generally accruing to products that originated in the 20th Century.
Nonetheless, OLTP/general-purpose data management startup activity has recently picked up, targeting what I see as some very real opportunities and needs. So as a jumping-off point for further writing, I thought it might be interesting to collect a few observations about the market in one place. These include:
- Big-brand OLTP/general-purpose DBMS have more “stickiness” than analytic DBMS.
- By number, most of an enterprise’s OLTP/general-purpose databases are low-volume and low-value.
- Most interesting new OLTP/general-purpose data management products are either MySQL-based or NoSQL.
- It’s not yet clear whether MySQL will prevail over MySQL forks, or vice-versa, or whether they will co-exist.
- The era of silicon-centric relational DBMS is coming.
- The emphasis on scale-out and reducing the cost of joins spans the NoSQL and SQL-based worlds.
- Users’ instance on “free” could be a major problem for OLTP DBMS innovation.
I shall explain. Read more
Open issues in database and analytic technology
The last part of my New England Database Summit talk was on open issues in database and analytic technology. This was closely intertwined with the previous section, and also relied on a lot that I’ve posted here. So I’ll just put up a few notes on that part, with lots of linkage to prior discussion of the same points. Read more