solidDB
Analysis of hybrid memory-centric DBMS solidDB and its developer Solid Technology. Related subjects include:
- IBM, which acquired Solid Technology and hence solidDB
- Memory-centric data management
Memory-centric data management whitepaper
I have finally finished and uploaded the long-awaited white paper on memory-centric data management.
This is the project for which I origially coined the term “memory-centric data management,” after realizing that the prevalent “in-memory DBMS” creates all sorts of confusion about how and whether data persists on disk. The white paper clarifies and updates points I have been making about memory-centric data management since last summer. Sponsors included:
- Applix, vendors of in-memory/memory-centric MOLAP tool TM1
- Progress Software, vendors of ObjectStore, an OODBMS that has more impressive references in-memory or otherwise memory-centric than it does in classical disk-based configurations, and also of the Apama stream processing products
- SAP, vendors of the BI Accelerator functionality of SAP NetWeaver, or whatever tortured name they want to give it this month — basically, that’s a very cool in-memory columnar data mart technology
- Solid Information Technology, vendor of hybrid in-memory/disk-based OLTP RDBMS. Historically focused on the embedded systems market, especially telecom and networking, they’ve recently been in the news because of a deal with MySQL that is designed to extend their reach.
- Intel, makers of the processors used to run a lot of the other sponsors’ products (including all BI Accelerator installations to date).
If there’s one area in my research I’m not 100% satisfied with, it may be the question of where the true hardware bottlenecks to memory-centric data management lie (it’s obvious that the bottleneck to disk-centric data management is random disk access). Is it processor interconnect (around 1 GB/sec)? Is it processor-to-cache connections (around 5 GB/sec)? My prior pronouncements, the main body of the white paper, and the Intel Q&A appendix to the white paper may actually have slightly different spins on these points.
And by the way — the current hard limit on RAM/board isn’t 2^64 bytes, but a “mere” 2^40. But don’t worry; it will be up to 2^48 long before anybody actually puts 256 gigabytes under the control of a single processor.
Categories: Cognos, Companies and products, In-memory DBMS, Intel, Memory-centric data management, MOLAP, Open source, Progress, Apama, and DataDirect, SAP AG, solidDB | 2 Comments |
Solid/MySQL fit and positioning
I felt like writing a lot about the great potential fit between MySQL and Solid over the weekend, but Solid didn’t want me to do so. Now, however, I’m not in the mood, so I’ll just say that in OLTP, Solid’s technology is strong where MySQL’s is weak, and vice-versa. E.g., Solid is so proud of its zero-administration capabilities that, without MySQL, it doesn’t have much in the way of admin tools at all. Conversely, I think that many of those websites that crash all the time with MySQL errors would crash less with the Solid engine underneath. (Solid happens to be proud of its BLOB-handling capability, efficiency-wise.)
Neither outfit is good in data warehousing, or in text search, image search, etc. (Solid slings big files around, but it doesn’t peer closely inside them). But for OLTP of tabular or dumb media data, this looks like a great fit.
Whether anybody will care, however, is a different matter.
Lisa Vaas of eWeek offers a survey of the many MySQL engine options.
EDIT: Another Lisa Vaas article makes it clear that MySQL is planning to compete in data warehousing/OLAP as well.
Categories: Memory-centric data management, Mid-range, MySQL, OLTP, Open source, solidDB | 4 Comments |
More on Solid and MySQL?
In a stunningly self-defeating move, my friends at Solid have decided that anything about their already-leaked possible cooperation with MySQL is embargoed.
Indeed, they’ve emphasized to me multiple times that they do not wish me to write about it.
I shall honor their wishes. I hope they are pleased with the sophistication and insight of the coverage they receive from other sources.
Categories: Memory-centric data management, Mid-range, MySQL, OLTP, Open source, solidDB | 3 Comments |
MySQL gets the Solid engine
Solid and MySQL have struck a deal (and for some odd reason I had to find out about it from Slashdot and then here rather than from one one the companies). Apparently Solid will open source a version of its storage engine, to be used with the MySQL front-end.
Solid’s core technology is a lightweight, zero-administration OLTP RDBMS. And they really mean “zero-administration,” because as they like to point out, a typical deployment is embedded in a piece of telecom equipment that doesn’t even have a keyboard. Now, that doesn’t really mean the Solid engine would still be zero-administration in other applications, but sure aren’t talking about something as prickly as, say, Oracle.
That said, Solid’s technology has its limitations. It isn’t historically designed for the query load (volume or mix) of, say, an SAP installation. It certainly doesn’t have much in the way of data warehousing functionality. And it doesn’t have much in the way of administration tools itself (although presumably MySQL will fill that gap).
One very important aspect of the Solid technology is its hybrid memory-centric design. Much more on that soon. My white paper on memory-centric data management is finally close to publication, with Solid as a co-sponsor. At some point I’ll even do a webinar for them associated with the paper.
I don’t know whether that’s part of the MySQL relationship — it would be very cool if it were.
Categories: Memory-centric data management, Mid-range, MySQL, OLTP, Open source, solidDB | 2 Comments |
Defining and surveying “Memory-centric data management”
I’m writing more and more about memory-centric data management technology these days, including in my latest Computerworld column. You may be wondering what that term refers to. Well, I’ve basically renamed what are commonly called “in-memory DBMS,” for what I think is a very good reason: Most of the products in the category aren’t true DBMS, aren’t wholly in-memory, or both! Indeed, if you catch me in a grouchy mood I might argue that “in-memory DBMS” is actually a contradiction in terms.
I’ll give a quick summary of the vendors and products I am focusing on in this newly-named category, and it should be clearer what I mean:
- TimesTen (now owned by Oracle): TimesTen is the quintessentional “in-memory DBMS.” It’s a fairly full relational DBMS, but if you want to persist memory to disk it has to be handed off to a conventional DBMS. Historically, that has usually been MySQL or Oracle. TimesTen’s biggest market penetration has been in financial trading.
- Solid Information Technology‘s BoostEngine: Solid is a Finnish company (or was — it’s pretty American now) specializing in embedded DBMS sold mainly for telecommunication uses. Big OEM customers include several well-known telecom equipment manufacturers and HP (for OpenView). “Embedded” often means no DBA, no monitor, no keyboard — they box manufacturer installs it and there it stays for the life of the product. Solid has to offer strong replication capabilities, since its products are often used in highly distributed (e.g., multiblade, multibox) environments. So it’s taken the next step and exploited the replication by allowing customers to use some instances of the product disklessly.
- Event-stream products from Streambase and Progress: The canonical application for event-stream products is automating financial trading decisions based on the flow of market information. Mike Stonebraker, the brains behind Streambase, has recently popularized the idea; Progress bought Apama, who actually have been in the business longer. These applications require even more speed than the financial trading apps that TimesTen handles, and they discard most of the information they look at. In-memory is the only way to go.
- Progress’s ObjectStore: ObjectStore comes from the company Object Design, which merged into Excelon, which was acquired by Progress. It’s really a toolkit for building DBMS and similar systems, which is why it’s at various times been marketed as an OODBMS and an XML DBMS, without a lot of success either way. But there have been a few sterling apps built in ObjectStore even so, including a key part of the Amazon bookstore Despite this limited market success, a significant fraction of Progress’s best engineering talent has moved over to the Real-Time Division to focus on ObjectStore and other memory-centric products. The memory-centric aspect of ObjectStore is this: ObjectStore’s big virtue is that it gets objects from disk to memory and vice-versa very efficiently, then distributes and caches them around a network as needed. This was originally invented for client/server processing, but works fine in a multi-server thin client setup as well. And object processing, of course, relies on a whole lot of pointers. And pointer-chasing is pretty much the worst way to deal with the disk speed barrier, unless you do it in main memory.
- Applix‘s TM1: Like many companies in the analytics area, Applix has had trouble deciding whether it sells applications, BI system software, or both. But in any case its core technology is TM1, a memory-centric MOLAP offering. Traditional MOLAP products reside on the horns of a nasty dilemma: They rely on precalculation to give good performance, but that causes ghastly database explosion. Applix gets out of this problem by doing no precalculation whatsoever, loading the data into main memory, and executing all queries on the fly.
- SAP’s BI Accelerator: SAP is building out an elaborate technology stack with NetWeaver, especially in the BI area. One important aspect is that the full data warehouse is logically broken (or copied) into a series of data marts called “InfoCubes.” BI Accelerator takes the logical next step, loading an entire InfoCube into main memory. Almost every query is executed via a full table scan, which would be insane on disk but makes perfect sense when the data is already in RAM.
So there you have it. There are a whole lot of technologies out there that manage data in RAM, in ways that would make little or no sense if disks were more intimately involved. Conventional DBMS also try to exploit RAM and limit disk access, via caching; but generally the data access methods they use in RAM are pretty similar to those they use when going out to disk. So memory-centric systems can have a major advantage.