Notes on the evolution of OLTP database management systems
The past few years have seen a spate of startups in the analytic DBMS business. Netezza, Vertica, Greenplum, Aster Data and others are all reasonably prosperous, alongside older specialty product vendors Teradata and Sybase (the Sybase IQ part). OLTP (OnLine Transaction Processing) and general-purpose DBMS startups, however, have not yet done as well, with such success as there has been (MySQL, InterSystems Caché, solidDB’s exit, etc.) generally accruing to products that originated in the 20th century.
Nonetheless, OLTP/general-purpose data management startup activity has recently picked up, targeting what I see as some very real opportunities and needs. So as a jumping-off point for further writing, I thought it might be interesting to collect a few observations about the market in one place. These include:
- Big-brand OLTP/general-purpose DBMS have more “stickiness” than analytic DBMS.
- By number, most of an enterprise’s OLTP/general-purpose databases are low-volume and low-value.
- Most interesting new OLTP/general-purpose data management products are either MySQL-based or NoSQL.
- It’s not yet clear whether MySQL will prevail over MySQL forks, or vice-versa, or whether they will co-exist.
- The era of silicon-centric relational DBMS is coming.
- The emphasis on scale-out and reducing the cost of joins spans the NoSQL and SQL-based worlds.
- Users’ insistence on “free” could be a major problem for OLTP DBMS innovation.
I shall explain.
Big-brand OLTP/general-purpose DBMS have more “stickiness” than analytic DBMS.
- OLTP applications are more complex than analytic ones, and hence more tightly wired into particular brands of DBMS. For example, third-party packaged OLTP applications are typically portable among only a few brands of DBMS. But third-party business intelligence tools, and the BI “applications” built in them, are more easily and widely portable.
- Specific technical observations such as “OLTP apps tend to use stored procedures, which are DBMS-specific” or “OLTP apps tend to have lots and lots of tables” serve to underscore the first point.
- An enterprise’s highest-value data is commonly the financial stuff handled by its core OLTP systems, so those are the last things they want to mess around with just to get some cost savings. Security, high availability, and so on are major considerations that can outweigh cost.
By number, most of an enterprise’s OLTP/general-purpose databases are low-volume and low-value. Indeed, “OLTP” is often a misnomer, which is why I tend to go with “general-purpose” or some similarly wishy-washy phrase instead.
- In theory, this is a ripe area for what I’ve called mid-range DBMS.
- The big brand vendors try hard to keep as many of those databases for themselves as they can. Enterprise-wide license pricing helps. Going forward, so will virtualization/consolidation strategies, such as Oracle’s Exadata-centric approach.
- A variety of mid-range DBMS alternatives beyond the big brands have technical merit, at least in some cases and configurations – MySQL, PostgreSQL, InterSystems Caché, and so on.
- The only such mid-range DBMS alternative with much large enterprise business momentum, however, appears to be MySQL.
“General-purpose” might be a better term than “OLTP” anyway.
- I don’t have a link, but it’s widely agreed that over half of the processing on an “OLTP” enterprise app is commonly reporting and so on.
- “Operational BI” is progressing by fits and starts, but it is progressing.
- Anything customer-facing — web-based, call center, or otherwise — is likely to include a heavy dose of “real-time” analytic optimization.
Most interesting new OLTP/general-purpose data management products are either MySQL-based or NoSQL.
- VoltDB is the main exception that jumps to mind.
- This isn’t true in the analytic DBMS area, where Netezza, Greenplum, Aster, Vertica and others started from PostgreSQL’s code, APIs, or both.
It’s not yet clear whether MySQL will prevail over MySQL forks, or vice-versa, or whether they will co-exist.
- MySQL is a limited product without all the third-party storage engines that are being developed.
- Oracle’s promise of MySQL good behavior has an expiration date.
- None of the MySQL front-end alternatives are remotely mature yet.
The era of silicon-centric relational DBMS is coming.
- I think “silicon” means “solid-state memory” as much as or more than it means “RAM,” but that’s not yet certain.
- What is pretty certain is that, thanks to Moore’s Law, some kind of silicon will increasingly replace disk.
- Oracle’s increasingly Flash-centric story is a challenge to everybody.
- RAM-centric VoltDB will launch fairly soon. (By the way, while VoltDB still has a lot in common with H-Store, they’re not exactly the same thing. And H-Store research is progressing too.)
- RethinkDB is being developed, focused directly on solid-state memory. Based on the sparse information available online, RethinkDB sounds somewhat like a dumbed-down H-Store.
- New disk-based vendors may never optimize their use of disk, instead targeting a solid-state future. (E.g., I think Akiban should and quite well might follow this path.)
The emphasis on scale-out and reducing the cost of joins spans the NoSQL and SQL-based worlds. We hear that from the NoSQL guys all the time. But I also just heard it from Akiban.
Users’ insistence on “free” could be a major problem for OLTP DBMS innovation. Vendors of new OLTP data management technologies often feel obligated to open source their products, notwithstanding the historical lack of revenue in the open source OLTP DBMS market. As just one of many examples, Nova Spivack wrote:
I have recently seen some new graph data storage products that may provide the levels of scale and performance needed, but pricing has not been determined yet. In short, storage and retrieval of semantic graph datasets is a big unsolved challenge that is holding back the entire industry. We need federated database systems that can handle hundreds of billions to trillions of triples under high load conditions, in the cloud, on commodity hardware and open source software. Only then will it be affordable to make semantic applications and services at Web-scale.
I hear similar things from other startups, who evidently believe they need and/or are entitled to enjoy sophisticated, high-performance, zero-cost, specialized database management technology.
Comments
8 Responses to “Notes on the evolution of OLTP database management systems”
Many DBMS products are free to use but they aren’t free to operate. Lots of people make a living keeping them running and the open-source products frequently require a bit more care than the closed-source ones. It would be interesting to read estimates on the point at which the open-source products have better price/performance. How big must a deployment be before the license cost from closed-source is much larger than the extra people cost from open-source?
VoltDB and TokuDB (http://www.tokutek.com) are two of my favorites in the OLTP space. I wish they would write more about their products. Thanks for the link to the new H-Store paper.
Nova’s comment likely refers to a new theoretical architecture for relational algebra implementations that is uniquely suited for massively parallelizing graph analytic databases. It will be difficult for an open source implementation to emerge because the core algorithms are buried in IP. There are at least two companies developing products based on it, one of which is a major database vendor.
The middle ground for cases like this is to go straight to a cloud offering for cost-sensitive customers. Not ideal but workable and it allows the vendor to stay in control of their product. It is very difficult to make money open sourcing a product when core development is being funded entirely by the company.
Andrew,
Can you say more about that, online or off?
CAM
Mark,
You are very right to focus on TCO, as opposed to specific types of cost.
CAM
Hi Mark,
Thanks for the kudos. BTW–we’ve opened up the VoltDB beta program and are providing more info about it.
For those who want to try it or read the product documentation, just request access to the early release web site at http://www.voltdb.com.
Curt, I would be happy to say more offline, just drop me an email.
Curt, I definitely agree that solid-state memory is the future direction for analytical DBMSs. I have seen the slideware on the Teradata Extreme Performance Appliance, and if reality is even close then this technology is a game changer. There are questions on longevity and availability, but since SSDs have no moving parts I would expect them to be more reliable than HDs. Pretty exciting stuff.
[…] notes on all this from April, 2010 are already badly outdated, but may be interesting anyway. Categories: Akiban, dbShards and […]