Progress, Apama, and DataDirect
Analysis of Progress Software and its various product lines, including Apama, DataDirect, and OpenEdge. Related subjects include:
- CEP (Complex Event Processing)
- Mid-range OLTP and general-purpose database management systems
- (in Text Technologies) Progress’ EasyAsk natural language recognition technology
Opportunities for disruption in the OLTP database management market (deck-clearing post #2)
The standard Clayton Christensen “Innovator’s Dilemma” disruption narrative goes something like this:
- Market leaders have many advantages, including top technology.
- Followers come up with good technology too.
- The leaders stay ahead by making their products ever better and more complex.
- The followers sell into new or non-mainstream markets, at prices the leaders can’t match. So they dominate new markets.
- Old markets turn into low-margin commodity-fests.
- Old leaders are screwed.
And it’s really hard for market leaders to avert this sad fate, because the short- and intermediate-term margin hit would be too great.
I think the OLTP DBMS market is ripe for that kind of disruption – riper than commentators generally realize. Here are some key potential drivers:
Read more
OLTP database management system market – the consensus isn’t ALL wrong (deck-clearing post #1)
Most of what I’ve written lately about database management seems to have been focused on analytic technologies. But I have a lot to say on the OLTP (OnLine Transaction Processing) side too. So let’s start by clearing the decks. Here’s a list of some consensus views that I in essence agree with:
- Oracle is the top of the line, and has nothing wrong with it other than cost of ownership and the non-joys of doing business with Oracle Corporation.
- DB2/mainframe is a fine product, but only if you like IBM mainframes.
- DB2/open systems is another fine product, but it’s hard to think of reasons to use it over Oracle.
- Microsoft SQL Server has great cost of ownership if you’re a Windows (server) shop anyway, especially on the administrative side. It does most but not all of what Oracle does.
- Sybase Adaptive Server Enterprise is a lot like SQL Server, but without the Windows dependence or the great Microsoft tools. If you have it installed or are Chinese, you should strongly consider using it, but otherwise there are better alternatives.
- Progress’ DBMS is great if you don’t need any of the features it’s missing. Administration, for example, is a super-low-cost breeze. But why use it unless you’re also using the Progress development tools?
- Intersystems’ Cache’ is another fine mid-range product that involves buying into the vendors’ whole tool set – all the more so because it isn’t relational.
- Small-footprint embedded DBMS, from vendors such as Sybase’s iAnywhere division or Solid Information Technologies, are off in their own little world. Mainly, that world is telecom, with a satellite in medical devices, although other kinds of networked equipment also sometimes use these products.
- IBM’s non-DB2 database management products – IMS, Informix, etc. – are fine things to stick with until you have to change. Ditto products from Software AG, Computer Associates, Cincom, etc.
- MySQL Version 4 is an OLTP joke, but it’s a joke many people share. (Hey — a lot of blogs, including mine, run on WordPress and MySQL 4.)
- Until Ingres is meaningfully marketed and sold outside its installed base, it’s not worth worrying about.
- PostgreSQL is more significant as the underpinning of other products — mainly EnterpriseDB in the OLTP space — than it is in its own right.
White paper on memory-centric data management — excerpt
Here’s an excerpt from the introduction to my new white paper on memory-centric data management. I don’t know why WordPress insists on showing the table gridlines, but I won’t try to fix that now. Anyhow, if you’re interested enough to read most of this excerpt, I strongly suggest downloading the full paper.
|
Introduction
|
Conventional DBMS don’t always perform adequately. |
Ideally, IT managers would never need to think about the details of data management technology. Market-leading, general-purpose DBMS (DataBase Management Systems) would do a great job of meeting all information management needs. But we don’t live in an ideal world. Even after decades of great technical advances, conventional DBMS still can’t give your users all the information they need, when and where they need it, at acceptable cost. As a result, specialty data management products continue to be needed, filling the gaps where more general DBMS don’t do an adequate job.
|
Memory-centric technology is a powerful alternative. |
One category on the upswing is memory-centric data management technology. While conventional DBMS are designed to get data on and off disk quickly, memory-centric products (which may or may not be full DBMS) assume all the data is in RAM in the first place. The implications of this design choice can be profound. RAM access speeds are up to 1,000,000 times faster than random reads on disk. Consequently, whole new classes of data access methods can be used when the disk speed bottleneck is ignored. Sequential access is much faster in RAM, too, allowing yet another group of efficient data access approaches to be implemented.
|
It does things disk-based systems can’t. |
If you want to query a used-book database a million times a minute, that’s hard to do in a standard relational DBMS. But Progress’ ObjectStore gets it done for Amazon. If you want to recalculate a set of OLAP (OnLine Analytic Processing) cubes in real-time, don’t look to a disk-based system of any kind. But Applix’s TM1 can do just that. And if you want to stick DBMS instances on 99 nodes of a telecom network, all persisting data to a 100th node, a disk-centric system isn’t your best choice – but Solid’s BoostEngine should get the job done.
|
Memory-centric data managers fill the gap, in various guises. |
Those products are some leading examples of a diverse group of specialist memory-centric data management products. Such products can be optimized for OLAP or OLTP (OnLine Transaction Processing) or event-stream processing. They may be positioned as DBMS, quasi-DBMS, BI (Business Intelligence) features, or some utterly new kind of middleware. They may come from top-tier software vendors or from the rawest of startups. But they all share a common design philosophy: Optimize the use of ever-faster semiconductors, rather than focusing on (relatively) slow-spinning disks.
|
They have a rich variety of benefits. |
For any technology that radically improves price/performance (or any other measure of IT efficiency), the benefits can be found in three main categories:
For memory-centric data management, the “things that you couldn’t do before at all” are concentrated in areas that are highly real-time or that use non-relational data structures. Conversely, for many relational and/or OLTP apps, memory-centric technology is essentially a much cheaper/better/faster way of doing what you were already struggling through all along.
|
Memory-centric technology has many applications. |
Through both OEM and direct purchases, many enterprises have already adopted memory-centric technology. For example: |
|
|
Categories: Data types, Memory-centric data management, MOLAP, Object, OLTP, Open source, Progress, Apama, and DataDirect | 3 Comments |
Memory-centric data management whitepaper
I have finally finished and uploaded the long-awaited white paper on memory-centric data management.
This is the project for which I origially coined the term “memory-centric data management,” after realizing that the prevalent “in-memory DBMS” creates all sorts of confusion about how and whether data persists on disk. The white paper clarifies and updates points I have been making about memory-centric data management since last summer. Sponsors included:
- Applix, vendors of in-memory/memory-centric MOLAP tool TM1
- Progress Software, vendors of ObjectStore, an OODBMS that has more impressive references in-memory or otherwise memory-centric than it does in classical disk-based configurations, and also of the Apama stream processing products
- SAP, vendors of the BI Accelerator functionality of SAP NetWeaver, or whatever tortured name they want to give it this month — basically, that’s a very cool in-memory columnar data mart technology
- Solid Information Technology, vendor of hybrid in-memory/disk-based OLTP RDBMS. Historically focused on the embedded systems market, especially telecom and networking, they’ve recently been in the news because of a deal with MySQL that is designed to extend their reach.
- Intel, makers of the processors used to run a lot of the other sponsors’ products (including all BI Accelerator installations to date).
If there’s one area in my research I’m not 100% satisfied with, it may be the question of where the true hardware bottlenecks to memory-centric data management lie (it’s obvious that the bottleneck to disk-centric data management is random disk access). Is it processor interconnect (around 1 GB/sec)? Is it processor-to-cache connections (around 5 GB/sec)? My prior pronouncements, the main body of the white paper, and the Intel Q&A appendix to the white paper may actually have slightly different spins on these points.
And by the way — the current hard limit on RAM/board isn’t 2^64 bytes, but a “mere” 2^40. But don’t worry; it will be up to 2^48 long before anybody actually puts 256 gigabytes under the control of a single processor.
Categories: Cognos, Companies and products, In-memory DBMS, Intel, Memory-centric data management, MOLAP, Open source, Progress, Apama, and DataDirect, SAP AG, solidDB | 2 Comments |
Progress DataDirect discovers XML
As a general rule, if you want DBMS drivers, your first call should be to Progress DataDirect. They’ve been the dominant vendor (under multiple names and ownerships) of both ODBC and JDBC drivers, essentially since those standards’ respective inventions. (Persistent Systems Private Ltd. — better known as PSPL — wouldn’t be a terrible choice for your second call).
DataDirect seems to have introduced XQuery drivers last fall. I don’t have a lot of detail on those, however, because the DataDirect guy who contacted me did so mainly to show off a nice toy, Stylus Studio. StylusS tudio is an XML query-building toolkit, available for online purchase for $800 or less. A lot of the users seem to be system integrators. Sales are split 50-50 between the DataDirect regular salesforce and online, apparently mainly from their own store, but I got the sense we’re not talking about huge numbers yet.
In usability when they demoed it to me it looked on a par with Cognos Improptu (a SQL query-building tool) circa the mid-1990s. But they do claim all the right things in round-trip code generation and so on.
Applications seem to be concentrated in intercompany information exchange, based on both legacy EDI (Electronic Data Interchange) and more modern web services. Other uses they cited were parsing web server logs and publishing relational data to a web page.
The technology/product seems to have bounced around for a while, from Object Design (OODBMS pioneer that took a premature shot at the XML database business, and the source of the ObjectStore technology I keep writing about in this blog) to eXcelon (merger partner for ODS, eventually bought by Progress), to Progress’s Sonic Software Division, and now to DataDirect after Progress bought them. Apparently none of those companies have or had top-end UI expertise …
If you want to get a better feel for XQuery, you could do worse than to play with this tool. For example, it’s what I think I’ll use in the unlikely case I ever get around to parsing the SpamAssassin add-ins to my email messages and trying to understand what SpamAssassin is and isn’t doing.
Categories: Progress, Apama, and DataDirect | Leave a Comment |
Another OLTP success for memory-centric OO
Computerworld published a Progress ObjectStore OLTP success story.
Hotel reservations system, this time. Not as impressive as the Amazon store — what is? — but still nice.
Categories: Cache, Memory-centric data management, Object, OLTP, Progress, Apama, and DataDirect, Theory and architecture | 5 Comments |
Defining and surveying “Memory-centric data management”
I’m writing more and more about memory-centric data management technology these days, including in my latest Computerworld column. You may be wondering what that term refers to. Well, I’ve basically renamed what are commonly called “in-memory DBMS,” for what I think is a very good reason: Most of the products in the category aren’t true DBMS, aren’t wholly in-memory, or both! Indeed, if you catch me in a grouchy mood I might argue that “in-memory DBMS” is actually a contradiction in terms.
I’ll give a quick summary of the vendors and products I am focusing on in this newly-named category, and it should be clearer what I mean:
- TimesTen (now owned by Oracle): TimesTen is the quintessentional “in-memory DBMS.” It’s a fairly full relational DBMS, but if you want to persist memory to disk it has to be handed off to a conventional DBMS. Historically, that has usually been MySQL or Oracle. TimesTen’s biggest market penetration has been in financial trading.
- Solid Information Technology‘s BoostEngine: Solid is a Finnish company (or was — it’s pretty American now) specializing in embedded DBMS sold mainly for telecommunication uses. Big OEM customers include several well-known telecom equipment manufacturers and HP (for OpenView). “Embedded” often means no DBA, no monitor, no keyboard — they box manufacturer installs it and there it stays for the life of the product. Solid has to offer strong replication capabilities, since its products are often used in highly distributed (e.g., multiblade, multibox) environments. So it’s taken the next step and exploited the replication by allowing customers to use some instances of the product disklessly.
- Event-stream products from Streambase and Progress: The canonical application for event-stream products is automating financial trading decisions based on the flow of market information. Mike Stonebraker, the brains behind Streambase, has recently popularized the idea; Progress bought Apama, who actually have been in the business longer. These applications require even more speed than the financial trading apps that TimesTen handles, and they discard most of the information they look at. In-memory is the only way to go.
- Progress’s ObjectStore: ObjectStore comes from the company Object Design, which merged into Excelon, which was acquired by Progress. It’s really a toolkit for building DBMS and similar systems, which is why it’s at various times been marketed as an OODBMS and an XML DBMS, without a lot of success either way. But there have been a few sterling apps built in ObjectStore even so, including a key part of the Amazon bookstore Despite this limited market success, a significant fraction of Progress’s best engineering talent has moved over to the Real-Time Division to focus on ObjectStore and other memory-centric products. The memory-centric aspect of ObjectStore is this: ObjectStore’s big virtue is that it gets objects from disk to memory and vice-versa very efficiently, then distributes and caches them around a network as needed. This was originally invented for client/server processing, but works fine in a multi-server thin client setup as well. And object processing, of course, relies on a whole lot of pointers. And pointer-chasing is pretty much the worst way to deal with the disk speed barrier, unless you do it in main memory.
- Applix‘s TM1: Like many companies in the analytics area, Applix has had trouble deciding whether it sells applications, BI system software, or both. But in any case its core technology is TM1, a memory-centric MOLAP offering. Traditional MOLAP products reside on the horns of a nasty dilemma: They rely on precalculation to give good performance, but that causes ghastly database explosion. Applix gets out of this problem by doing no precalculation whatsoever, loading the data into main memory, and executing all queries on the fly.
- SAP’s BI Accelerator: SAP is building out an elaborate technology stack with NetWeaver, especially in the BI area. One important aspect is that the full data warehouse is logically broken (or copied) into a series of data marts called “InfoCubes.” BI Accelerator takes the logical next step, loading an entire InfoCube into main memory. Almost every query is executed via a full table scan, which would be insane on disk but makes perfect sense when the data is already in RAM.
So there you have it. There are a whole lot of technologies out there that manage data in RAM, in ways that would make little or no sense if disks were more intimately involved. Conventional DBMS also try to exploit RAM and limit disk access, via caching; but generally the data access methods they use in RAM are pretty similar to those they use when going out to disk. So memory-centric systems can have a major advantage.
The Amazon.com bookstore is a huge, modern OLTP app. So is it relational?
I don’t know for a fact that the Amazon.com bookstore is the world’s biggest OLTP application — but if it isn’t, it’s close.
And the thing is — that’s never been an entirely relational application. Oh, the ordering part surely is. But the inventory lookup is currently driven by an OODBMS (from Progress). The personalization used to be done in Red Brick (I knew which software replaced it, but I’m forgetting at the moment — it may even be one of the relational warehouse appliance vendors). And of course the full-text search is a custom in-house system.
Categories: Amazon and its cloud, Cache, Data types, Database diversity, Memory-centric data management, Object, OLTP, Progress, Apama, and DataDirect, Specific users | 4 Comments |
The end of the single-server DBMS vendor
For all practical purposes, there are no DBMS vendors left advocating single-server strategies. Oracle was the last one, but it just acquired in-memory data management vendor TimesTen, which will be used as a cache in front of high-performance Oracle databases. (It will also continue to be sold for stand-alone uses, especially in the financial trading and defense/intelligence markets.)
IBM’s Viper is a server-and-a-half story, with lots of integration over a dual-server (one relational, one native XML) base. IBM also is moving aggressively in data integration/federation, with Ascential and many other acquisitions. It also sells a broad range of database products itself, including two DB2s, several Informix products, and so on.
Microsoft also has a multi-server strategy. In its case, relational, text, and MOLAP storage are more separate than in Oracle’s or even IBM’s products; again, there’s a thick layer of technology on top integrating them. An eventual move to native XML storage will, one must imagine, be handled in the same way.
Smaller vendors Sybase and Progress also offer multiple DBMS each.
Teradata is a pretty big player with only one DBMS — but it’s specialized for data warehousing. Teradata is the first to tell you you should use something else for your classical transaction processing.
The Grand Unified Integrated Database theory is, so far as I can tell, quite dead. Some people just refuse to admit that fact.