Open source

Discussion of relational database management systems that are offered through some version of open source licensing.

August 12, 2006

Introduction to Greenplum and some compare/contrast

Netezza relies on FPGAs. DATAllegro essentially uses standard components, but those include InfiniBand cards (and there’s a little FPGA action when they do encryption). Greenplum, however, claims to offer a highly competitive data warehouse solution that’s so software-only you can download it from their web site. That said, their main sales mode also seems to be through appliances, specifically ones branded and sold by Sun, combining Greenplum and open source software on a “Thumper” box. And the whole thing supposedly scales even higher than DATAllegro and Netezza, because you can manage over a petabyte if you chain together a dozen of the 100 terabyte racks.
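The scaling arithmetic behind that petabyte claim is straightforward; here’s a back-of-envelope check using the figures quoted above (decimal units assumed):

```python
# Back-of-envelope check of the "over a petabyte" claim.
RACK_CAPACITY_TB = 100   # one rack, as quoted above
RACKS = 12               # "a dozen" chained racks

total_tb = RACK_CAPACITY_TB * RACKS
total_pb = total_tb / 1000  # assuming decimal units: 1 PB = 1,000 TB

print(f"{RACKS} racks x {RACK_CAPACITY_TB} TB = {total_tb} TB = {total_pb} PB")
# 1,200 TB, i.e. 1.2 PB -- comfortably "over a petabyte"
```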
Read more

July 25, 2006

Solid’s MySQL engine

Solid Information Technology is making the beta of its MySQL engine available for download midday on Tuesday. So I talked with them today, mercifully unembargoed. Here’s the story.

Read more

July 24, 2006

Firebird, née Interbase

Apparently, Interbase has morphed into Firebird. Interbase was an early RDBMS, owned by Borland, occasionally touted as the next great DBMS contender, and early to be open-sourced. That’s about as much as I remember about it. There were a couple of features on which it was earlier than the big boys — BLOBs, maybe? — but I imagine that’s very old news by now. And indeed the product doesn’t seem to be terribly up to date at this point.

So are there any Firebird partisans out there who’d like to tell me what’s so great about Firebird? Thanks in advance, and I’m especially grateful for the flame-free nature of your expected contribution.

July 12, 2006

Ingres’s questionable target market

Eric Lai of Computerworld interviewed Roger Burkhardt, new CEO of Ingres, and obviously did a bang-up job of asking him the tough “Who really are your target customers, and why would they buy from you?” questions. The answer, so far as I can tell, is “Large financial institutions writing new RDBMS apps that don’t need up-to-date functionality and don’t want to pay Oracle’s license fees.” Up to a point, that makes sense. Except for the “financial institutions” qualifier, it’s actually pretty obvious. I can’t imagine why any other new users would buy Ingres, which has been ever the bridesmaid, never the bride for the past 20 years.
Read more

July 3, 2006

DATAllegro’s technical strategy

Few areas of technology boast more architectural diversity than data warehousing. Mainframe DB2 is different from Teradata, which is different from the leading full-spectrum RDBMS, which are different from disk-based appliances, which are different from memory-centric solutions, which are different from disk-based MOLAP systems, and so on. What’s more, no two members of the same group are architected the same way; even the market-leading general purpose DBMS have important differences in their data warehousing features.

The hot new vendor on the block is DATAllegro, which is stealing much of the limelight formerly enjoyed by data warehouse appliance pioneer Netezza. (After some good early discussions, Netezza abruptly reneged a year ago on a promise to explain more about its technology workings to me, and I’ve hardly heard from them since. Yes, they’re still much bigger than DATAllegro, but I suspect they’ve hit some technical roadblocks, and their star is fading.)

Read more

May 22, 2006

Data warehouse appliances

If we define a “data warehouse appliance” as “a special-purpose computer system, with appliance administrability, that manages a data warehouse,” then there are two major contenders: Netezza and DATAllegro, both startups, both with a small number of disclosed customers. Past contenders would include Teradata and White Cross (which seems to have just merged into Kognitio), but neither would admit to being in that market today. (I suspect this is a mistake on Teradata’s part, but so be it.) IBM with DB2 on the z-Series wouldn’t properly be regarded as an appliance player either, although IBM is certainly conscious of appliance competition. And SAP’s BI Accelerator does not persist data at this time.

In principle, the Netezza and DATAllegro stories are similar — take an established open source RDBMS*, build optimized hardware to run it, and optimize the software configuration as well. Much of the optimization is focused on getting data on and off disk sequentially, minimizing any random accesses. This is why I often refer to data warehouse appliances as being the best alternative to memory-centric data management. Beyond that, the optimizations by the two vendors differ considerably.
*Netezza uses PostgreSQL; DATAllegro uses Ingres.
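To see why minimizing random access matters so much, here’s a minimal cost-model sketch. The disk figures (roughly 5 ms per seek, roughly 60 MB/sec sustained sequential transfer) are my illustrative assumptions for 2006-era drives, not vendor-published numbers:

```python
# Rough time to read a 10 GB table region two ways on one disk.
SEEK_MS = 5.0            # assumed average seek + rotational latency
SEQ_MB_PER_SEC = 60.0    # assumed sustained sequential transfer rate
PAGE_KB = 8              # typical RDBMS page size

data_mb = 10 * 1024      # 10 GB scan

# Sequential: one seek, then stream the whole region.
seq_sec = SEEK_MS / 1000 + data_mb / SEQ_MB_PER_SEC

# Random: one seek per page, plus the same transfer time.
pages = data_mb * 1024 // PAGE_KB
rand_sec = pages * SEEK_MS / 1000 + data_mb / SEQ_MB_PER_SEC

print(f"sequential: ~{seq_sec:.0f} sec; random: ~{rand_sec / 3600:.1f} hours")
```

Under these assumptions the sequential scan finishes in about three minutes, while the page-at-a-time random version takes hours, which is the gap the appliance vendors’ optimizations are chasing.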

Hmm. I don’t feel like writing more on this subject at this very moment, yet I want to post something urgently because there’s an IOU in my Computerworld column today for it. OK. More later.

May 10, 2006

White paper on memory-centric data management — excerpt

Here’s an excerpt from the introduction to my new white paper on memory-centric data management. I don’t know why WordPress insists on showing the table gridlines, but I won’t try to fix that now. Anyhow, if you’re interested enough to read most of this excerpt, I strongly suggest downloading the full paper.

Introduction

Conventional DBMS don’t always perform adequately.

Ideally, IT managers would never need to think about the details of data management technology. Market-leading, general-purpose DBMS (DataBase Management Systems) would do a great job of meeting all information management needs. But we don’t live in an ideal world. Even after decades of great technical advances, conventional DBMS still can’t give your users all the information they need, when and where they need it, at acceptable cost. As a result, specialty data management products continue to be needed, filling the gaps where more general DBMS don’t do an adequate job.

Memory-centric technology is a powerful alternative.

One category on the upswing is memory-centric data management technology. While conventional DBMS are designed to get data on and off disk quickly, memory-centric products (which may or may not be full DBMS) assume all the data is in RAM in the first place. The implications of this design choice can be profound. RAM access speeds are up to 1,000,000 times faster than random reads on disk. Consequently, whole new classes of data access methods can be used once the disk-speed bottleneck is removed. Sequential access is much faster in RAM, too, allowing yet another group of efficient data access approaches to be implemented.
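The speed ratio cited above can be sanity-checked with rough latency figures. The numbers below are order-of-magnitude assumptions (about 50 ns for a DRAM access, about 5 ms for a random disk read), not measurements:

```python
RAM_ACCESS_NS = 50         # assumed DRAM access latency
DISK_RANDOM_READ_MS = 5    # assumed seek + rotational latency

disk_ns = DISK_RANDOM_READ_MS * 1_000_000
ratio = disk_ns / RAM_ACCESS_NS
print(f"a random disk read is ~{ratio:,.0f}x slower than a RAM access")
# With these assumptions the gap is ~100,000x; cache-resident data
# (a few ns per access) pushes it toward the 1,000,000x figure above.
```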

It does things disk-based systems can’t.

If you want to query a used-book database a million times a minute, that’s hard to do in a standard relational DBMS. But Progress’ ObjectStore gets it done for Amazon. If you want to recalculate a set of OLAP (OnLine Analytic Processing) cubes in real-time, don’t look to a disk-based system of any kind. But Applix’s TM1 can do just that. And if you want to stick DBMS instances on 99 nodes of a telecom network, all persisting data to a 100th node, a disk-centric system isn’t your best choice – but Solid’s BoostEngine should get the job done.

Memory-centric data managers fill the gap, in various guises.

Those products are some leading examples of a diverse group of specialist memory-centric data management products. Such products can be optimized for OLAP or OLTP (OnLine Transaction Processing) or event-stream processing. They may be positioned as DBMS, quasi-DBMS, BI (Business Intelligence) features, or some utterly new kind of middleware. They may come from top-tier software vendors or from the rawest of startups. But they all share a common design philosophy: Optimize the use of ever-faster semiconductors, rather than focusing on (relatively) slow-spinning disks.

They have a rich variety of benefits.

For any technology that radically improves price/performance (or any other measure of IT efficiency), the benefits can be found in three main categories:

  • Doing the same things you did before, only more cheaply;
  • Doing the same things you did before, only better and/or faster;
  • Doing things that weren’t technically or economically feasible before at all.

For memory-centric data management, the “things that you couldn’t do before at all” are concentrated in areas that are highly real-time or that use non-relational data structures. Conversely, for many relational and/or OLTP apps, memory-centric technology is essentially a much cheaper/better/faster way of doing what you were already struggling through all along.

Memory-centric technology has many applications.

Through both OEM and direct purchases, many enterprises have already adopted memory-centric technology. For example:

  • Financial services vendors use memory-centric data management throughout their trading systems.
  • Telecom service vendors use memory-centric data management in multiple provisioning, billing, and routing applications.
  • Memory-centric data management is used to accelerate web transactions, including in what may be the most demanding OLTP app of all — Amazon.com’s online bookstore.
  • Memory-centric data management technology is OEMed in a variety of major enterprise network management products, including HP OpenView.
  • Memory-centric data management is used to accelerate analytics across a broad variety of industries, especially in such areas as planning, scenarios, customer analytics, and profitability analysis.

May 8, 2006

Memory-centric data management whitepaper

I have finally finished and uploaded the long-awaited white paper on memory-centric data management.

This is the project for which I originally coined the term “memory-centric data management,” after realizing that the prevalent “in-memory DBMS” creates all sorts of confusion about how and whether data persists on disk. The white paper clarifies and updates points I have been making about memory-centric data management since last summer. Sponsors included:

If there’s one area in my research I’m not 100% satisfied with, it may be the question of where the true hardware bottlenecks to memory-centric data management lie (it’s obvious that the bottleneck to disk-centric data management is random disk access). Is it processor interconnect (around 1 GB/sec)? Is it processor-to-cache connections (around 5 GB/sec)? My prior pronouncements, the main body of the white paper, and the Intel Q&A appendix to the white paper may actually have slightly different spins on these points.

And by the way — the current hard limit on RAM/board isn’t 2^64 bytes, but a “mere” 2^40. But don’t worry; it will be up to 2^48 long before anybody actually puts 256 gigabytes under the control of a single processor.
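For concreteness, here is what those address-bit limits work out to (2^40, 2^48, and 2^64 bytes, in binary units); the helper function is just for illustration:

```python
def human(nbytes):
    """Render a byte count in binary (1024-based) units."""
    for unit in ("bytes", "KB", "MB", "GB", "TB", "PB", "EB"):
        if nbytes < 1024:
            return f"{nbytes:g} {unit}"
        nbytes /= 1024
    return f"{nbytes:g} ZB"

for bits in (40, 48, 64):
    print(f"2^{bits} bytes = {human(2 ** bits)}")
# 2^40 = 1 TB, 2^48 = 256 TB, 2^64 = 16 EB
```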

April 26, 2006

Solid/MySQL fit and positioning

I felt like writing a lot about the great potential fit between MySQL and Solid over the weekend, but Solid didn’t want me to do so. Now, however, I’m not in the mood, so I’ll just say that in OLTP, Solid’s technology is strong where MySQL’s is weak, and vice versa. E.g., Solid is so proud of its zero-administration capabilities that, without MySQL, it doesn’t have much in the way of admin tools at all. Conversely, I think that many of those websites that crash all the time with MySQL errors would crash less with the Solid engine underneath. (Solid happens to be proud of its BLOB-handling capability, efficiency-wise.)

Neither outfit is good in data warehousing, or in text search, image search, etc. (Solid slings big files around, but it doesn’t peer closely inside them). But for OLTP of tabular or dumb media data, this looks like a great fit.

Whether anybody will care, however, is a different matter.

Lisa Vaas of eWeek offers a survey of the many MySQL engine options.

EDIT: Another Lisa Vaas article makes it clear that MySQL is planning to compete in data warehousing/OLAP as well.

April 22, 2006

More on Solid and MySQL?

In a stunningly self-defeating move, my friends at Solid have decided that anything about their already-leaked possible cooperation with MySQL is embargoed.

Indeed, they’ve emphasized to me multiple times that they do not wish me to write about it.

I shall honor their wishes. I hope they are pleased with the sophistication and insight of the coverage they receive from other sources.
