Open source
Discussion of relational database management systems that are offered through some version of open source licensing. Related subjects include:
Open source in-memory DBMS
I’ve gotten email about two different open source in-memory DBMS products/projects. I don’t know much about either, but in case you care, here are some pointers to more info.
First, the McObject guys — who also sell a relational in-memory product — have an object-oriented, apparently Java-centric product called Perst. They’ve sent over various press releases about same, the details of which didn’t make much of an impression on me. (Upon review, I see that one of the main improvements they cite in Perst 3.0 is that they added 38 pages of documentation.)
Second, I just got email about something called CSQL Cache. You can read more about CSQL Cache here, if you’re willing to navigate some fractured English. CSQL’s SourceForge page is here. My impression is that CSQL Cache is an in-memory DBMS focused on, you guessed it, caching. It definitely seems to talk SQL, but possibly its native data model is of some other kind (there are references to both “file-based” and “network”).
Categories: Cache, DBMS product categories, In-memory DBMS, McObject, Memory-centric data management, Object, OLTP, Open source | 5 Comments |
EnterpriseDB survey on open source database adoption (participation time)
CTO Bob Zurek of EnterpriseDB asked me to pass along a link to a short survey on open source database adoption. He plans to release the results publicly after they are collected. Bob stressed to me that he used to be a Forrester analyst, his point being that he knows how to be analytically objective.
Looking over the 15 questions (14 of which are simple multiple-choice), I think he lived up quite well to the “unbiased” claim. E.g., the only Postgres option cited is PostgreSQL, rather than EnterpriseDB’s proprietary/value-added packagings. I do see one little screw-up: Several of the questions are worded as if the respondent is, enterprise-wide, running exactly one instance of an open source DBMS. But otherwise it seems like a clean, tight, simple survey.
Categories: EnterpriseDB and Postgres Plus, Open source | 1 Comment |
Database blades are not what they used to be
In which we bring you another instantiation of Monash’s First Law of Commercial Semantics: Bad jargon drives out good.
When EnterpriseDB announced a partnership with Truviso for a “blade,” I naturally assumed they were using the term in a more-or-less standard way, and hence believed that it was more than a “Barney” press release.* Silly me. Rather than referring to something closely akin to “datablade,” EnterpriseDB’s “blade” program turns out just to be a catchall set of partnerships.
*A “Barney” announcement is one whose entire content boils down to “I love you; you love me.”
According to EnterpriseDB CTO Bob Zurek, the main features of the “blade” program include: Read more
Categories: Data types, Emulation, transparency, portability, EnterpriseDB and Postgres Plus, Open source, PostgreSQL | 5 Comments |
Truviso and EnterpriseDB blend event processing with ordinary database management
Truviso and EnterpriseDB announced today that there’s a Truviso “blade” for Postgres Plus. By email, EnterpriseDB CTO Bob Zurek endorsed my tentative summary of what this means technically, namely:
There’s data being managed transactionally by EnterpriseDB.
Truviso’s DML has all along included ways to talk to a persistent Postgres data store.
If, in addition, one wants to do stream processing things on the same data, that’s now possible, using Truviso’s usual DML.
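The combination described above, a continuous computation over a stream that also consults transactionally managed data, can be sketched roughly as follows. This is a plain-Python illustration only, not Truviso's DML or any EnterpriseDB API; the function name `windowed_totals`, the `accounts` lookup table, and the event layout are all hypothetical.

```python
from collections import deque


def windowed_totals(events, accounts, window=3):
    """Yield per-region totals over the most recent `window` events.

    `events` stands in for a stream of (account_id, amount) pairs.
    `accounts` stands in for a persistent table mapping account_id -> region.
    Both names are assumptions made for this sketch.
    """
    recent = deque(maxlen=window)  # oldest event drops out automatically
    for account_id, amount in events:
        region = accounts[account_id]  # lookup against the "stored" data
        recent.append((region, amount))
        totals = {}
        for r, a in recent:
            totals[r] = totals.get(r, 0) + a
        yield totals


accounts = {1: "east", 2: "west"}
events = [(1, 10), (2, 5), (1, 7), (2, 1)]
results = list(windowed_totals(events, accounts, window=2))
```

Each incoming event produces a fresh answer over the sliding window, which is the essential difference from running a one-time query against the stored table.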
Kickfire kicks off
I chatted with Raj Cherabuddi and others on the Kickfire (formerly C2) team for over an hour on Monday, and now have a better sense of their story. There are some very basic questions I still don’t have answers to; I’ll fill those in when I can.
Highlights of what I have and haven’t figured out so far include:
- Kickfire’s technology has two main parts: A SQL co-processor chip and a MySQL storage engine.
- Kickfire makes a Type 0 appliance. If I understood correctly, it contains the chip, a couple of standard CPU cores, and 64 gigs of RAM. Or else it contains just the chip, and is meant to be hooked up to a 2U box with 64 gigs of RAM. I’m confused.
- The Kickfire box can handle up to 3 terabytes of user data. The disk required for that is 4-5 terabytes without redundancy, 2X with. Based on that formulation and other clues, I’m guessing that Kickfire, unlike other appliance vendors, doesn’t build in storage itself.
- I don’t know whether the Kickfire chip is true custom silicon or an FPGA emulation.
- The essential idea of the chip is dataflow programming for SQL, with pipelining between operations. This eliminates the overhead of registers and context switching. I don’t know what the trade-offs are, if any.
- Kickfire’s database software is columnar, operating on compressed data even in RAM. In that, Kickfire’s story is most similar to Vertica’s, although I’m guessing Exasol may do something similar as well. Like Vertica, Kickfire uses multiple compression methods (they’re reluctant to give detail, but agreed it would be fair to say they use both something like dictionary/token and something like delta compression).
- Kickfire’s software is ACID-compliant. You can do incremental loads or trickle feeds. Bulk load speed is 100 Gb/hour. Kickfire’s solution for the traditional problem of updating column stores is called “snapshots.” Without giving details, they position that as similar to the Vertica solution.
- Like other MySQL storage engines, Kickfire inherits whatever data connectivity, stored procedure capabilities, user-defined function abilities, etc. that MySQL has.
- Kickfire has no paying customers, but does have a slide showing many logos of “prospects and beta customers.”
- Kickfire has no MPP capabilities at this time, but says adding those is “on the roadmap” and will be “easy.”
- Kickfire submitted a 100 GB TPC-H result, in which it beat the previous leaders (Exasol, ParAccel, and Microsoft) on price-performance, and lagged only Exasol and ParAccel on absolute performance. Kickfire is extremely proud of this. Indeed, I don’t recall another vendor ascribing that much weight to TPC results in the entire history of TPCs.* Kickfire seems unfazed by the fact that its result is for a system listed with a ship date 6 months in the future (I’m guessing that’s the latest the TPC will allow), while the other results are for systems available today.
*Somebody (perhaps adman extraordinaire Rick Bennett?) may want to check my memory on this, but I think Oracle’s famed “Gentlemen, start your snails” ad in the early 1990s was about PC World tests, not TPCs. Oracle also had an ad about WW1-style planes nosediving, but I don’t think that one referenced TPCs either.
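For readers unfamiliar with the two compression styles mentioned in the highlights above (dictionary/token and delta), here is a minimal Python sketch of the general techniques. It is not Kickfire's or Vertica's implementation, whose details were not disclosed; all function names are made up for illustration. The last function hints at what "operating on compressed data" can mean: an equality predicate is evaluated on the small integer codes without decoding the column.

```python
def dictionary_encode(values):
    """Replace each distinct value with a small integer code."""
    dictionary = {}
    codes = []
    for v in values:
        # First occurrence gets the next unused code.
        codes.append(dictionary.setdefault(v, len(dictionary)))
    symbols = list(dictionary)  # insertion order: position == code
    return symbols, codes


def dictionary_decode(symbols, codes):
    return [symbols[c] for c in codes]


def delta_encode(numbers):
    """Store the first value, then differences between neighbors."""
    if not numbers:
        return []
    return [numbers[0]] + [cur - prev for prev, cur in zip(numbers, numbers[1:])]


def delta_decode(deltas):
    out, total = [], 0
    for d in deltas:
        total += d
        out.append(total)
    return out


def count_equal(symbols, codes, value):
    """Count matches by comparing codes, never decoding the column."""
    if value not in symbols:
        return 0
    target = symbols.index(value)
    return sum(1 for c in codes if c == target)


states = ["MA", "NY", "MA", "CA", "NY"]
symbols, codes = dictionary_encode(states)
timestamps = [1000, 1003, 1004, 1010]
deltas = delta_encode(timestamps)
```

Dictionary encoding pays off on low-cardinality columns, while delta encoding pays off on sorted or slowly changing numeric columns, which is one reason a column store might use more than one method.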
ScaleDB presents The Revenge of the Pointer
The MySQL user conference is upon us, and hence so are MySQL-related product announcements, including storage engines. One such is Kickfire. ScaleDB — smaller and earlier-stage — is another.
In a nutshell, ScaleDB’s proposition is:
- Innovative approach to indexing relational DBMS, providing performance advantages.
- Shared-everything scale-up that ScaleDB believes will leapfrog the MySQL engine competition already in Release 1. (In my opinion, this is the least plausible part of the ScaleDB story.)
- State-of-the-art me-too facilities for locking, logging, replication/fail-over, etc., also already in Release 1.
Like many software companies with non-US roots, ScaleDB seems to have started with a single custom project, using a Patricia trie indexing system. Then they decided Patricia tries might be really useful for relational OLTP as well. The ScaleDB team now features four developers, plus half-time or so “Chief Architect” involvement from Vern Watts. Watts seems to pretty much have been Mr. IMS for the past four decades, and thus surely knows a whole lot about pointer-based database management systems; presumably, he’s responsible for the generic DBMS design features that are being added to the innovative indexing scheme. On ScaleDB’s advisory board is PeopleSoft veteran Rick Berquist, about whom I’ve had fond thoughts ever since he talked me into focusing on consulting as the core of my business.*
*More precisely, Rick pretty much tricked me into doing a day of consulting for $15K, then revealed that’s what he’d done, expressing the thought that he’d very much gotten his money’s worth. But I digress …
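As background on the indexing idea mentioned above: a Patricia trie belongs to the family of compressed (radix) tries, in which chains of single-child nodes are collapsed so that each edge carries a whole substring. The Python sketch below shows a generic string-keyed radix trie with insert and lookup. It says nothing about ScaleDB's actual design, and a classic Patricia trie branches on individual bits rather than characters; the class and function names are illustrative assumptions.

```python
class Node:
    def __init__(self):
        self.children = {}      # first character -> (edge label, child Node)
        self.has_value = False
        self.value = None


def _common_prefix_len(a, b):
    i = 0
    while i < min(len(a), len(b)) and a[i] == b[i]:
        i += 1
    return i


def insert(root, key, value):
    node = root
    while key:
        edge = node.children.get(key[0])
        if edge is None:
            # No edge shares a first character: hang a new leaf here.
            leaf = Node()
            leaf.has_value, leaf.value = True, value
            node.children[key[0]] = (key, leaf)
            return
        label, child = edge
        k = _common_prefix_len(label, key)
        if k < len(label):
            # Key diverges partway along the edge: split the edge at k.
            middle = Node()
            middle.children[label[k]] = (label[k:], child)
            node.children[key[0]] = (label[:k], middle)
            child = middle
        node, key = child, key[k:]
    node.has_value, node.value = True, value


def lookup(root, key):
    node = root
    while key:
        edge = node.children.get(key[0])
        if edge is None or not key.startswith(edge[0]):
            return None
        label, node = edge
        key = key[len(label):]
    return node.value if node.has_value else None


index = Node()
for row_id, k in enumerate(["customer", "custom", "cust1"]):
    insert(index, k, row_id)
```

Lookup cost depends on key length rather than on the number of keys stored, and following child references from node to node is where the "pointer" flavor comes from.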
ScaleDB has no customers to date, but hopes to be in beta by the end of this year. Angels and a small VC firm have provided bridge loans; otherwise, ScaleDB has no outside investment. ScaleDB’s business model thoughts include: Read more
Categories: Data models and architecture, Mid-range, MySQL, OLTP, Open source, ScaleDB, Theory and architecture | 5 Comments |
Supporting evidence for the DBMS disruption story
As previously announced, I did a webcast this afternoon, discussing database diversity. The title of the talk was taken directly from a post, “What leading DBMS vendors don’t want you to realize,” that argued mid-range DBMS are suitable for a broad variety of tasks. The overriding theme was a Clayton Christensen-style “disruption” narrative.
The sponsor was EnterpriseDB, which is fitting. While not the biggest DBMS industry disrupter in terms of revenue or visible impact (MySQL and Netezza say “Hi”), the Postgres family in general and EnterpriseDB in particular epitomize the disruption threat like nobody else, because of how broadly they substitute for market-leading database managers.
As I promised on the call, below is a post with links to further research backing up the points made. They’re numbered to match some of the presentation slides, which you can find at this link.
3. Much of the discussion of database diversity comes from a series of posts I coordinated with Mike Stonebraker.
4. At various times, starting on Slide 4, I made reference to datatype extensibility, a key feature of Oracle and DB2 – and a key advantage of Postgres over MySQL.
10. Capping off the database diversity discussion, Slide 10 mirrors this 11-point version of a data management software taxonomy.
13-14. I’ve posted many times about data warehousing DBMS and related technologies, including this overview of major analytic DBMS products, another recent overview of data warehouse specialty technologies, and an attempt to distinguish between data warehouse appliance myths and realities. Of particular interest for further research may be our sections on data warehouse appliances and columnar DBMS.
15. I do most of my posting about text search over on Text Technologies, specifically in the search category. Vendors I specifically mentioned as blending search with other kinds of data retrieval were Mark Logic and Attivio.
16. There’s a section here on native XML database management.
17. We also have a section on managing RDF and other graphical data models.
18. Ditto complex event/stream processing.
19. The only embeddable DBMS I’ve written much about recently is solidDB. And frankly, even in that case I’ve focused more on mid-tier caching uses, the now-canceled MySQL relationship, or general technology than I did specifically on embedded uses.
22-24. Back in February 2007 I made what is probably still my clearest post explaining why I think market-leading DBMS vendors are in the process of getting disrupted.
Categories: EnterpriseDB and Postgres Plus, Mid-range, MySQL, Open source, Oracle, PostgreSQL | Leave a Comment |
Disruption versus chasm crossing in the database market
The 451 Group just released a report on open source DBMS adoption. In a blog post announcing same, Matthew Aslett wrote (emphasis mine):
you only have to look at the comparative revenues of the open source and proprietary vendors to see that there is a vast chasm to be crossed.
“Chasm” memes were introduced by Geoffrey Moore, founder of the Chasm Group and author of Crossing the Chasm. His defining example was Oracle, and the database market in general. The core insight was that platform markets get to tipping points, after which the leaders have tremendous advantages that make them tend to remain leaders for a good long time.
The sequel to “chasm” theory is Clayton Christensen’s “disruption” rubric, popularized in The Innovator’s Dilemma. I’ve argued previously that the DBMS market is being disrupted, in both the ways that Christensen records: Read more
Categories: Data warehouse appliances, Open source | 1 Comment |
GridSQL: What EnterpriseDB is and is not doing in Postgres-based MPP data warehousing
While talking with EnterpriseDB about today’s Postgres Plus announcements, I took the chance to clear up a point of confusion. Somebody told Seth Grimes that EnterpriseDB is out to compete with Greenplum, but that person was wrong. EnterpriseDB fondly hopes to manage multi-terabyte data warehouses, just as Oracle and Microsoft do with their respective general-purpose DBMS. However, EnterpriseDB is not going after the databases of tens to hundreds of terabytes that are the province of specialists such as Greenplum, Teradata, Netezza, or columnar DBMS vendors.
Even so, in GridSQL EnterpriseDB does seem to be open-sourcing MPP shared-nothing basics. There’s a lightweight optimizer that does a little (but only a little) more to minimize data movement beyond just optimizing queries on each node. And GridSQL knows how to replicate small tables across each node, a key aspect of many MPP designs. (Partition your facts; replicate your dimensions.)
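The “partition your facts; replicate your dimensions” rule can be illustrated with a toy Python sketch. This is not GridSQL code; the table layout, the modulo-based placement, and every function name here are assumptions chosen for illustration. The point is that when each node holds a slice of the fact table plus a full copy of the small dimension table, the join runs locally on every node and only small partial aggregates need to move.

```python
from collections import defaultdict


def build_cluster(facts, dimension, num_nodes):
    """Partition fact rows across nodes; copy the dimension to each node."""
    nodes = [{"facts": [], "dim": dict(dimension)} for _ in range(num_nodes)]
    for row in facts:
        nodes[row["sale_id"] % num_nodes]["facts"].append(row)
    return nodes


def local_join_and_aggregate(node):
    """Join this node's fact slice to its local dimension copy."""
    totals = defaultdict(int)
    for row in node["facts"]:
        region = node["dim"][row["store_id"]]  # no cross-node lookup needed
        totals[region] += row["amount"]
    return totals


def query(nodes):
    """Coordinator step: merge the per-node partial aggregates."""
    merged = defaultdict(int)
    for node in nodes:
        for region, amount in local_join_and_aggregate(node).items():
            merged[region] += amount
    return dict(merged)


stores = {10: "east", 20: "west"}  # small dimension table
sales = [                          # large fact table (tiny here)
    {"sale_id": 1, "store_id": 10, "amount": 5},
    {"sale_id": 2, "store_id": 20, "amount": 7},
    {"sale_id": 3, "store_id": 10, "amount": 2},
    {"sale_id": 4, "store_id": 20, "amount": 1},
]
cluster = build_cluster(sales, stores, num_nodes=3)
totals = query(cluster)
```

Replication costs extra storage for the dimension on every node, which is why it is normally reserved for small tables.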
Categories: Analytic technologies, Data warehousing, EnterpriseDB and Postgres Plus, Greenplum, Open source, Parallelization | 1 Comment |
EnterpriseDB unveils Postgres Plus
EnterpriseDB is making a series of moves and announcements. Highlights include:
- Renaming/repositioning the product as “Postgres Plus.” The free product is now Postgres Plus, while the version you pay EnterpriseDB for is now Postgres Plus Advanced Server.
- Repackaging the products, so that Postgres Plus Advanced Server is a strict superset of Postgres Plus.
- New features added to Postgres Plus Advanced Server.
- Features newly migrated from Advanced Server down to Postgres Plus.
- A strategic investment by IBM.
- Stressing Postgres in EnterpriseDB marketing, and dropping the tag-line defining themselves as “the Oracle-compatible database company.”
So far as I can tell, most of the technical differences between Advanced Server and regular Postgres Plus lie in three areas: Read more