Open source
Discussion of relational database management systems that are offered through some version of open source licensing.
VectorWise, Ingres, and MonetDB
I talked with Peter Boncz and Marcin Zukowski of VectorWise last Wednesday, but didn’t get around to writing about VectorWise immediately. Since then, VectorWise and its partner Ingres have gotten considerable coverage, especially from an enthusiastic Daniel Abadi. Basic facts that you may already know include:
- VectorWise, the product, will be an open-source columnar analytic DBMS. (But that’s not quite true. Pending productization, it’s more accurate to call the VectorWise technology a row/column hybrid.)
- VectorWise is due to be introduced in 2010. (Peter Boncz said that to me more clearly than I’ve seen in other coverage.)
- VectorWise and Ingres have a deal in which Ingres will at least be the exclusive seller of the VectorWise technology, and hopefully will buy the whole company.
- Notwithstanding that it was once named something like “MonetDB,” VectorWise actually is not the same thing as MonetDB, another open source columnar analytic DBMS from the same research group.
- The MonetDB and VectorWise research groups consist in large part of academics in Holland, specifically at CWI (Centrum voor Wiskunde en Informatica). But Ingres has a research group working on the project too. (Right now there are about seven “highly experienced” people each on the VectorWise and Ingres sides, although at least the VectorWise folks aren’t all full-time. More are being added.)
- Ingres and VectorWise haven’t agreed exactly how VectorWise and Ingres Classic will play together in the Ingres product line. (All of the obvious possibilities are still on the table.)
- VectorWise is shared-everything, just as Ingres is. But plans — still tentative — are afoot to integrate VectorWise with MapReduce in Daniel Abadi’s HadoopDB project.
Categories: Actian and Ingres, Analytic technologies, Columnar database management, Data warehousing, Database compression, MonetDB, Open source, Theory and architecture, VectorWise | 12 Comments |
What are the best choices for scaling Postgres?
March, 2011 edit: In its quaintness, this post is a reminder of just how fast Short Request Processing DBMS technology has been moving ahead. If I had to do it all over again, I’d suggest they use one of the high-performance MySQL options like dbShards, Schooner, or both together. I actually don’t know what they finally decided on in that area. (I do know that for analytic DBMS they chose Vertica.)
I have a client who wants to build a new application with peak update volume of several million transactions per hour. (Their base business is data mart outsourcing, but now they’re building update-heavy technology as well.) They have a small budget. They’ve been a MySQL shop in the past, but would prefer to contract (not eliminate) their use of MySQL rather than expand it.
My client actually signed a deal for EnterpriseDB’s Postgres Plus Advanced Server and GridSQL, but unwound the transaction quickly. (They say EnterpriseDB was very gracious about the reversal.) There seem to have been two main reasons for the flip-flop. First, it seems that EnterpriseDB’s version of Postgres isn’t up to PostgreSQL’s 8.4 feature set yet, although EnterpriseDB’s timetable for catching up might have been tolerable. Second, GridSQL apparently is further behind yet, with no timetable for up-to-date PostgreSQL compatibility. That was the dealbreaker.
The current base-case plan is to use generic open source PostgreSQL, with scale-out achieved via hand sharding, Hibernate, or … ??? Experience and thoughts along those lines would be much appreciated.
Other options for OLTP performance and scale-out are of course memory-centric products such as VoltDB or the Groovy SQL Switch. But this client’s database is terabyte-scale, so hardware costs could be an issue, as of course could be product maturity.
By the way, a large fraction of these updates will be actual changes, as opposed to new records, in case that matters. I expect that the schema being updated will be very simple — i.e., clearly simpler than in a classic order entry scenario.
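For readers unfamiliar with the hand-sharding option mentioned above, here is a minimal sketch of the idea, not a recommendation or a description of what the client built. It assumes records are addressed by a single string key, that a fixed number of independent PostgreSQL instances exist, and that 3.6 million transactions per hour stands in for "several million." The shard host names and both function names are hypothetical.

```python
import hashlib

# Hypothetical connection strings, one per independent PostgreSQL instance.
SHARD_DSNS = [
    "postgresql://app@pg-shard-0/appdb",
    "postgresql://app@pg-shard-1/appdb",
    "postgresql://app@pg-shard-2/appdb",
    "postgresql://app@pg-shard-3/appdb",
]


def shard_for(key: str, num_shards: int = len(SHARD_DSNS)) -> int:
    """Map a record key to a shard index using a stable hash.

    hashlib is used instead of the built-in hash(), which is salted per
    process for strings and so would route the same key differently
    after a restart.
    """
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def dsn_for(key: str) -> str:
    """Return the connection string of the shard that owns this key."""
    return SHARD_DSNS[shard_for(key)]


def per_shard_tps(transactions_per_hour: int, num_shards: int) -> float:
    """Average transactions per second per shard, assuming an even spread."""
    return transactions_per_hour / 3600 / num_shards


if __name__ == "__main__":
    # An update to an existing record hashes to the same shard as the
    # original insert, so updates never need to search other shards.
    print(dsn_for("customer-42"))
    # Assumed peak of 3.6 million transactions/hour across 4 shards.
    print(per_shard_tps(3_600_000, 4))  # 250.0
```

Because routing depends only on the key, an update-heavy workload with a very simple schema is close to the best case for this approach. The usual costs are cross-shard queries and resharding when the shard count changes, neither of which the sketch addresses.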
Apparent turmoil at EnterpriseDB
EnterpriseDB seems to be facing a string of management departures:
- Bob Zurek, EnterpriseDB’s well-regarded CTO, is gone. (He landed at Infobright, after a stint of independent consulting.)
- Multiple rumors have founder Andy Astor leaving EnterpriseDB, and stepping back to an advisory role. One version has Tuesday, June 16 as Andy’s last day. Update: As of Wednesday, June 17, Andy Astor is no longer listed as being on EnterpriseDB’s management team.
- Fred Holahan, who was briefly VP of Marketing, is not listed on EnterpriseDB’s management team web page. And EnterpriseDB announced a new VP of Marketing and Product Management on May 21.
- Other rumors point to turmoil at EnterpriseDB as well.
And by the way, EnterpriseDB, which used to call itself “the Oracle-compatible database company,” recently licensed out what used to be its core differentiating technology.
Now, this isn’t all bad news. EnterpriseDB’s Oracle-compatibility focus had to be changed anyway. And Fred Holahan was the proximate cause for me writing:
my recent dealings with EnterpriseDB underscore the importance of being VERY careful about counting your fingers after you shake hands with that company,
Still, these aren’t exactly indicators of a company executing on a smooth-running plan.
Categories: EnterpriseDB and Postgres Plus, Open source | 3 Comments |
Daniel Abadi on Kickfire and related subjects
Daniel Abadi has a new blog, whose first post centers around Kickfire. The money quote is (emphasis mine):
In order for me to get excited about Kickfire, I have to ignore Mike Stonebraker’s voice in my head telling me that DBMS hardware companies have been launched many times in the past and ALWAYS fail (the main reasoning is that Moore’s law allows for commodity hardware to catch up in performance, eventually making the proprietary hardware overpriced and irrelevant). But given that Moore’s law is transforming into increased parallelism rather than increased raw speed, maybe hardware DBMS companies can succeed now where they have failed in the past.
Good point.
More generally, Abadi speculates about the market for MySQL-compatible data warehousing. My responses include:
- OF COURSE there are many MySQL users who need to move to a serious analytic DBMS.
- What’s less clear is whether there’s any big advantage to those users in remaining MySQL-compatible when they do move. I’m not sure what MySQL-specific syntax or optimizations they’d have that would be difficult to port to a non-MySQL system.
- It’s nice to see Abadi speaking well of Infobright and its technology.
- To say that Infobright went open source because it was “desperate” is overstated. That said, I don’t think Infobright was on track to prosper without going open source.
- While open source and MySQL go together, an appliance like Kickfire loses many (not all) of the benefits of open source.
- Calpont has indeed never disclosed a customer win. Any year now … (Just kidding, Vogel!)
- In general, seeing Abadi be so favorable toward Vertica competitors adds credibility to the recent Hadoop vs. DBMS paper.
Anyhow, as previously noted, I’m a big Daniel Abadi fan. I look forward to seeing what else he posts in his blog, and am optimistic he’ll live up to or exceed its stated goals.
Categories: Calpont, Columnar database management, Data warehouse appliances, Data warehousing, DBMS product categories, Infobright, Kickfire, MySQL, Open source, Theory and architecture | 2 Comments |
Yet more on MySQL forks and storage engines
The issue of MySQL forks and their possible effect on closed-source storage engine vendors continues to get attention. The underlying question is:
Suppose Oracle wants to make life difficult for third-party storage engine vendors via its incipient control of MySQL? Can the storage engine vendors insulate themselves from this risk by working with a MySQL fork?
Categories: MySQL, Open source, PostgreSQL | 11 Comments |
MySQL forking heats up, but not yet to the benefit of non-GPLed storage engine vendors
Last month, I wrote “This is a REALLY good time to actively strengthen the MySQL forkers,” largely on behalf of closed-source/dual-source MySQL storage engine vendors such as Infobright, Kickfire, Calpont, Tokutek, or ScaleDB. Yesterday, two of my three candidates to lead the effort — namely Monty Widenius/MariaDB/Monty Program AB and Percona — came together to form something called the Open Database Alliance. Details may be found:
- On the Open Database Alliance website
- In a press release
- On Monty Widenius’ blog
- In a Stephen O’Grady blog post based on a discussion with Monty Widenius
- In an ars technica blog post based on a discussion with Monty Program AB’s Kurt von Finck
But there’s no joy for the non-GPLed MySQL storage engine vendors in the early news.
Categories: MySQL, Open source, Theory and architecture | 16 Comments |
MySQL miscellany
For a guy who doesn’t go to the MySQL conference and routinely gets flamed by the MySQL community for being insufficiently adoring of their beloved product, I sure have been putting up a lot of MySQL-related posts recently. Here’s another, zooming through a few different topics.
Categories: MySQL, Open source | 4 Comments |
I don’t see why the GPL would be a major barrier to a useful MySQL fork
I posted suggesting that substantial elements of the MySQL community should throw their weight behind MySQL forks. Mike Olson of Cloudera helpfully pointed out, on Twitter and by email, how the GPL could appear to stand in the way of such an effort. But would it really?
Currently, any version of the MySQL code that isn’t proprietary to the MySQL company — which is owned by Sun and hence expected to be owned soon by Oracle — is covered by GPL 2. That license states (emphasis mine):
Activities other than copying, distribution and modification are not covered by this License; they are outside its scope. The act of running the Program is not restricted,
Hence it is hard for me to see how the MySQL company could in any way hinder another software vendor from saying “Please buy my software, then go download a free copy of GPLed MySQL and run the two together.”
Categories: MySQL, Open source | 14 Comments |
This week is a REALLY good time to actively strengthen the MySQL forkers
As my first three posts on the Oracle/Sun merger suggested, I think Oracle will do a better job with MySQL product development than Sun has. But of course that’s a low hurdle. And so it leaves open the questions:
What should and/or will be the most widely adopted code lines of MySQL (or other open source DBMS), especially for the types of users and vendors who are engaged with MySQL (as opposed to principal alternative PostgreSQL) today?
As much as I’ve bashed MySQL/MyISAM and MySQL/InnoDB for being low-quality general-purpose DBMS, I’d still hate to see MySQL-based development stall out. There are a number of MySQL engine providers with rather unique technology, that deserve a good front-end partner to build their products with. The high-volume sharding guys deserve the chance to continue down their current path as well. And so does the low-end mass market — although I’m least worried about them, as I can’t imagine any realistic scenario in which Oracle doesn’t offer a version of MySQL fully suited to support 10s of millions of WordPress and Joomla installations.
So far as I can tell, there are only four real and currently active candidates for MySQL code coordinator:
- MySQL itself, soon to be owned by Oracle.
- MariaDB, Monty Widenius’ proposed mainstream MySQL alternative.
- Percona, which seems to have some fans as a superior alternative to vendor-supplied MySQL/InnoDB.
- Drizzle, which is directly focused at web-centric MySQL users who never wanted a robust DBMS in the first place.
Patrick Galbraith and Steven Vaughan-Nichols did good jobs of illustrating the turmoil.
Oracle isn’t a very comfortable partner long term for the storage engine vendors, and Drizzle doesn’t seem to be what they need. So I think that Infobright, Kickfire, Tokutek, Calpont, et al. need to get aligned in a hurry with an outside MySQL provider such as Percona or MariaDB or a newcomer, preferably all with the same one. Yes, I understand that Infobright is getting a lot of marketing help from Sun these days, that Kickfire just got a nice-sounding Sun marketing announcement as well, and so on. But the time to start working toward the inevitable future is now.
And by “now” I mean “right now,” since the MySQL community is at this moment gathered together for its annual conference.
Categories: Infobright, Kickfire, MySQL, Open source | 12 Comments |
MySQL storage engine round-up, with Oracle-related thoughts
Here’s what I know about MySQL storage engines, more or less.
- MySQL with MyISAM is fast. But it’s not transactional. Except for limited purposes, MySQL with MyISAM is a pretty crummy DBMS. Nothing can change that.
- MySQL with InnoDB is transactional. But it’s not particularly fast. MySQL with InnoDB is a pretty mediocre DBMS. Oracle could fix that, at least partially, over time.
- I don’t know much about Falcon, Maria, and so on. With Oracle winding up owning both MySQL and InnoDB, the motivation for those engines (except as Oracle-free forks) might fade.
- Infobright is the most established of the rest. At the moment I’m not recommending it for most industrial-strength uses unless the user is particularly cash-constrained. But I wouldn’t be surprised if that changed soon. A cheap, fast, simple columnar analytic DBMS has a place in the world.
- Kickfire is next in line, offering a hardware-based growth path for users who’ve maxed out on what unaided MySQL can do. It remains to be seen for how many users the desire to keep things simple and stay with MySQL outweighs the desire to avoid custom hardware. Having Oracle salespeople all over those accounts surely wouldn’t help. Kickfire also has a second market, namely OEM vendors who are mainly interested in the superfast chip. That would probably be pretty unaffected by Oracle.
- Tokutek offers a technical proposition that’s hard to match head-on without going the CEP route. Users who care are likely to be MySQL shops. Tokutek’s main challenge is to prove that it sufficiently outdoes competing technical strategies for sufficiently many users. Oracle ownership of MySQL seems pretty irrelevant to Tokutek’s success or failure.
- Calpont offers a kind of lightweight Exadata alternative. With Calpont’s packaging and positioning perennially unclear, it’s difficult to predict the effect of a particular change — i.e., Oracle buying MySQL — in Calpont’s market environment.
- I haven’t heard from transactionally-oriented ScaleDB since I wrote about them a year ago. Apparently, they’re rolling out beta product this week, and their venerable techie guru sadly passed away earlier this month.