February 5, 2013

Comments on Gartner’s 2012 Magic Quadrant for Data Warehouse Database Management Systems — concepts

The 2012 Gartner Magic Quadrant for Data Warehouse Database Management Systems is out. I’ll split my comments into two posts — this one on concepts, and a companion on specific vendor evaluations.

Links:

Let’s start by again noting that I regard Gartner Magic Quadrants as a bad use of good research. On the facts:

When it comes to evaluations, however, the Gartner Data Warehouse DBMS Magic Quadrant doesn’t do as well. My concerns (which overlap) start:

Read more

January 26, 2013

Editing code is easier than writing it

I’ve hacked both the PHP and CSS that drive this website. But if I had to write PHP or CSS from scratch, I literally wouldn’t know how to begin.

Something similar, I suspect, is broadly true of “business analysts.” I don’t know how somebody can be a competent business analyst without being able to generate, read, and edit SQL. (Or some comparable language; e.g., there surely are business analysts who only know MDX.) I would hope they could write basic SELECT statements as well.

But does that mean business analysts are comfortable with the fancy-schmantzy extended SQL that the analytic platform vendors offer them? I would assume that many are but many others are not. And thus I advised such a vendor recently to offer sample code, and lots of it — dozens or hundreds of isolated SQL statements, each of which does a specific task.* A business analyst could reasonably be expected to edit any of those to point them his own actual databases, even though he can’t necessarily be expected to easily write such statements from scratch.  Read more

January 17, 2013

YCSB benchmark notes

Two different vendors recently tried to inflict benchmarks on me. Both were YCSBs, so I decided to look up what the YCSB (Yahoo! Cloud Serving Benchmark) actually is. It turns out that the YCSB:

That actually sounds pretty good, especially the extensibility part;* it’s likely that the YCSB can be useful in a variety of product selection scenarios. Still, as recent examples show, benchmark marketing is an annoying blight upon the database industry.

*With extensibility you can test your own workloads and do your own sensitivity analyses.

A YCSB overview page features links both to the code and to the original explanatory paper. The clearest explanation of the YCSB I found there was: Read more

January 16, 2013

NuoDB marketing mishegas

I must start by apologizing for giving a quote in a press release whose contents I deplore. Unlike occasions on which I’ve posted about inaccurate quotes, in this case the fault is mine. The quote is quite accurate. And NuoDB didn’t mislead me about the release’s contents; I just neglected to ask.

NuoDB evidently subscribes to the marketing fallacy:

But to my taste, NuoDB’s worst travesty is not the deafening drumroll before launch (I asked off their mailing list months before), nor the claim that NuoDB’s launch would be a “big day” for the database industry (annoying but ordinary hype), nor the emergent flock of birds foofarah, nor even NuoDB’s overwrought benchmark marketing (distressingly many vendors do that).

Rather, I think NuoDB’s greatest marketing offense to date is its Codd-imitating “12 rules” for cloud database management. Read more

January 15, 2013

Tokutek update

Alternate title: TokuDB updates 🙂

Now that I’ve addressed some new NewSQL entrants, namely NuoDB and GenieDB, it’s time to circle back to some more established ones. First up are my clients at Tokutek, about whom I recently wrote:

Tokutek turns a performance argument into a functionality one. In particular, Tokutek claims that TokuDB does a much better job than alternatives of making it practical for you to update indexes at OLTP speeds. Hence, it claims to do a much better job than alternatives of making it practical for you to write and execute queries that only make sense when indexes (or other analytic performance boosts) are in place.

That’s all been true since I first wrote about Tokutek and TokuDB in 2009. However, TokuDB’s technical details have changed. In particular, Tokutek has deemphasized the ideas that:

Rather, Tokutek’s new focus for getting the same benefits is to provide a separate buffer for each node of a b-tree. In essence, Tokutek is taking the usual “big blocks are better” story and extending it to indexes. TokuDB also uses block-level compression. Notes on that include: Read more

January 12, 2013

Introduction to NuoDB

NuoDB has an interesting NewSQL story. NuoDB’s core design goals seem to be:

Read more

January 7, 2013

Introduction to GenieDB

GenieDB is one of the newer and smaller NewSQL companies. GenieDB’s story is focused on wide-area replication and uptime, coupled to claims about ease and the associated low TCO (Total Cost of Ownership).

GenieDB is in my same family of clients as Cirro.

The GenieDB product is more interesting if we conflate the existing GenieDB Version 1 and a soon-forthcoming (mid-year or so) Version 2. On that basis:

The heart of the GenieDB story is probably wide-area replication. Specifics there include:  Read more

January 5, 2013

NewSQL thoughts

I plan to write about several NewSQL vendors soon, but first here’s an overview post. Like “NoSQL”, the term “NewSQL” has an identifiable, recent coiner — Matt Aslett in 2011 — yet a somewhat fluid meaning. Wikipedia suggests that NewSQL comprises three things:

I think that’s a pretty good working definition, and will likely remain one unless or until:

To date, NewSQL adoption has been limited.

That said, the problem may lie more on the supply side than in demand. Developing a competitive SQL DBMS turns out to be harder than developing something in the NoSQL state of the art.

Read more

January 5, 2013

Data(base) virtualization — a terminological mess

Data/database virtualization seems to be a hot subject right now, and vendors of a broad variety of different technologies are all claiming to be in the space. A terminological mess has ensued, as Monash’s First and Third Laws of Commercial Semantics are borne out in spades.

If something is like “virtualization”, then it should resemble hypervisors such as VMware. To me:

Anything that claims to be “like virtualization” should be viewed in that light. Read more

December 13, 2012

Spark, Shark, and RDDs — technology notes

Spark and Shark are interesting alternatives to MapReduce and Hive. At a high level:

The key concept here seems to be the RDD. Any one RDD:

Otherwise, there’s a lot of flexibility; an RDD can be a set of tuples, a collection of XML documents, or whatever other reasonable kind of dataset you want. And I gather that:

Read more

← Previous PageNext Page →

Feed: DBMS (database management system), DW (data warehousing), BI (business intelligence), and analytics technology Subscribe to the Monash Research feed via RSS or email:

Login

Search our blogs and white papers

Monash Research blogs

User consulting

Building a short list? Refining your strategic plan? We can help.

Vendor advisory

We tell vendors what's happening -- and, more important, what they should do about it.

Monash Research highlights

Learn about white papers, webcasts, and blog highlights, by RSS or email.