Parallelization

Analysis of issues in parallel computing, especially parallelized database management. Related subjects include:

October 23, 2011

Transparent relational OLTP scale-out

There’s a perception that, if you want (relatively) worry-free database scale-out, you need a non-relational/NoSQL strategy. That perception is false. In the analytic case it’s completely ridiculous, as has been demonstrated by Teradata, Vertica, Netezza, and various other MPP (Massively Parallel Processing) analytic DBMS vendors. And now it’s false for short-request/OLTP (OnLine Transaction Processing) use cases as well.

My favorite relational OLTP scale-out choice these days is the SchoonerSQL/dbShards partnership. Schooner Information Technology (SchoonerSQL) and Code Futures (dbShards) are young, small companies, but I’m not too concerned about that, because the APIs they want you to write to are just MySQL’s. The main scenarios in which I can see them failing are ones in which they are competitively leapfrogged, either by other small competitors – e.g. ScaleBase, Akiban, TokuDB, or ScaleDB — or by Oracle/MySQL itself. While that could suck for my clients Schooner and Code Futures, it would still provide users relying on MySQL scale-out with one or more good product alternatives.

Relying on non-MySQL NewSQL startups, by way of contrast, would leave me somewhat more concerned. (However, if their code is open sourced. you have at least some vendor-failure protection.) And big-vendor scale-out offerings, such as Oracle RAC or DB2 pureScale, may be more complex to deploy and administer than the MySQL and NewSQL alternatives.

October 23, 2011

Schooner pivots further

Schooner Information Technology started out as a complete-system MySQL appliance vendor. Then Schooner went software-only, but continued to brag about great performance in configurations with solid-state drives. Now Schooner has pivoted further, and is emphasizing high availability, clustered performance, and other hardware-agnostic OLTP (OnLine Transaction Processing) features. Fortunately, Schooner has some interesting stuff in those areas to talk about.

The short form of the SchoonerSQL (as Schooner’s product is now called) story goes roughly like this:

Read more

October 11, 2011

IBM is buying parallelization expert Platform Computing

IBM is acquiring Platform Computing, a company with which I had one briefing, last August. Quick background includes:  Read more

October 4, 2011

Cloudera versus Hortonworks

A few weeks ago I wrote:

The other big part of Hortonworks’ story is the claim that it holds the axe in Apache Hadoop development.

and

… just how dominant Hortonworks really is in core Hadoop development is a bit unclear. Meanwhile, Cloudera people seem to be leading a number of Hadoop companion or sub-projects, including the first two I can think of that relate to Hadoop integration or connectivity, namely Sqoop and Flume. So I’m not persuaded that the “we know this stuff better” part of the Hortonworks partnering story really holds up.

Now Mike Olson — CEO of my client Cloudera — has posted his analysis of the matter, in response to an earlier Hortonworks post asserting its claims. In essence, Mike argues:

Read more

September 30, 2011

Oracle NoSQL is unlikely to be a big deal

Alex Williams noticed that there will be a NoSQL session at Oracle OpenWorld next week, and is wondering whether this will be a big deal. I think it won’t be.

There really are three major points to NoSQL.

Oracle can address the latter two points as aggressively as it wishes via MySQL. It so happens I would generally recommend MySQL enhanced by dbShards, Schooner, and/or dbShards/Schooner, rather than Oracle-only MySQL … but that’s a detail. In some form or other, Oracle’s MySQL is a huge player in the scale-out, open source, short-request database management market.

So that leaves us with dynamic schemas. Oracle has at least four different sets of technology in that area:

If Oracle is now refreshing and rebranding one or more of these as “NoSQL”, there’s no reason to view that as a big deal at all.

*That’s Mike Olson’s former company, if you’re keeping score at home.

September 23, 2011

Some notes on Hadoop (mainly) and appliances

1. EMC Greenplum has evolved its appliance product line. As I read that, the latest announcement boils down to saying that you can neatly network together various Greenplum appliances in quarter-rack increments. If you take a quarter rack each of four different things, then Greenplum says “Hooray! Our appliance is all-in-one!” Big whoop.

2. That said, the Hadoop part of EMC ‘s story is based on MapR, which so far as I can tell is actually a pretty good Hadoop implementation. More precisely, MapR makes strong claims about performance and so on, and Apache Hadoop folks don’t reply “MapR is full of &#$!” Rather, they say “We’re going to close the gap with MapR a lot faster than the MapR folks like to think — and by the way, guys, thanks for the butt-kick.” A lot more precision about MapR may be found in this M. C. Srivas SlideShare.

3. On its latest earnings call, Oracle clearly said it would introduce a Hadoop appliance, versus just hinting at a Hadoop appliance the prior quarter. The money quote was:  Read more

September 19, 2011

Are there any remaining reasons to put new OLTP applications on disk?

Once again, I’m working with an OLTP SaaS vendor client on the architecture for their next-generation system. Parameters include:

So I’m leaning to saying:   Read more

September 12, 2011

Hadoop notes

I visited California recently, and chatted with numerous companies involved in Hadoop — Cloudera, Hortonworks, MapR, DataStax, Datameer, and more. I’ll defer further Hadoop technical discussions for now — my target to restart them is later this month — but that still leaves some other issues to discuss, namely adoption and partnering.

The total number of enterprises in the world paying subscription and license fees that they would regard as being for “Hadoop or something Hadoop-related” probably is not much over 100 right now, but I’d expect to see pretty rapid growth. Beyond that, let’s divide customers into three groups:

Hadoop vendors, in different mixes, claim to be doing well in all three segments. Even so, almost all use cases involve some kind of machine-generated data, with one exception being a credit card vendor crunching a large database of transaction details. Multiple kinds of machine-generated data come into play — web/network/mobile device logs, financial trade data, scientific/experimental data, and more. In particular, pharmaceutical research got some mentions, which makes sense, in that it’s one area of scientific research that actually enjoys fat for-profit research budgets.

Read more

September 11, 2011

“Big data” has jumped the shark

I frequently observe that no market categorization is ever precise and, in particular, that bad jargon drives out good. But when it comes to “big data” or “big data analytics”, matters are worse yet. The definitive shark-jumping moment may be Forrester Research’s Brian Hopkins’ claim that:

… typical data warehouse appliances, even if they are petascale and parallel, [are] NOT big data solutions.

Nonsense almost as bad can be found in other venues.

Forrester seems to claim that “big data” is characterized by Volume, Velocity, Variety, and Variability. Others, less alliteratively-inclined, might put Complexity in the mix. So far, so good; after all, much of what people call “big data” is collections of disparate data streams, all collected somewhere in a big bit bucket. But when people start defining “big data” to include Variety and/or Variability, they’ve gone too far.

Read more

August 26, 2011

Virtual data marts in Sybase IQ

I made a few remarks about Sybase IQ 15.3 when it became generally available in July. Now that I’ve had a current briefing, I’ll make a few more.

The key enhancement in Sybase IQ 15.3 is distributed query — what others might call parallel query — aka PlexQ. A Sybase IQ query can now be distributed among many nodes, all talking to the same SAN (Storage-Area Network). Any Sybase IQ node can take the responsibility of being the “leader” for that particular query.

In itself, this isn’t that impressive; all the same things could have been said about pre-Exadata Oracle.* But PlexQ goes somewhat further than just removing a bottleneck from Sybase IQ. Notably, Sybase has rolled out a virtual data mart capability. Highlights of the Sybase IQ virtual data mart story include:   Read more

← Previous PageNext Page →

Feed: DBMS (database management system), DW (data warehousing), BI (business intelligence), and analytics technology Subscribe to the Monash Research feed via RSS or email:

Login

Search our blogs and white papers

Monash Research blogs

User consulting

Building a short list? Refining your strategic plan? We can help.

Vendor advisory

We tell vendors what's happening -- and, more important, what they should do about it.

Monash Research highlights

Learn about white papers, webcasts, and blog highlights, by RSS or email.