MapReduce tidbits
I’ve never had children, and so have never had to supervise squabbling siblings, each accusing the other of selfishness and insufficient sharing. Perhaps the MapReduce vendors are a form of karmic payback. Be that as it may, my client Cloudera has organized Hadoop World on October 2 in New York, and my other client Aster Data is hosting a MapReduce-centric Big Data Summit the night before, at the same venue. Even if you don’t go, both conferences’ agenda pages offer a peek into what’s going on in MapReduce applications. I’m not going either, but even so I hope to post an overview of MapReduce uses after the conferences serve to publicize some of them.
Even better, I plan to hold a couple of webinars on MapReduce, the first at 10 am (blech) and the second at 1 pm Eastern time on October 15. They’re sponsored by Aster Data, and so will have a strong SQL/MapReduce orientation.
In connection with its conference, Aster is introducing an nCluster-Hadoop connector — i.e., a loader from HDFS (Hadoop Distributed File System) implemented in SQL/MapReduce. In particular:
- While Aster nCluster has a solid parallel load capability from SQL sources, I believe this is the first time Aster is doing parallel load from a source that doesn’t talk to it in SQL. (Presumably, an alternative would be for the Hadoop cluster to run Hive.) I don’t know how this compares to, say, Greenplum’s implementation of Scatter/Gather.
- Unlike other parallel loading in Aster nCluster, the nCluster-Hadoop connector bypasses the loader nodes and goes straight to the worker nodes.
- This is not a load utility; it’s just a SQL function.
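To make the "straight to the worker nodes" point concrete, here is a toy Python sketch of the general idea: input files are divided among workers, and each worker reads its own share directly, with no loader node in the path. It is emphatically not Aster's implementation or API. The names `assign_splits`, `load_direct`, and `FAKE_HDFS` are hypothetical, and the round-robin assignment is an assumption made purely for illustration.

```python
from collections import defaultdict

def assign_splits(splits, workers):
    """Round-robin assignment of input splits to workers (an assumed policy)."""
    plan = defaultdict(list)
    for i, split in enumerate(splits):
        plan[workers[i % len(workers)]].append(split)
    return dict(plan)

def load_direct(plan, read_split):
    """Each worker reads only its own splits; nothing funnels through a loader."""
    return {
        worker: [row for split in splits for row in read_split(split)]
        for worker, splits in plan.items()
    }

# Hypothetical stand-in for files sitting in a distributed file system.
FAKE_HDFS = {
    "part-00000": ["a,1", "b,2"],
    "part-00001": ["c,3"],
    "part-00002": ["d,4", "e,5"],
}

plan = assign_splits(sorted(FAKE_HDFS), ["worker-1", "worker-2"])
loaded = load_direct(plan, FAKE_HDFS.__getitem__)
```

In this sketch the only shared step is computing the plan. The reads themselves are independent per worker, which is what lets a load of this shape run in parallel.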
Meanwhile, each of SenSage and Splunk told me last week that they’ve been doing what amounts to MapReduce under the covers since their respective Day 1s. Who knew? (More on each company later.)
And as previously noted, Netezza and Teradata are doing MapReduce too. One of the exhibit-hall videos at Netezza’s Enzee Universe conference tour mentioned MapReduce, but I’ll confess to never having stopped to check what it actually was saying.
Comments
7 Responses to “MapReduce tidbits”
I didn’t realize NZ, Tera and SenSage were doing MR as well. I really look forward to your webinars.
This is an area in which information/education channels have been lacking IMHO. Yes, there’s been plenty of online material and papers on it, but no centralized/big picture view that common mortals can wrap their heads around.
That’s one of the reasons I wanted to address it in my latest blog post, from a much higher altitude than I’m sure you will.
Thanks for bringing this forward in digestible ways!
It would be interesting to comment on real world use cases for Hadoop/MR for enterprise *structured* data. Google, Facebook and Yahoo have unique data requirements, but that’s 3 super high volume sites. OK, so Hadoop is great for them. What about everyone else?
Are folks using it for call center monitoring? Sales territory assignment? Incentive comp calculations? Mapping journal lines to sub-ledgers? Fraud detection? Time-phased correlations?
If the answer is “well, no,” then why is it reasonable for database vendors to invest so many dev and mktg resources in such an extremely niche technology?
Fraud — absolutely.
Apps on which other technologies do a really great job — not so much.
[…] gaining traction, especially but by no means only in the form of Hadoop. In the aftermath of Hadoop World, Jeff Hammerbacher of Cloudera walked me quickly through 25 customers he pulled from […]
Thanks for the update. Omer at Vertica also clued me in. Hadoop embraces and extends the power of a relational db, not really opposing camps. Now, I need to get my hands down and dirty with it!
[…] the MapReduce world converged on New York late this week, Senior VP of Cloud Computing Shelton Shugar has been telling anybody who will listen that Yahoo is […]
[…] Aster Data’s Big Data Summit and Cloudera’s Hadoop Summit being held tonight and tomorrow, respectively, at the same NYC venue, large-scale analytics are as hot today (literally, Oct. 1, 2009) as ever. Particularly hot today […]