September 24, 2012
Notes on Hadoop adoption
I successfully resisted telephone consulting while on vacation, but I did do some by email. One was on the oft-recurring subject of Hadoop adoption. I think it’s OK to adapt some of that into a post.
Notes on past and current Hadoop adoption include:
- Enterprise Hadoop adoption is for experimental uses or departmental production (as opposed to serious enterprise-level production). Indeed, it’s rather tough to disambiguate those two. If an enterprise uses Hadoop to search for new insights and gets a few, is that an experiment that went well, or is it production?
- One of the core internet-business use cases for Hadoop is a many-step ETL, ELT, and data refinement pipeline, with Hadoop executing some or many of the steps. But I don’t think that’s in production at many enterprises yet, except in the usual forward-leaning sectors of financial services and (we’re all guessing) national intelligence.
- In terms of industry adoption:
  - Financial services on the investment/trading side are all over Hadoop, just as they’re all over any technology. Ditto national intelligence, one thinks.
  - Consumer financial services, especially credit card, are giving Hadoop a try too, for marketing and/or anti-fraud.
  - I’m sure there’s some telecom usage, but I’m hearing of less than I thought I would. Perhaps this is because telcos have spent so long optimizing their data into short, structured records.
  - Whatever consumer financial services firms do, retailers do too, albeit with smaller budgets.
Thoughts on how Hadoop adoption will look going forward include:
- Enterprise adoption of Hadoop for ETL/ELT/data refinement could explode after more software vendors offer support for it.
- The Hadoop community is trying hard to make it easy(ier) to manage multiple Hadoop clusters as one (preferably with elasticity among them). That could lead to more enterprise-level Hadoop deployments.
- There will be very few cases of Hadoop replacing existing relational data warehouses. But Hadoop could get a sizable share of new opportunities that might otherwise go to scale-out analytic RDBMS.
- I think Hadoop will do a good job of subsuming some of the newer efforts that might otherwise threaten to replace it. (I’m not sure whether Dremel/Drill is a major example of same, but it illustrates the point in any case.)
- If data starts out in the cloud, then the right place to do Hadoop on it may be in the same cloud.
- Hadoop appliances have dubious value for customers; everybody has or should have similar software, and nobody’s adding much value in their hardware designs. Even so, Hadoop gear is basically cheap, so overpaying for it isn’t a big deal. Thus, an enduring Hadoop appliance market may emerge.
Categories: Cloud computing, Data warehouse appliances, Data warehousing, EAI, EII, ETL, ELT, ETLT, Hadoop, Investment research and trading, Telecommunications
Comments
3 Responses to “Notes on Hadoop adoption”
Curt, I think it can be called “production” when it moves into regular repeated use – even when subsequent versions differ considerably. Successful experiments answer a question once; production systems are the gift that keeps on giving.
Simple but to the point as usual. I could not agree more about the comment on the cloud. If only IT could stop blindly applying the same security processes to every problem. Just today I spent some time with a client where IT flat out refused to discuss doing a proof of concept in the cloud because it would raise the eyebrows of their security people. The fact that the data came from the public Internet did not seem to make any difference.
[…] I’m not a big fan of Hadoop appliances. It’s not always obvious how appliance packaging adds enough value to make up for sacrifices […]