Is the enterprise data warehouse a myth?
An enterprise data warehouse should:
- Manage data to high standards of accuracy, consistency, cleanliness, clarity, and security.
- Manage all the data in your organization.
Pick ONE.
There’s little to dislike in the enterprise data warehouse dream, as represented (for example) in this 2004 Teradata Magazine article. But in a world where ever more data comes in from ever more sources – and is needed ever faster – it simply isn’t realistic to expect that all an enterprise’s data will be vetted, organized, and managed to the highest of standards.
This is a core premise of Greenplum’s Enterprise Data Cloud (EDC)/Chorus marketing initiative, and in that respect Greenplum is correct.
If the EDW is a great idea that can never be 100% implemented, what should you do? At conventional enterprises, the answer is pretty obvious: Manage some of your data to enterprise data warehouse standards, but not all of it. Specifically, your highest-value data should be in something that looks like a classic enterprise data warehouse, and your lower-value data shouldn’t.
Of course, if you’re a data mart outsourcer or other analytic service provider, whose data is about your customers’ businesses rather than your own, and whose business is managing your customers’ data, this may not apply to you. But otherwise it’s a position with many supporting arguments, including:
- Financial reporting, compliance, and other legitimate concerns introduce rigidity into data models. This increases the cost and reduces the speed of getting data into enterprise data warehouses.
- Data governance procedures imposed for any other business purpose have the same effect. What’s deemed necessary for enterprise data warehouses can be fatal to timely analytics.
- The highest-value data typically comes from transactional systems, such as order entry or sales contact management. So it starts out with a degree of governance that, say, web log files may never enjoy.
- In some enterprises, it is affordable or even cost-effective to manage your highest-value data in your favorite big-brand DBMS, but necessary to manage most of your data in something with lower TCO (Total Cost of Ownership). Big-brand OLTP DBMS are often better (or at least less bad) at managing enterprise data warehouses than they are at running data mart workloads.
- At certain enterprise and database sizes, it may indeed make sense to run what amounts to an enterprise data warehouse out of the same database instance that does OLTP, while putting larger data sets into more cost-effective data marts. A trend to “operational BI” may actually make that option more appealing going forward than it has been in the past.
- And finally, there’s the empirical fact that not one really large enterprise on the whole planet has a true, perfectly comprehensive enterprise data warehouse. At least, I’ve never heard of one.
Related links
- Even Teradata doesn’t push an EDW-only strategy any more
- I agreed when Greenplum first started pushing the EDC idea that something like it would be the future of data marts
Comments
8 Responses to “Is the enterprise data warehouse a myth?”
The EDW challenge is that “high-value” data is transformed from big “low-value” data. Granted, ETL is a growing bottleneck that needs to be solved, but let’s not throw the baby out with the bath water. We need disruptive breakthroughs in streaming data through the transformation process. Streaming analytics is a start, but why not stream the ELT to increase the value of all EDW data?
[…] For as long as we’ve had the concept of database management, there’s been a debate as to whether it is realistic for large enterprises to have a single Grand Unified Enterprise Storehouse Of All Information, or whether database proliferation actually makes sense. This argument has been particularly intense in the area of data warehouse/data marts. I’m generally on the side of data mart proliferation. […]
[…] pro-EDW arguments, in more detail than I ever have. So this feels like a good time to revisit my answer to the question of the EDW’s role, whose money quote was: At conventional enterprises … Manage some of your data to enterprise […]
[…] pointed out last year that the grand central enterprise data warehouse couldn’t happen; the post started: An enterprise data warehouse […]
[…] All that plugs into a larger project I was working on before my family issues came crashing in. The enterprise data warehouse is a myth, and that’s just the first reason that the old EDW vs. data mart bifurcation is grossly […]
[…] As far back as 2010, industry analyst Curt Monash put it succinctly as “At conventional enterprises, the answer is pretty obvious: Manage some of your data to enterpris…” […]