July 24, 2008
Other early coverage of Microsoft/DATAllegro
- Here’s the official press release on DATAllegro’s site, and Microsoft’s.
- Doug Henschen of Intelligent Enterprise has a good article. He got quotes from Microsoft claiming that SQL Server on its own would be able to handle 10s of terabytes of data in the next release, but DATAllegro was needed to get up to the 100s of terabytes. That said, the quotes don’t say whether that’s user data or total disk usage — the latter frankly seems more plausible.
- James Kobielus of Forrester has a long post on the Microsoft/DATAllegro deal, emphasizing product packaging issues and glossing over technological differentiators. (Edit: The post seems down as of Friday midday.)
- This is a few weeks old, but Kevin Closson is extremely skeptical of some of DATAllegro’s technical claims. (Not that it matters much if he’s right — more nodes = more throughput, no matter how much Oracle folks rant.)
- Eric Lai of Computerworld gets it right.
- Larry Dignan thinks the acquisition is part of an overall strong Microsoft enterprise push.
- William McKnight thinks Microsoft usually does a good job of integrating acquisitions.
- DATAllegro CEO Stuart Frost is happy.
- David Hunter thinks Microsoft will blithely continue with DATAllegro’s limited-hardware-support strategy. He’s almost certainly wrong.
- Philip Howard says almost nothing I agree with, although I can’t argue with the part that reads:
“Conversely, it’s bad news for Ingres, bad news for Oracle, bad news for IBM, bad news for Teradata and bad news for HP, all for obvious reasons. As for the other appliance vendors: they will not be too happy either. In particular, we now have to consider who can survive on their own, who might be acquired, who might do the acquiring, and who is going to disappear.”
Comments
15 Responses to “Other early coverage of Microsoft/DATAllegro”
The link to “long post” by James Kobielus of Forrester is broken.
Thanks, Nair. It’s broken even when I Google to try to find the “right” one.
CAM
Curt,
Phil Howard mentions the commoditization of large-scale data warehousing. From my point of view, that happened a long time ago. It’s the end of history. There are no more MOLAP/ROLAP, 3NF/Star or Data Mart/EDW debates. In fact, we hardly ever hear anyone debate SMP vs MPP or MPP with SMP anymore. The “Single Version of the Truth” myth has gone largely unchallenged.
At the practice level, a handful of rules of thumb substitute for creativity, and best practices have overtaken the DW industry. Implementations are formulaic and follow guidelines that are practically canonical. Data governance has scoped out its turf and pushed DW even further into the IT camp (despite their veneer of being “business driven”). It’s boring.
It amazes me that a second-tier database provider (Microsoft) acquires a tiny vendor like DATAllegro, and the blogs are disgorging all sorts of predictions about the industry. I’ll wait for Microsoft’s next release (SQL Server 2013?) and withhold judgment. In the meantime, there is a burgeoning need for people to start rethinking the whole tired data warehouse proposition, and all of this ink is spent on who’s zooming who.
-Neil
Neil,
I disagree with most of that, obviously.
When it is possible with multiple market-leading products to store ALL your data, run ANY query against it, do all that with only minor administrative efforts, and not worry about the cost of any part of the process — THEN data warehousing will be a commodity.
We’re not at that point yet.
CAM
Curt,
We’re talking about two different things. What you’re describing is the commoditization (actually, the perfection) of the data warehousing platform. I was referring to data warehousing practice, which has become mind-numbingly commoditized, hence, my railing against all of this discussion of the industry and products instead of the practice (and results) of DW/BI.
I believe that most existing data warehouses are underpowered and far short of their potential and that many of them need remediation at best, scrap and replacement at worst as they cannot provide the speed, scale, throughput or breadth needed today. That’s good news for the MPP guys, I guess, but that’s more your call than mine.
-NR
Neil,
Is it really data warehousing in any form that you’re complaining about, or is it BI?
CAM
Microsoft buys DATAllegro…
In my view, a fairly important event has so far passed somewhat unnoticed on the Russian internet: Microsoft announced…
Curt,
I don’t know how to answer that question “in any form.” Perhaps there are peripheral forms I’m not aware of. But I think I’ve made myself abundantly clear. The practice of data warehousing is largely formulaic. I tried to draw the distinction between the industry and the practice in response to Phil’s comments. Unfortunately, blogs that discuss the industry, not the practice, get undue attention. But then, so do reality shows.
BI is its own big mess; I shouldn’t have commingled the two.
Also, I guess I take some offense at your use of the term “complaining.”
-NR
Neil,
Considering that your comments seem to be highly critical of, and also to mischaracterize, blogs like this one, I’ll live at peace if you take some offense at my reply.
CAM
Curt,
So I’ll interpret that as indicating you have no argument. Data warehousing is NOT becoming commoditized…because you said so.
I’ll look elsewhere for enlightenment.
-NR
Neil,
First you say that nobody’s debating SMP vs. MPP any more, even though that’s a major topic of this blog, which sort of equates to you calling me a nobody. Then you say you’re not really talking about the technology after all, but rather “practice”, whatever that is that’s separate from BI (and from things like data governance, which if I recall correctly you were also objecting to).
If you took the trouble to be remotely clear, I might or might not agree with your opinion. As matters stand, you’re right — I have no argument with what you’re saying. In fact, I’ve wasted altogether too much time trying to puzzle it out.
CAM
Sometimes I just have to laugh at all these discussions that are so centered on volume, the “my database is bigger than yours” arguments. A lot of the time, the limits you hit have very little to do with size. Workload complexity is just as important a factor. I have a Teradata database that is 4 TB in size, and we have spent an inordinate amount of time trying to tune this thing. Some days we are lucky if our queries don’t just time out. And trust me, we have had every expert on the planet look at this thing and tune it. So I am not wetting my pants just yet because Microsoft is going MPP. I have seen DATAllegro in action, and yes, it can load data extremely fast, but the software is so buggy and unstable that it couldn’t even do some basic SQL well without crashing a few times. Just changing the SQL Server architecture to MPP is not going to automatically elevate them into the first-tier category. Being able to load data x times faster doesn’t mean a thing unless you have an excellent optimizer and super-fast query performance, and, as I have already noted, even top-tier vendors like Teradata have trouble with sophisticated workloads.
Sanjay,
I agree with much more of that than I disagree with. Most of the Very Large warehouses tend to actually be data marts, simpler than some of the smaller ones. So being able to perform decently at a low price on a typical 100 TB+ “data warehouse” is no proof of ability to perform well on all 2-5 TB warehouses.
I don’t know of any data warehouse product that seems to truly be optimized for general, complex queries. Instead, they use one or both of two main techniques:
A. Making limiting assumptions about the nature of the queries, and exploiting those for great performance.
B. Striving to be superfast on the more elementary queries that complex ones ultimately decompose to, and hoping this ripples through to decent performance on everything the user needs.
Curt,
I agree with all of that, including points A and B above. In fact, I wrote a white paper about it last year. I do believe, however, that some vendors are working, and have been working, very hard to come up with more durable optimizers to deal with mixed workloads. In the meantime, they must, as you say, make limiting assumptions. I’m not sure there is an alternative, really; it’s just a question of the degree of limiting involved.
Sanjay’s point is well-taken. M-P’ing SQL Server isn’t going to address this problem. I suspect there is more to this than we’ve heard so far. When IBM brought out the SP2, we were pretty hopeful since DB2 had a pretty good optimizer (compared to its competition) and so did Informix, but we never got the performance we hoped for. Far from it.
Any fool can build a 100 TB data warehouse, and many have. It isn’t hard to throw a hundred processors at it and get “linear” scale-up, so long as you don’t run more than one query at a time. But getting something like that to perform in a real operation, with bursty, bunched-up and mixed query types, is an NP-hard problem.
[…] In my view, a fairly important event has so far passed somewhat unnoticed on the Russian internet: Microsoft announced the purchase of DATAllegro. The original press release can be read here: Microsoft to Acquire DATAllegro (Leaders in data warehousing team to provide large-scale business intelligence solutions.) The first reaction to this event can be read here: Other early coverage of Microsoft/DATAllegro. […]