Discussion of Google’s data management technologies MapReduce and BigTable. Related subjects include:
- MapReduce
- (in Text Technologies) Google in search
- (in The Monash Report) Google Apps
More patent nonsense — Google MapReduce
Google recently received a patent for MapReduce. The first and most general claim is (formatting and emphasis mine): Read more
Categories: Google, MapReduce, Parallelization | 17 Comments |
Clearing up MapReduce confusion, yet again
I’m frustrated by a constant need — or at least urge 🙂 — to correct myths and errors about MapReduce. Let’s try one more time: Read more
Categories: Analytic technologies, Aster Data, Cloudera, Data warehousing, Google, Hadoop, MapReduce, SenSage, Splunk | 8 Comments |
Three big myths about MapReduce
Once again, I find myself writing and talking a lot about MapReduce. But I suspect that MapReduce-related conversations would go better if we overcame three fairly common MapReduce myths:
- MapReduce is something very new
- MapReduce involves strict adherence to the Map-Reduce programming paradigm
- MapReduce is a single technology
Categories: Analytic technologies, Aster Data, Cloudera, Data warehousing, Google, Greenplum, Hadoop, Log analysis, MapReduce, Michael Stonebraker, Parallelization, Web analytics | 11 Comments |
Google Fusion Tables
Google has announced an experimental cloud-based data management system called Fusion Tables. A press article and Slashdot thread ensued, based on some bizarre-sounding analyst quotes that I will not attempt to parse.
What Fusion Tables really seems to be is a spreadsheet without the formulae. That is, it’s a place to dump data in a grid of cells, comment on it, version it, and do elementary data manipulation. This could, I guess, be useful as an alternative to traditional RDBMS — assuming, of course, that you want to have a row-by-row debate about 100 megs of data.
Seriously, while Google Fusion Tables bears some vague resemblance to what I’m thinking about for the future of both business intelligence and data marts, it sounds as if it has a long way to go before it’s something most enterprises should spend time looking at.
Categories: Analytic technologies, Google, Theory and architecture | 1 Comment |
Reinventing business intelligence
I’ve felt for quite a while that business intelligence tools are due for a revolution. But I’ve found the subject daunting to write about because — well, because it’s so multifaceted and big. So to break that logjam, here are some thoughts on the reinvention of business intelligence technology, with no pretense of being in any way comprehensive.
Natural language and classic science fiction
Actually, there’s a pretty well-known example of BI near-perfection — the Star Trek computers, usually voiced by the late Majel Barrett Roddenberry. They didn’t have a big role in the recent movie, which was so fast-paced nobody had time to analyze very much, but were a big part of the Star Trek universe overall. Star Trek’s computers integrated analytics, operations, and authentication, all with a great natural language/voice interface and visual displays. That example is at the heart of a 1998 article on natural language recognition I just re-posted.
As for reality: For decades, dating back at least to Artificial Intelligence Corporation’s Intellect, there have been offerings that provided “natural language” command, control, and query against otherwise fairly ordinary analytic tools. Such efforts have generally fizzled, for reasons outlined at the link above. Wolfram Alpha is the latest try; fortunately for its prospects, natural language is really only a small part of the Wolfram Alpha story.
A second theme has more recently emerged: using text indexing to get at data more flexibly than a relational schema would normally allow, either by searching on data values themselves (stressed by Attivio) or by searching on the definitions of pre-built reports (the Google OneBox story). SAP’s Explorer is the latest such effort, but I find Doug Henschen’s skepticism about SAP Explorer more persuasive than Cindi Howson’s cautiously favorable view. Partly that’s because I know SAP (and Business Objects); partly it’s because of difficulties such as those I already noted.
Flexibility and data exploration
It’s a truism that each generation of dashboard-like technology fails because it’s too inflexible. Users are shown the information that will provide them with the most insight. They appreciate it at first. But eventually it’s old hat, and when they want to do something new, the baked-in data model doesn’t support it.
The latest attempts to overcome this problem lie in two overlapping trends — cool data exploration/visualization tools, and in-memory analytics. Read more
Categories: Analytic technologies, Business intelligence, Google, Memory-centric data management, Microsoft and SQL*Server, SAP AG | 19 Comments |
High-end MySQL use
To a large extent, MySQL lives in two different alternate universes from most other DBMS. One is for low-end, simple database applications. For example, of all the DBMS I write about, MySQL is the one I actually use in my own business — because MySQL sits underneath WordPress, and WordPress is what runs my blogs. My largest database (the one for DBMS2) contains 12 megabytes of data in 11 tables, none of which has yet reached 5000 rows in size. Read more
Categories: Google, MySQL, OLTP, Open source, Parallelization | 1 Comment |
Google has thousands of internal data formats, mostly simple ones
In connection with the release of Protocol Buffers, Kenton Varda of Google wrote: Read more
Categories: Data integration and middleware, Google | 2 Comments |
More Google reliability woes
Google’s reliability issues are ever worse. As I previously pointed out, this is evidence against the notion that MapReduce is a replacement for established DBMS.
Categories: Google | 2 Comments |
Oracle/Google/Apple merger – wow! Just — wow.
If rumors are to be believed, Oracle, Google, and Apple are close to agreeing on a mega-blockbuster three-way merger. Just the personality combinations are amazing, starting with close friends Jobs and Ellison (perhaps the two greatest entrepreneurs of Silicon Valley, and both with impeccable taste) and the traditionally sloppy, generation-younger Page and Brin. But let’s jump straight to some of the possible business and technology ramifications.
The Macintosh could become a serious Windows competitor. The Mac is quietly making an enterprise comeback anyway. Business intelligence, dashboards, and the like are constantly in the throes of UI re-invention. (I have some articles in the works about why the industry never seems to get them right, but in the meantime here is my UI overview article from last year.)
Whole new generations of personal/pervasive computing devices could evolve. Apple obviously is a huge personal-electronic-device player with the iPod and upcoming iPhone. Google has looked into cell phones as well. Designing cool devices will not be a problem. The issue is making them integrate really well with enterprise systems. I favor speech interfaces, myself.
Enterprise information management could be transformed. Oracle is batting about 0-for-the-decade in search. Google is selling a lot of not-terribly-useful low-end enterprise search boxes. There’s room for both to do a lot better. Ex-Oracle executive Dennis Moore has some good ideas in that regard.
Related link
- Scoble has details on part of the story.
There’s one catch, however: On April 1, rumors generally should not be taken too seriously.
Categories: Google, Oracle | 2 Comments |