RDF and graphs

Analysis of data management technology optimized for RDF-formatted and/or graph data.

September 23, 2008

Oracle spotlights its datatype support

Oracle put out a flurry of press releases today in conjunction with Oracle OpenWorld. One, which was simply positioned as a report on some “mission-critical” customer apps, caught my eye because all four detailed examples involved nonstandard datatypes:

Two Oracle Spatial
One “semantic,” which in Oracle lingo seems to mean — you guessed it — RDF
One DICOM, which seems to be a medical imaging datatype.

Categories: Data types, GIS and geospatial, Oracle, RDF and graphs

3 Comments

August 26, 2008

Known applications of MapReduce

Most of the actual MapReduce applications I’ve heard of fall into a few areas:

Text tokenization, indexing, and search
Creation of other kinds of data structures (e.g., graphs)
Data mining and machine learning

That covers all MapReduce apps I recall hearing about via commercial companies and users, and also includes most of what’s in the two big sources I found online. Read more
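To make the first category concrete, here is a minimal single-process sketch of the map, shuffle, and reduce steps for building an inverted index. It is an illustration only: the function names (`map_phase`, `shuffle`, `reduce_phase`, `build_index`) and the sample documents are hypothetical, and a real MapReduce or Hadoop job would distribute these phases across machines rather than run them in one Python process.

```python
from collections import defaultdict


def map_phase(doc_id, text):
    """Emit one (token, doc_id) pair per whitespace-separated token."""
    for token in text.lower().split():
        yield token, doc_id


def shuffle(pairs):
    """Group emitted values by key, standing in for the framework's shuffle step."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(token, doc_ids):
    """Collapse the postings for one token into a sorted, de-duplicated list."""
    return token, sorted(set(doc_ids))


def build_index(docs):
    """Run map, shuffle, and reduce in sequence over a dict of doc_id -> text."""
    pairs = (pair for doc_id, text in docs.items() for pair in map_phase(doc_id, text))
    return dict(reduce_phase(token, ids) for token, ids in shuffle(pairs).items())


# Hypothetical sample documents.
docs = {1: "RDF stores triples", 2: "graphs and RDF"}
index = build_index(docs)
```

The same three-step shape applies to the other categories; only what the map step emits and what the reduce step aggregates would change.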

Categories: MapReduce, RDF and graphs, Text

16 Comments

February 16, 2008

Mike Stonebraker’s DBMS taxonomy

In a response to my recent five-part series on DBMS diversity, Mike Stonebraker has proposed his own taxonomy of data management technologies over on Vertica’s Database Column blog. (Edit: Some good stuff disappeared when Vertica nuked that blog.)

OLTP DBMSs focused on fast, reliable transaction processing

Analytic/Data Warehouse DBMSs focused on efficient load and ad-hoc query performance

Science DBMSs — after all MatLab does not scale to disk-sized arrays

RDF stores focused on efficiently storing semi-structured data in this format

XML stores focused on semi-structured data in this format

Search engines — the big players all use proprietary engines in this area

Stream Processing Engines focused on real-time StreamSQL

“Lean and Mean,” less-than-a-database engines focused on doing a small number of things very well (embedded databases are probably in this category)

MapReduce and Hadoop — after all Google has enough “throw weight” to define a category

He goes on to say that each will be architected differently, except that — as he already convinced me back in July — RDF will be well-managed by specialty data warehouse DBMS. Read more

Categories: Data types, Database diversity, Michael Stonebraker, Mid-range, OLTP, RDF and graphs, Theory and architecture

6 Comments

November 7, 2007

Vertica update – HP appliance deal, customer information, and more

Vertica quietly announced an appliance bundling deal with HP and Red Hat today. That got me quickly onto the phone with Vertica’s Andy Ellicott, to discuss a few different subjects. Most interesting was the part about Vertica’s customer base, highlights of which included:

Vertica’s claim to have “50” customers includes a bunch of unpaid licenses, many of them in academia.
Vertica has about 15 paying customers.
Based on conversations with mutual prospects, Vertica believes that’s more customers than DATAllegro has. (Of course, each DATAllegro sale is bigger than one of Vertica’s. Even so, I hope Vertica is wrong in its estimate, since DATAllegro told me its customer count was “double digit” quite a while ago.)
Most Vertica customers manage over 1 terabyte of user data. A couple have bought licenses showing they intend to manage 20 terabytes or so.
Vertica’s biggest customer/application category – existing customers and sales pipelines alike – is call detail records for telecommunications companies. (Other data warehouse specialists also have activity in the CDR area.) Major applications are billing assurance (getting the inter-carrier charges right) and marketing analysis. Call center uses are still in the future.
Vertica’s other big market to date is investment research/tick history. Surely not coincidentally, this is a big area of focus for Mike Stonebraker, evidently at both companies for which he’s CTO. (The other, of course, is StreamBase.)
Runners-up in market activity are clickstream analysis and general consumer analytics. These seem to be present in Vertica’s pipeline more than in the actual customer base.

Categories: Analytic technologies, Business Objects, Data warehouse appliances, Data warehousing, DATAllegro, HP and Neoview, RDF and graphs, Vertica Systems

5 Comments

July 13, 2007

Nonstandard data management software — beyond the Bowling Alley?

I just finished a short Monash Letter on markets for nonstandard data management software. Of course, the whole thing is available only to Monash Advantage members, but here are some salient points:

When new kinds of data are managed, new kinds of data management are used. More precisely, the old ways are tried first, but once they fail, new technologies are tried out.
Up through the “Bowling Alley,” markets for nonstandard data management technology commonly follow the classic Geoffrey Moore pattern. However, they rarely experience a “Tornado” or mass adoption.
I think this is apt to change. My three strongest candidates are native XML, RDF, and memory-centric event/stream processing used for data reduction (as opposed to sub-millisecond latency, which I do think will continue to be a niche requirement).

Categories: Memory-centric data management, RDF and graphs, Streaming and complex event processing (CEP), Structured documents

Fast RDF in specialty relational databases

When Mike Stonebraker and I discussed RDF yesterday, he quickly turned to suggesting fast ways of implementing it over an RDBMS. Then, quite characteristically, he sent over a paper that allegedly covered them, but actually was about closely related schemes instead. 🙂 Edit: The paper has a new, stable URL. Hat tip to Daniel Abadi.

All minor confusion aside, here’s the story. At its core, an RDF database is one huge three-column table storing subject-property-object triples. In the naive implementation, you then have to join this table to itself repeatedly. Materialized views are a good start, but they only take you so far. Read more
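A minimal sketch of that naive layout, using Python's built-in `sqlite3` module, shows where the self-joins come from. The table name, column names, and sample triples are all assumptions made for illustration; this is not the scheme from the paper, just the plain three-column baseline it improves on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One table holds every subject-property-object triple.
conn.execute("CREATE TABLE triples (subject TEXT, property TEXT, object TEXT)")
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        ("alice", "worksFor", "acme"),
        ("bob", "worksFor", "acme"),
        ("carol", "worksFor", "initech"),
        ("acme", "locatedIn", "boston"),
        ("initech", "locatedIn", "austin"),
    ],
)

# "Who works for a company located in Boston?" involves two triple patterns,
# so the table is joined to itself once. Each additional pattern in a query
# adds another self-join.
rows = conn.execute(
    """
    SELECT t1.subject
    FROM triples AS t1
    JOIN triples AS t2 ON t1.object = t2.subject
    WHERE t1.property = 'worksFor'
      AND t2.property = 'locatedIn'
      AND t2.object = 'boston'
    ORDER BY t1.subject
    """
).fetchall()
people = [row[0] for row in rows]
```

A query with five or six patterns would need that many copies of the same large table in a single join, which is the cost that faster implementations try to avoid.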

Categories: Columnar database management, Data models and architecture, Data warehousing, Database compression, RDF and graphs, Theory and architecture, Vertica Systems

1 Comment

June 15, 2007

RDF “definitely has legs”

Thus spake Mike Stonebraker to me, on a call we’d scheduled to talk about several other things altogether. This was one day after I was told at the Text Analytics Summit that the US government is going nuts for RDF. And I continue to get confirmation of something I first noted last year — Oracle is pushing RDF heavily, especially in the life sciences market.

Evidently, the RDF data model is for real … unless, of course, you’re the kind of purist who cares to dispute whether RDF is a true “data model” at all.

Categories: Data models and architecture, Oracle, RDF and graphs, Theory and architecture

Comments Off

May 7, 2007

More academic hype about the Semantic Web

A major Semantic Web researcher has built a cluster that can do RDF queries, and hence can get subsecond response time on queries against a database of 7 billion three-column records, The Register obsequiously reports. Golly gee whiz wow.

“The importance of this breakthrough cannot be overestimated,” said Professor Stefan Decker, director of DERI.

I actually think the Semantic Web contains some good ideas, but this kind of over-the-top breathlessness doesn’t seem to do anybody very much good.

Categories: RDF and graphs

3 Comments

December 27, 2006

Bulletin on Cogito

My Bulletin on Cogito — i.e., a short-short white paper — is now available for download. Thankfully, it turned out to be pretty consistent with what I previously wrote on the company and its technology. 😉 The conclusion to the paper bears quoting here:

In deciding between conventional DBMS and specialty graph-oriented tools such as Cogito’s, there’s one key criterion: Path length. If path lengths are short and predictable, there’s a good chance that relational DBMS and their forthcoming extensions can do the job. In complex graphs with longer paths, however, relational approaches may not scale well. In such cases, specialty technologies warrant serious consideration.
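One way to see why path length matters on the relational side: with edges stored in an ordinary table, a query for nodes exactly k edges away needs k copies of that table, joined k - 1 times. The sketch below generates such a query with Python's built-in `sqlite3` module. The `edges` schema, the `k_hop_query` and `reachable` helpers, and the sample graph are hypothetical, and nothing here reflects how Cogito's engine or any particular DBMS extension actually works.

```python
import sqlite3


def k_hop_query(k):
    """Build SQL that finds nodes exactly k edges from a start node.

    The query uses k aliases of the edges table, i.e. k - 1 self-joins.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    joins = " ".join(
        f"JOIN edges AS e{i} ON e{i - 1}.dst = e{i}.src" for i in range(2, k + 1)
    )
    return (
        f"SELECT DISTINCT e{k}.dst FROM edges AS e1 {joins} "
        "WHERE e1.src = ? ORDER BY 1"
    )


def reachable(conn, start, k):
    """Return the sorted list of nodes exactly k hops from start."""
    return [row[0] for row in conn.execute(k_hop_query(k), (start,))]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (src TEXT, dst TEXT)")
# Hypothetical sample graph: a -> b -> c -> d, plus a shortcut a -> c.
conn.executemany(
    "INSERT INTO edges VALUES (?, ?)",
    [("a", "b"), ("b", "c"), ("c", "d"), ("a", "c")],
)

two_hops = reachable(conn, "a", 2)
three_hops = reachable(conn, "a", 3)
```

When k is small and known in advance, the generated SQL stays manageable. When paths are long or their length is not known up front, the number of joins grows with every hop, which is the scaling concern the quoted conclusion raises.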

Categories: Cogito and 7 Degrees, RDF and graphs

Oracle, graphical data models, and RDF

I wrote recently of Cogito’s high-performance engine for modeling graphs. Oracle has taken a very different approach to the same problem, and last Monday I drove over to Burlington to be briefed on it.

Name an approach to data management, and Oracle has probably

Hacked together a version on a consulting contract
Packaged it up for other customers in the same industry
Set to work on improving and generalizing it
Integrated it into SQL, in preference to supporting standalone data manipulation languages for it
Stopped short of being 100% competitive in that functionality

(At least, that’s the general template; truth be told, most of the important cases deviate in some way or other.)
Read more

Categories: Oracle, RDF and graphs

2 Comments
