Data types

Analysis of data management technology optimized for specific datatypes, such as text, geospatial, object, RDF, or XML.

January 28, 2008

Is anybody actually using image, video, or sound indexing?

I have quite the excess of “flu-like symptoms,” and nothing substantive I’m writing today is coming to fruition. So instead of forcing the issue, I’m going to put a few questions out for discussion.

Question of the day #1

Is anybody indexing the actual contents of still images, video, or sound files?

Obviously, there are applications that serve huge numbers of videos, pictures, and/or songs — YouTube, Flickr, iTunes, and so on. But generally, these media are just handled as files or BLOBs, while all the database indexing is on alphanumeric metadata such as title, tags, uploader, date, download stats, comments, and so on.
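To make the "BLOB plus metadata" pattern concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table, column names, and sample rows are all hypothetical, invented for illustration rather than taken from any of the services named above. The point is only that the index and the query touch the alphanumeric columns, while the media bytes are stored and returned untouched.

```python
import sqlite3

# Hypothetical schema: the media payload is an opaque BLOB, and only the
# alphanumeric metadata columns are indexed.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE media (
        id INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        uploader TEXT NOT NULL,
        uploaded_on TEXT NOT NULL,  -- ISO 8601 date string
        payload BLOB NOT NULL       -- raw bytes; never inspected by the DBMS
    )
    """
)
conn.execute("CREATE INDEX media_by_uploader ON media (uploader, uploaded_on)")

# Made-up sample rows; the payloads are placeholder bytes, not real media.
rows = [
    ("Cat on a skateboard", "alice", "2008-01-20", b"\x00\x01fake-video-bytes"),
    ("Sunset", "bob", "2008-01-22", b"\xff\xd8fake-jpeg-bytes"),
    ("Cat in a box", "alice", "2008-01-25", b"\x00\x02more-fake-bytes"),
]
conn.executemany(
    "INSERT INTO media (title, uploader, uploaded_on, payload) VALUES (?, ?, ?, ?)",
    rows,
)


def titles_by_uploader(uploader):
    """Find media via indexed metadata; the BLOB contents play no part."""
    cur = conn.execute(
        "SELECT title FROM media WHERE uploader = ? ORDER BY uploaded_on",
        (uploader,),
    )
    return [title for (title,) in cur]
```

Nothing in this sketch could answer a question such as "which videos contain a cat?" unless somebody had typed that into the title or tags, which is exactly the limitation the question is about.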

The technology certainly exists to be more sophisticated. Consider, for example, Oracle’s Still Image datatype, which in typical Oracle fashion implements the relevant parts of SQL/MM and goes yet further.

January 27, 2008

The 4 main approaches to datatype extensibility

Based on a variety of conversations – including some of the flames about my recent confession that mid-range DBMS aren’t suitable for everything – it seems as if a quick primer may be in order on the subject of datatype support. So here goes.

“Database management” usually deals with numeric or alphabetical data – i.e., the kind of stuff that goes nicely into tables. It commonly has a natural one-dimensional sort order, which is very useful for sort/merge joins, b-tree indexes, and the like. This kind of tabular data is what relational database management systems were invented for.
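As a rough illustration of why a one-dimensional sort order matters, here is a minimal sort/merge join sketch in Python. The function name and the (key, value) row layout are assumptions made for this example, not any particular DBMS's implementation. Once both inputs are sorted on the join key, a single forward pass over each side finds every match.

```python
def sort_merge_join(left, right):
    """Join two lists of (key, value) rows on key, relying only on sort order."""
    left = sorted(left, key=lambda row: row[0])
    right = sorted(right, key=lambda row: row[0])
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        left_key, right_key = left[i][0], right[j][0]
        if left_key < right_key:
            i += 1  # left row has no partner yet; advance the left side
        elif left_key > right_key:
            j += 1  # right row has no partner; advance the right side
        else:
            # Find the run of right rows that share this key.
            j_end = j
            while j_end < len(right) and right[j_end][0] == left_key:
                j_end += 1
            # Pair every left row holding this key with the whole run.
            while i < len(left) and left[i][0] == left_key:
                for k in range(j, j_end):
                    out.append((left_key, left[i][1], right[k][1]))
                i += 1
            j = j_end
    return out
```

The comparisons `<` and `>` on the key are doing all the work here. Datatypes such as geospatial regions or images have no comparably natural single ordering, which is one reason they need different index structures.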

But ever more, there are important datatypes beyond character strings, numbers and dates. Leaving out generic BLOBs and CLOBs (Binary/Character Large OBjects), the big four surely are:

Numerous other datatypes are important as well, with the top runners-up probably being images, sound, video, and time series (which, even though they’re numeric, benefit from special handling).

Four major ways have evolved to manage data of non-tabular datatypes, either on their own or within an essentially relational data management environment.

January 24, 2008

Is MapReduce a good underpinning for next-gen scientific DBMS?

Back in November, Mike Stonebraker suggested that there’s a need for database management advances to serve “big science”. He said:

Obviously, the best solution to these … problems would be to put everything in a next-generation DBMS — one capable of keeping track of data, metadata, and lineage. Supporting the latter would require all operations on the data to be done inside the DBMS with user-defined functions — Postgres-style.


December 17, 2007

InterSystems’ stealth marketing has gotten pretty extreme

Every few months I try to make contact with InterSystems. Sometimes they graciously respond, promising to schedule a briefing, which then never happens. Other times they don’t even bother. Now, on one level I can’t blame them, based on what happened at my last briefing.

December 8, 2007

Status of Software AG’s Tamino

Since I was researching Software AG anyway, I took the opportunity to ask about Software AG’s native XML DBMS Tamino, which certainly has some fans. Jim Fowler, Software AG’s Director of Market Development, Enterprise Transaction Systems, was kind enough to write up the following for me:

As you know, when Tamino was released in the late 1990s it was one of the first – if not the first – commercially available native XML database. We now have several hundred Tamino customers worldwide, and Software AG is fully committed to supporting our customers.

At the same time, we recognize that XML has matured and evolved in many different directions during the past decade; Read more

December 5, 2007

A nice EnterpriseDB replacement of MySQL

I’m going to praise EnterpriseDB’s marketing communications twice in two blog posts, because I really liked some of the crunch they put into a press release announcing a MySQL replacement at FortiusOne. To wit (emphasis mine):

The PostGIS geospatial extensions to PostgreSQL played a key role in FortiusOne’s selection of EnterpriseDB Advanced Server, a PostgreSQL-based solution, and dramatically improved performance. FortiusOne needed to run complex spatial queries against large datasets quickly and efficiently, and found the MySQL spatial extensions to be far less complete and comprehensive than PostGIS. EnterpriseDB Advanced Server processes some of GeoCommons’ database-intensive rendering requests in one-thirtieth of the time required by MySQL. During peak loads, GeoCommons processes more than one hundred thousand complex requests per hour, requiring true enterprise-class performance and scalability.

Another major factor in FortiusOne’s replacement of MySQL with EnterpriseDB Advanced Server was the company’s need for advanced partitioning, custom triggers, and functional indexing. EnterpriseDB’s advanced partitioning capabilities instantly enabled linear performance, even with tables having billions of rows.


November 7, 2007

Vertica update – HP appliance deal, customer information, and more

Vertica quietly announced an appliance bundling deal with HP and Red Hat today. That got me quickly onto the phone with Vertica’s Andy Ellicott, to discuss a few different subjects. Most interesting was the part about Vertica’s customer base, highlights of which included:


October 22, 2007

Native XML performance, and Philip Howard on recent IBM DBMS announcements

Philip Howard went to at least one conference this month I didn’t, namely IBM’s, and wrote up some highlights. As usual, he seems to have been favorably impressed.

In one note, he says that IBM is claiming a 2-5X XML performance improvement. This is a good step, since one of my clients who evaluated such engines dismissed IBM early on for being an order of magnitude too slow. That client ultimately chose MarkLogic, with Caché having been the only other choice to make the short list.

Speaking of IBM, I flew back from the Business Objects conference next to a guy who supports IMS. He told me that IBM has bragged of an actual new customer win for IMS within the past couple of years (a large bank in China).

September 27, 2007

The Netezza Developer Network

Netezza has officially announced the Netezza Developer Network. Associated with that is a set of technical capabilities, which basically boil down to programming user-defined functions or other capabilities straight onto the Netezza nodes (aka SPUs). And this is specifically onto the FPGAs, not the PowerPC processors. In C. Technically, I think what this boils down to is: Read more

August 12, 2007

Applications for not-so-low-latency CEP

The highest-profile applications for complex event/stream processing are probably the ones that require super-low latency, especially in financial trading. However, as I already noted in writing about StreamBase and Truviso, there are plenty of other CEP apps with less extreme latency requirements.

Commonly, these are data reduction apps – i.e., there’s a gushing stream of inputs, and the CEP engine filters and “enhances” it, so that only a small, modified subset is sent forward. In other cases, disk-based systems could do the job perfectly well from a performance standpoint, but the pattern matching and filtering requirements are just a better fit for the CEP paradigm.
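A data reduction app of this kind can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the event layout, the `reduce_stream` name, and the rule itself (forward a reading only when it exceeds the moving average of the preceding readings by some margin). Real CEP engines express such rules in their own query languages; this only shows the filter-and-enhance shape.

```python
from collections import deque


def reduce_stream(events, window=3, threshold=1.05):
    """Forward only readings that spike above the recent moving average.

    A reading is compared against the average of the previous `window`
    values. Readings that pass are "enhanced" with that average, and
    everything else is dropped.
    """
    recent = deque(maxlen=window)
    for event in events:
        value = event["value"]
        if len(recent) == window:
            avg = sum(recent) / window
            if value > threshold * avg:
                # Enhance the event before sending it forward.
                yield {**event, "moving_avg": avg}
        recent.append(value)


# Made-up input: six sensor readings, of which one is a spike.
readings = [{"sensor": "s1", "value": v} for v in [100, 100, 100, 110, 101, 102]]
forwarded = list(reduce_stream(readings))
```

With the sample input above, six readings go in and only the spike at 110 comes out, carrying the moving average it was judged against.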
