November 15, 2008

High-performance analytics

For the past few months, I’ve collected a lot of data points to the effect that high-performance analytics – i.e., beyond straightforward query — is becoming increasingly important. And I’ve written about some of them at length. For example:

Ack. I can’t decide whether “analytics” should be a singular or plural noun. Thoughts?

Another area that’s come up which I haven‘t blogged about so much is data mining in the database. Data mining accounts for a large part of data warehouse use. The traditional way to do data mining is to extract data from the database and dump it into SAS. But there are problems with this scenario, including: Read more

October 17, 2008

Introduction to Talend

I didn’t spend much time on the show floor at Teradata Partners, but I did connect with Yves de Montcheuil of Talend for a couple of little chats.  Highlights of the Talend story include: Read more

October 10, 2008

Multitenancy hype is getting out of control

I posted recently on SaaS-data-integration-in-the-cloud, and a couple of vendors stopped by the comment thread to shared what they do. One was Boomi, which has a blog that does a good job of spelling out its opinions. What the Boomi blog is not so good at, however, is giving any good reasons why one should share those opinions.

I refer specifically to a couple of posts claiming that multitenancy is somehow crucial for SaaS data integration to work. To this I can only say — huh? A decent data integration system should be able to handle many parallel threads at once, connecting many pairs of databases at once. So the hard part of multitenancy is pretty much “free.” If, even so, the integration provider chooses not to go fully multitenant, whose business is it but theirs? Read more

October 9, 2008

Everybody’s putting integration services in the cloud

Both Pervasive Software and Cast Iron Systems told me recently of fairly pure cloud offerings. In this, they’re joining Informatica, which started offering Salesforce.com integration-as-a-service back in 2006. So far as I can tell, the three vendors are doing somewhat different things. Read more

October 5, 2008

Schema flexibility and XML data management

Conor O’Mahony, marketing manager for IBM’s DB2 pureXML, talks a lot about one of my favorite hobbyhorses — schema flexibility — as a reason to use an XML data model. In a number of industries he sees use cases based around ongoing change in the information being managed:

Conor also thinks market evidence shows that XML’s schema flexibility is important for data interchange. Read more

October 5, 2008

Vertical market XML standards

Tracking the alphabet soup of vertical market XML standards is hard. So as a starting point, I’m splitting a list I got from IBM into a standalone post.

Among the most important or successful IBM pureXMLsupported standards, in terms of downloads and other evidence of customer interest, are: Read more

October 5, 2008

Overview of IBM DB2 pureXML

On August 29, I had a great call with IBM about DB2 pureXML (most of the IBM side of the talking was done by Conor O’Mahony and Qi Jin). I’m finally getting around to writing it up now. (The world of tabular data warehousing has kept me just a wee bit busy …)

As I write it, I see there are a considerable number of holes, but that’s the way it seems to go when researching XML storage. I’m also writing up a September call from which I finally figured out (I think) the essence of how MarkLogic Server works – but only after five months of trying. It turns out that MarkLogic works rather differently from DB2 pureXML. Not coincidentally, IBM and Mark Logic focus on rather different use cases for native XML storage.

What I understand so far about the basic DB2 pureXML architecture goes like this: Read more

August 26, 2008

Three approaches to parallelizing data transformation

Many MPP data warehousing vendors have told me their products are used for ELT (Extract/Load/Transform) instead of ETL (Extract/Transform/Load). I.e., needed data transformations are done on the MPP system, rather than on the — probably SMP — system the data comes from.* If the data transformation is being applied on a record-by-record basis, then it’s automatically fully parallelized. Even if the transforms are more complex, considerable parallel processing may still be going on.

*Or it’s some of each, at which point it’s called ETLT — I bet you can work out what that stands for.

Read more

July 8, 2008

Google has thousands of internal data formats, mostly simple ones

In connection with the release of Protocol Buffers, Kenton Varda of Google wrote: Read more

March 26, 2008

Pervasive is also pursuing simplicity and SaaS integration

I blogged recently about Cast Iron Systems, a simplicity-oriented data integration appliance vendor that is increasingly focusing on the SaaS market. Well, Pervasive Software is doing something similar.

Via Data Integrator, Pervasive is a leader in the low-cost integration market, with revenue split about 50/25/25 between direct sales, ISVs, and SaaS. Pervasive fondly believes that its products cost half as much as Cast Iron’s, and wind up taking no more installation effort when you factor in Pervasive’s broader capabilities in areas such as workflow. However, there’s some doubt as to whether this is apples-to-apples. Cast Iron does include hardware, after all, and as Pervasive itself points out, Cast Iron will bundle some professional services into a sale if you ask nicely.

Two things are new. Read more

← Previous PageNext Page →

Feed: DBMS (database management system), DW (data warehousing), BI (business intelligence), and analytics technology Subscribe to the Monash Research feed via RSS or email:

Login

Search our blogs and white papers

Monash Research blogs

User consulting

Building a short list? Refining your strategic plan? We can help.

Vendor advisory

We tell vendors what's happening -- and, more important, what they should do about it.

Monash Research highlights

Learn about white papers, webcasts, and blog highlights, by RSS or email.