SAP AG

Analysis of SAP AG, and most especially its memory-centric BI Accelerator technology. Also covered are SAP’s overall database, connectivity, and analytics strategies.

March 24, 2007

Will database compression change the hardware game?

I’ve recently made a lot of posts about database compression. 3X or more compression is rapidly becoming standard; 5X+ is coming soon as processor power increases; 10X or more is not unrealistic. True, this applies mainly to data warehouses, but that’s where the big database growth is happening. And new kinds of data — geospatial, telemetry, document, video, whatever — are highly compressible as well.

This trend suggests a few interesting possibilities for hardware, semiconductors, and storage.

  1. The growth in demand for storage might actually slow. That said, I frankly think it’s more likely that Parkinson’s Law of Data will continue to hold: Data expands to fill the space available. E.g., video and other media have near-infinite potential to consume storage; it’s just a question of resolution and fidelity.
  2. Solid-state (aka semiconductor or flash) persistent storage might become practical sooner than we think. If you really can fit a terabyte of data onto 100 gigs of flash, that’s a pretty affordable alternative. And by the way — if that happens, a lot of what I’ve been saying about random vs. sequential reads might be irrelevant.
  3. Similarly, memory-centric data management is more affordable when compression is aggressive. That’s a key point of schemes such as SAP’s or QlikTech’s. Who needs flash? Just put it in RAM, persisting to disk only for backup.
  4. There’s a use for faster processors. Compression isn’t free: what you save on disk space and I/O, you pay for at the CPU level, as the sketch below illustrates. Those 5X+ compression levels do depend on faster processors, at least for the row store vendors.
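To make that disk-vs-CPU tradeoff concrete, here’s a toy sketch of dictionary-plus-run-length encoding on a repetitive warehouse column. It’s a minimal illustration of the general technique, not any vendor’s actual algorithm; the decode loop is exactly the CPU work you buy with the I/O you save.

```python
# Toy columnar compression: dictionary encoding plus run-length encoding.
# Illustrative only; real warehouse DBMSs use far more elaborate schemes.

def compress(column):
    """Dictionary-encode the values, then run-length-encode the codes."""
    dictionary = {}                      # value -> small integer code
    codes = [dictionary.setdefault(v, len(dictionary)) for v in column]
    runs = []                            # [code, run_length] pairs
    for code in codes:
        if runs and runs[-1][0] == code:
            runs[-1][1] += 1
        else:
            runs.append([code, 1])
    return dictionary, runs

def decompress(dictionary, runs):
    """The CPU work you pay at query time in exchange for doing less I/O."""
    inverse = {code: value for value, code in dictionary.items()}
    return [inverse[code] for code, length in runs for _ in range(length)]

# A low-cardinality, sorted column (common in warehouses) collapses nicely.
column = ["California"] * 40_000 + ["Texas"] * 35_000 + ["Vermont"] * 5_000
dictionary, runs = compress(column)
assert decompress(dictionary, runs) == column
print(f"{len(column)} values stored as {len(runs)} runs")   # 80000 -> 3 runs
```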
March 16, 2007

Word of the day: “Compression”

IBM sent over a bunch of success stories recently, with DB2’s new aggressive compression prominently mentioned. Mike Stonebraker made a big point of Vertica’s compression when last we talked; other column-oriented data warehouse/mart software vendors (e.g. Kognitio, SAP, Sybase) get strong compression benefits as well. Other data warehouse/mart specialists are doing a lot with compression too, although some of that is governed by please-don’t-say-anything-good-about-us NDA agreements.

Compression is important for at least three reasons:

  1. It cuts storage requirements, and hence storage cost.
  2. It cuts disk I/O, which is the big bottleneck in most data warehouse workloads.
  3. It lets more of the database fit into RAM, making memory-centric approaches that much more affordable.

When evaluating data warehouse/mart software, take a look at the vendor’s compression story. It’s important stuff.

EDIT: DATAllegro claims in a note to me that they get 3-4x storage savings via compression. They also make the observation that fewer disks ==> fewer disk failures, and spin that — as it were 🙂 — into a claim of greater reliability.
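That reliability spin is easy to sanity-check with back-of-the-envelope arithmetic. In the sketch below, the 3% annual per-drive failure rate and the 100-disk array are my illustrative assumptions, not DATAllegro’s figures:

```python
# Back-of-the-envelope check on "fewer disks ==> fewer disk failures".
# The 3% annual per-drive failure rate is my assumption, not DATAllegro's.
ANNUAL_FAILURE_RATE = 0.03

def p_any_failure(num_disks, rate=ANNUAL_FAILURE_RATE):
    """Chance that at least one of num_disks fails within a year,
    assuming failures are independent."""
    return 1 - (1 - rate) ** num_disks

uncompressed_disks = 100
compressed_disks = uncompressed_disks // 3    # roughly 3x compression

print(round(p_any_failure(uncompressed_disks), 2))  # ~0.95
print(round(p_any_failure(compressed_disks), 2))    # ~0.63
```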

February 13, 2007

QlikTech – flexible, memory-centric, columnar BI

QlikTech has a pretty interesting story, and a number of customers seem to agree. Their flagship product QlikView is a BI suite that runs off an in-memory copy of the data. Specifically, that copy is logically relational and physically columnar. One important feature: QlikView is happy to import data from multiple sources at once, such as a warehouse plus an operational data store.
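For a feel of what “logically relational, physically columnar” can mean, here’s a minimal sketch of the general idea (my illustration, not QlikTech’s actual design): load rows from two sources into one in-memory column store, then run an ad hoc aggregate against it.

```python
# Toy "logically relational, physically columnar" in-memory store.
# My illustration of the general technique, not QlikTech's actual design.
from collections import defaultdict

class ColumnStore:
    def __init__(self):
        self.columns = defaultdict(list)   # column name -> list of values

    def load(self, rows):
        """Import rows (dicts) from any source: warehouse, ODS, flat file."""
        for row in rows:
            for name, value in row.items():
                self.columns[name].append(value)

    def sum_where(self, measure, filter_col, filter_val):
        """Ad-hoc aggregate; no pre-built dimensional model required."""
        return sum(m for m, f in zip(self.columns[measure],
                                     self.columns[filter_col])
                   if f == filter_val)

store = ColumnStore()
# One batch from a warehouse extract, one from an operational store.
store.load([{"region": "EMEA", "sales": 120}, {"region": "US", "sales": 200}])
store.load([{"region": "EMEA", "sales": 75}])
print(store.sum_where("sales", "region", "EMEA"))   # 195
```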

So the QlikTech pitch is essentially “Buy our stuff, and you can start doing BI immediately, running any queries and reports you want to. No reason to limit your queries to any kind of dimensional model. No need to prepare the data.” More precisely, QlikTech claims to do away with some kinds of data preparation; obviously, cleaning and so on might still be necessary. Indeed, they describe their classic use case as combining data partly from an operational store and partly from a pre-existing warehouse.

Read more

January 22, 2007

Who’s who in columnar relational database management systems

The best known columnar RDBMS is surely Sybase IQ, evolved from a product acquired in the mid-1990s. Problem – it doesn’t have the shared-nothing architecture needed to exploit grid/blade technology. Whoops. The other recognized player is SAND, but I don’t know a lot about them; based on their website, it would seem that grids and compression play a big part in their story. Less established but pretty interesting is Kognitio, who are just beginning to make marketing noise outside the UK. SAP’s BI Accelerator is also a compressed columnar system, but it operates entirely in-memory and hence is limited in possible database size. Mike Stonebraker’s startup Vertica is of course the new kid on the block, and there are other columnar startups as well whose names currently escape me.
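Since shared-nothing keeps coming up, here’s a minimal sketch of what it buys: hash-partition the rows across nodes, let each node scan only its own slice (in parallel, on a real grid), and merge the partial results. This is a generic illustration, not any particular vendor’s implementation:

```python
# Minimal shared-nothing sketch: hash-partition rows across "nodes",
# scan each slice independently, then merge the partial results.
# Generic illustration, not any particular vendor's architecture.
from zlib import crc32

NUM_NODES = 4

def node_for(key):
    """Deterministically assign a row to one node by hashing its key."""
    return crc32(str(key).encode()) % NUM_NODES

rows = [{"order_id": i, "amount": i % 50} for i in range(10_000)]

# Each node owns a disjoint slice -- no shared disk, no shared RAM.
partitions = [[] for _ in range(NUM_NODES)]
for row in rows:
    partitions[node_for(row["order_id"])].append(row)

# Each node computes its partial aggregate over its own slice; on a real
# grid these scans run in parallel on separate blades.
partials = [sum(r["amount"] for r in slice_) for slice_ in partitions]
print(sum(partials) == sum(r["amount"] for r in rows))   # True
```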

Read more

September 22, 2006

Competitive issues in data warehouse ease of administration

The last person I spoke with at the Netezza conference on Tuesday was a customer/presenter the company had picked out for me. One thing he said baffled me — he claimed that Netezza was a real appliance vendor, but DATAllegro wasn’t, presumably due to administrability issues. Now, it wasn’t clear to me that he’d ever evaluated DATAllegro, so I didn’t take this too seriously, but the exchange did bring into focus the great differences among data warehouse products in the area of administration. For example:

September 20, 2006

SAP’s BI Accelerator

I wrote about SAP’s BI Accelerator quite a bit in my white paper on memory-centric data management, but otherwise I seem not to have posted much about it here. In essence, it’s a product that’s all RAM-based, and generally geared for multi-hundred-gigabyte data marts. The basic design is a compression-heavy column-based architecture, evolved from SAP’s text-indexing technology TREX. Like data warehouse appliances, it eschews indexing, relying instead on blazingly fast table scans.
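Here’s a minimal sketch (mine, not SAP’s code) of why index-free scans can be viable: once a column is dictionary-encoded, a predicate turns into one dictionary lookup plus a tight loop over small integer codes.

```python
# Why index-free scans can be viable on an encoded column (toy version):
# the predicate is resolved once against the dictionary, and the "scan"
# is a tight loop over small integer codes. My sketch, not SAP's code.

def dictionary_encode(column):
    dictionary = {}                      # value -> integer code
    codes = [dictionary.setdefault(v, len(dictionary)) for v in column]
    return dictionary, codes

def count_equal(dictionary, codes, value):
    """SELECT COUNT(*) ... WHERE col = value, with no index anywhere."""
    code = dictionary.get(value)
    if code is None:                     # absent value: no scan needed at all
        return 0
    return sum(1 for c in codes if c == code)

column = ["red", "green", "blue"] * 100_000
dictionary, codes = dictionary_encode(column)
print(count_equal(dictionary, codes, "green"))   # 100000
print(count_equal(dictionary, codes, "mauve"))   # 0
```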

I asked Lothar Schubert of SAP how BIA was doing in the market in its early going. This was his response:

Read more

September 19, 2006

Is data warehousing now all about sequential access?

A lot of evidence is pointing to a major paradigm shift in data warehouse RDBMS, along the lines of:

Old way: Assume I/O is random; lower total execution time by improving selectivity and thus lowering the amount of I/O.

New way: Drive the amount of random I/O to near zero, and do as much sequential I/O as necessary to achieve this goal.
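The gap is easy to observe directly. Here’s a crude micro-benchmark sketch that reads the same scratch file once sequentially and once via scattered seeks; timings vary enormously with hardware and OS caching, but on rotating disks the sequential pass typically wins by a very wide margin:

```python
# Crude sequential-vs-random read comparison on one scratch file.
# Timings vary enormously with hardware and OS caching; on rotating
# disks the sequential pass typically wins by a very wide margin.
import os
import random
import tempfile
import time

BLOCK = 8192
NUM_BLOCKS = 10_000

tmp = tempfile.NamedTemporaryFile(delete=False)
tmp.write(os.urandom(BLOCK * NUM_BLOCKS))    # ~80 MB scratch file
tmp.close()

def read_sequential():
    with open(tmp.name, "rb") as f:
        while f.read(BLOCK):
            pass

def read_random():
    offsets = [random.randrange(NUM_BLOCKS) * BLOCK for _ in range(NUM_BLOCKS)]
    with open(tmp.name, "rb") as f:
        for off in offsets:
            f.seek(off)
            f.read(BLOCK)

for fn in (read_sequential, read_random):
    start = time.perf_counter()
    fn()
    print(fn.__name__, round(time.perf_counter() - start, 3), "seconds")

os.remove(tmp.name)
```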

Examples include:

Read more

August 10, 2006

QlikView – a leader in memory-centric BI

QlikTech — the vendor of QlikView — contacted me to tell their memory-centric BI story. A Swedish company with >$23 million in estimated license revenue last year and a 100%ish growth rate, they claim to be the leader in that space, pulling ahead of Applix. But for now, I’ll call them “a” leader, and say that their story sounds like a hybrid between those of Applix (TM1 product) and SAP (BI Accelerator).

Read more

August 8, 2006

ANTs’ memory-centric characteristics to the fore?

An eWeek article suggests that ANTs is repositioning with a strong emphasis on memory-centricity. ANTs’ website, frankly, doesn’t support this theory, giving a more balanced tech overview in line with how they pitched me in a briefing last November. Still, it’s an interesting possibility to watch.

The main focus of the article actually wasn’t ANTs, but rather SAP’s wildest dreams of expanding the scope of its BI Accelerator technology. But the new-to-me part was the positioning of ANTs.

May 22, 2006

Data warehouse appliances

If we define a “data warehouse appliance” as “a special-purpose computer system, with appliance administrability, that manages a data warehouse,” then there are two major contenders: Netezza and DATAllegro, both startups, both with a small number of disclosed customers. Past contenders would include Teradata and White Cross (which seems to have just merged into Kognitio), but neither would admit to being in that market today. (I suspect this is a mistake on Teradata’s part, but so be it.) IBM with DB2 on the z-Series wouldn’t be properly regarded as an appliance player either, although IBM is certainly conscious of appliance competition. And SAP’s BI Accelerator does not persist data at this time.

In principle, the Netezza and DATAllegro stories are similar — take an established open source RDBMS*, build optimized hardware to run it, and optimize the software configuration as well. Much of the optimization is focused on getting data on and off disk sequentially, minimizing any random accesses. This is why I often refer to data warehouse appliances as being the best alternative to memory-centric data management. Beyond that, the optimizations by the two vendors differ considerably.
*Netezza uses PostgreSQL; DATAllegro uses Ingres.

Hmm. I don’t feel like writing more on this subject at this very moment, yet I want to post something urgently because there’s an IOU in my Computerworld column today for it. OK. More later.
