Memory-centric data management

Analysis of technologies that manage data entirely or primarily in random-access memory (RAM).

June 22, 2007

Memory-centric vs. conventional DBMS — a Solid difference

I had the chance to talk at length with Solid Information Technology tech guru Antoni Wolski about their memory-centric DBMS technology architecture. The most urgent topic was what made in-memory database managers inherently faster than disk-based ones that happened to have all the data in cache. But we didn’t really separate that subject from the general topic of how they made their memory-centric technology run fast, from its introduction in 2002 through substantial upgrades in the most recent release.

There were four main subtopics to the call:

1. Indexing structures that are very different from those of disk-based DBMS.
2. Optimizations to those indexing structures.
3. Optimizations to logging and checkpointing.
4. Miscellaneous architectural issues.
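The first subtopic — why in-memory index structures differ from disk-based ones — can be illustrated with a sketch. This is my illustration of the general idea, not Solid's actual code: a disk-oriented index translates a key into a page ID and goes through a buffer pool, while an in-memory index can hold direct references to the records themselves.

```python
# Illustrative sketch (not Solid's implementation): a buffer-pool lookup
# versus a direct-pointer in-memory index.

class DiskStyleIndex:
    """Key -> (page_id, slot); every lookup goes through a buffer pool."""
    def __init__(self, buffer_pool):
        self.buffer_pool = buffer_pool   # page_id -> list of records
        self.index = {}                  # key -> (page_id, slot)

    def insert(self, key, page_id, slot):
        self.index[key] = (page_id, slot)

    def lookup(self, key):
        page_id, slot = self.index[key]   # indirection step 1: find the page
        page = self.buffer_pool[page_id]  # indirection step 2: fetch the page
        return page[slot]                 # finally, the record

class MemoryStyleIndex:
    """Key -> record reference; no page translation at all."""
    def __init__(self):
        self.index = {}

    def insert(self, key, record):
        self.index[key] = record          # direct pointer to the record

    def lookup(self, key):
        return self.index[key]            # one step

# Same record, two paths to it.
record = {"id": 7, "name": "alice"}
pool = {0: [record]}
d = DiskStyleIndex(pool)
d.insert(7, page_id=0, slot=0)
m = MemoryStyleIndex()
m.insert(7, record)
assert d.lookup(7) is m.lookup(7)
```

Even when the disk-style index's pages are all cached, the extra indirection (and, in a real system, latching and pinning of buffer-pool pages) is work the direct-pointer design simply never does.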
Read more

June 20, 2007

SolidDB caching for DB2

It’s just at the proof-of-concept stage, but Solid has a nice write-up about SolidDB being used as a front-end cache for DB2. Well, it’s a marketing document, so of course there’s a lot of pabulum, but some real meat is interspersed as well. Highlights include 40X throughput improvement and 1 millisecond average response time (something that clearly can’t be achieved with disk-centric technology alone).
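A front-end cache of this kind typically follows a read-through pattern; here is a minimal sketch (the names and the population policy are my assumptions, not Solid's design), with the in-memory tier answering hot reads and deferring to the disk-based back end only on a miss.

```python
class ReadThroughCache:
    """Minimal front-end cache sketch: serve reads from memory,
    fall through to the backing DBMS on a miss."""
    def __init__(self, backend):
        self.backend = backend   # stand-in for the disk-based DBMS (e.g. DB2)
        self.store = {}          # the in-memory tier
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.store:
            self.hits += 1
            return self.store[key]       # fast path: memory only
        self.misses += 1
        value = self.backend[key]        # slow path: disk-based lookup
        self.store[key] = value          # populate cache for next time
        return value

backend = {"cust:42": {"name": "Acme", "tier": "gold"}}
cache = ReadThroughCache(backend)
cache.get("cust:42")   # miss: fetched from the back end
cache.get("cust:42")   # hit: served from memory
assert cache.hits == 1 and cache.misses == 1
```

Once the working set is hot, nearly every read takes the fast path — which is where throughput-multiple and sub-millisecond-latency claims come from.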

Analogies to Oracle/TimesTen are probably not coincidental; this is exactly the upside scenario for the TimesTen acquisition, as well as being TimesTen’s biggest growth area towards the end of its stint as an independent company.

June 18, 2007

More on stream processing integration with disk-based DBMS

Mike Stonebraker wrote in with one “nit pick” about yesterday’s blog. I had credited Truviso for strong DBMS/stream processor integration. He shot back that StreamBase has Sleepycat integrated in-process. He further pointed out that a Sleepycat record lookup takes only 5 microseconds if the data is in cache. Assuming what he means is that it’s in Sleepycat’s cache, that would be tight integration indeed.
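For perspective on what "in-process" buys you: a lookup in the same address space is just a function call over local memory, with no socket or IPC round trip. A toy measurement, using a plain Python dict as a stand-in for an embedded store (this is not Sleepycat's API — just an illustration of the latency class):

```python
import timeit

# Toy stand-in for an embedded, in-process store: a key lookup is just
# a hash probe in the same address space -- no network or IPC hop.
store = {i: ("record", i) for i in range(100_000)}

seconds_per_lookup = timeit.timeit(lambda: store[54321],
                                   number=1_000_000) / 1_000_000
# On commodity hardware this lands well under a microsecond; a hop to an
# out-of-process server would add orders of magnitude on top of it.
assert store[54321] == ("record", 54321)
print(f"~{seconds_per_lookup * 1e9:.0f} ns per in-process lookup")
```

A real embedded engine does more per lookup than a dict (locking, page management), which is how you get from nanoseconds for a hash probe to the cited 5 microseconds — still far below any client/server round trip.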

I wonder whether StreamBase will indefinitely rely on Sleepycat, which is of course now an Oracle product …

June 18, 2007

Mike Stonebraker on financial stream processing

After my call with Truviso and blog post referencing same, I had the chance to discuss stream processing with Mike Stonebraker, who among his many other distinctions is also StreamBase’s Founder/CTO. We focused almost exclusively on the financial trading market. Here are some of the highlights. Read more

June 12, 2007

Thoughts on database management in role-playing games

I’ve just started a research project on the IT-like technology of games and virtual worlds, especially MMORPGs. My three recent posts on Guild Wars attracted considerable attention in GW’s community, and elicited some interesting commentary, especially regarding the revelation of Guild Wars’ very simple database architecture. Specifically, pretty much all character information is banged into a BLOB or two, and stored as a string of tokens, with little of the record-level detail one might expect. By way of contrast, Everquest is run on Oracle (and being transitioned to EnterpriseDB), at least one console-based game maker uses StreamBase, and so on.
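To make the contrast concrete, here is a sketch of what "a string of tokens in a BLOB" looks like next to a relational layout. The token format here is invented for illustration — it is not ArenaNet's actual encoding.

```python
# Illustrative sketch of the two approaches (the token format is made up
# for illustration -- not Guild Wars' actual encoding).

# BLOB approach: the whole character is one opaque token string.
character = {"name": "Devona", "level": 20, "gold": 1500,
             "items": ["sword", "shield"]}

def to_token_blob(c):
    tokens = [c["name"], str(c["level"]), str(c["gold"])] + c["items"]
    return "|".join(tokens)

blob = to_token_blob(character)
# The database sees only one string; you can't ask it
# "which characters hold a shield?" without parsing every BLOB.
assert blob == "Devona|20|1500|sword|shield"

# Relational approach: record-level detail the DBMS can query directly.
characters = [("Devona", 20, 1500)]                       # characters table
inventory  = [("Devona", "sword"), ("Devona", "shield")]  # inventory table
shield_holders = [name for (name, item) in inventory if item == "shield"]
assert shield_holders == ["Devona"]
```

The BLOB design is perfectly fine for loading one character into one game session; it's the cross-character questions — exactly the analytic ones discussed below — that it makes painful.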

Much of the attention has focused on the implications for the in-game economy – how can players buy and sell to their hearts’ content if there’s no transactional back-end? Frankly, I think that’s the least of the issues. For one thing, without a nice forms-based UI you probably won’t create enough transactions to matter, and integrating that into the game client isn’t trivial. For another, virtual items can be literally created and destroyed by the computer, with no negative effect on game play, a factor which drastically reduces the integrity burdens the game otherwise would face.

Rather, where I think the Guild Wars developers at ArenaNet may be greatly missing out is in the areas of business intelligence, data mining, and associated game control. Here are some examples of analyses they surely would find it helpful to do. Read more

June 7, 2007

StreamBase and Truviso

StreamBase is a decently-established startup, possibly the largest company in its area. Truviso, in the process of changing its name from Amalgamated Insight, has a dozen employees, one referenceable customer, and a product not yet in general availability. Both have ambitious plans for conquering the world, based on similar stories. And the stories make a considerable amount of sense.

Both companies’ core product is a memory-centric SQL engine designed to execute queries without ever writing data to disk. Of course, they both have persistence stories too — Truviso by being tightly integrated into open-source PostgreSQL, StreamBase more via “yeah, we can hand the data off to a conventional DBMS.” But the basic idea is to route data through a whole lot of different in-memory filters, to see what queries it satisfies, rather than executing many queries in sequence against disk-based data. Read more
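The inversion both companies share — run the data past standing queries, instead of running queries past stored data — can be sketched roughly like this (my simplification, not either vendor's engine):

```python
# Rough sketch of the stream-processing inversion: standing queries are
# registered once, and each arriving event is pushed through all of them,
# entirely in memory.

class StreamEngine:
    def __init__(self):
        self.standing_queries = []   # (name, predicate) pairs, all in memory

    def register(self, name, predicate):
        self.standing_queries.append((name, predicate))

    def on_event(self, event):
        """Route one event through every in-memory filter; no disk I/O."""
        return [name for name, pred in self.standing_queries if pred(event)]

engine = StreamEngine()
engine.register("big_trade",  lambda e: e["qty"] * e["price"] > 1_000_000)
engine.register("ibm_trades", lambda e: e["symbol"] == "IBM")

matches = engine.on_event({"symbol": "IBM", "qty": 50_000, "price": 105.0})
assert matches == ["big_trade", "ibm_trades"]
```

Real engines add windows, joins, and aggregation over those filters, but the control flow is the same: the data moves, the queries stand still.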

April 11, 2007

ANTs Software is finally making some sense

ANTs Software is in essence a “public venture capital” outfit, with over $100 million in market capitalization and negligible revenue. It also features some interesting ideas in OLTP data management, a new management team (as of last year), and a new strategy. ANTs’ new strategy, in my opinion, stands a better chance of success than its predecessor, which was essentially to tell large enterprises “Throw out Oracle and use ANTs DB instead for your most mission-critical OLTP apps, because it’s faster, cheaper, and compatible.”

There actually are two prongs to ANTs’ new strategy. One of them, however, is a Big Secret that the company adamantly insists I not write about, notwithstanding that it is pretty much spelled out in this press release. The other is high-performance OLTP for specialized apps, in defense, telecom, financial trading, etc. The best way to summarize what “high-performance” means is this: When I asked what the technical sweet spot for ANTs DB was, Engineering VP Rao Yendluri said “Half a million updates per second.” Read more

March 25, 2007

Oracle, Tangosol, objects, caching, and disruption

Oracle made a slick move in picking up Tangosol, a leader in object/data caching for all sorts of major OLTP apps. They do financial trading, telecom operations, big web sites (Fedex, Geico), and other good stuff. This is a reminder that the list of important memory-centric data handling technologies is getting fairly long.

And that’s just for OLTP; there’s a whole other set of memory-centric technologies for analytics as well.

When one connects the dots, I think three major points jump out:

  1. There’s a lot more to high-end OLTP than relational database management.
  2. Oracle is determined to be the leader in as many of those areas as possible.
  3. This all fits the market disruption narrative.

I write about Point #1 all the time. So this time around let me expand a little more on #2 and #3.
Read more

March 24, 2007

Will database compression change the hardware game?

I’ve recently made a lot of posts about database compression. 3X or more compression is rapidly becoming standard; 5X+ is coming soon as processor power increases; 10X or more is not unrealistic. True, this applies mainly to data warehouses, but that’s where the big database growth is happening. And new kinds of data — geospatial, telemetry, document, video, whatever — are highly compressible as well.

This trend suggests a few interesting possibilities for hardware, semiconductors, and storage.

  1. The growth in demand for storage might actually slow. That said, I frankly think it’s more likely that Parkinson’s Law of Data will continue to hold: Data expands to fill the space available. E.g., video and other media have near-infinite potential to consume storage; it’s just a question of resolution and fidelity.
  2. Solid-state (aka semiconductor or flash) persistent storage might become practical sooner than we think. If you really can fit a terabyte of data onto 100 gigs of flash, that’s a pretty affordable alternative. And by the way — if that happens, a lot of what I’ve been saying about random vs. sequential reads might be irrelevant.
  3. Similarly, memory-centric data management is more affordable when compression is aggressive. That’s a key point of schemes such as SAP’s or QlikTech’s. Who needs flash? Just put it in RAM, persisting it to disk just for backup.
  4. There’s a use for faster processors. Compression isn’t free. What you save on disk space and I/O you pay for at the CPU level. Those 5X+ compression levels do depend on faster processors, at least for the row store vendors.
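The storage arithmetic in point #2, and the CPU-for-space trade in point #4, are easy to make concrete. The demo below uses zlib on repetitive, warehouse-style column data; the exact ratios are illustrative, not benchmarks.

```python
import zlib

# Point #2 arithmetic: a 10X ratio turns 1 TB of logical data
# into roughly 100 GB of physical storage.
logical_tb = 1.0
ratio = 10
physical_gb = logical_tb * 1024 / ratio
assert physical_gb == 102.4

# Point #4: compression trades CPU cycles for space. Warehouse-style
# data (repetitive, low-cardinality columns) compresses far better
# than typical row-store data.
column = ("gold\n" * 800 + "silver\n" * 150 + "bronze\n" * 50).encode()
compressed = zlib.compress(column, level=6)
assert len(column) / len(compressed) > 10   # highly repetitive data
```

The decompression cost on every scan is the piece that ties compression ratios to processor speed: the bigger the ratio, the more CPU each query burns rehydrating the data.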

February 13, 2007

QlikTech – flexible, memory-centric, columnar BI

QlikTech has a pretty interesting story, and a number of customers seem to agree. Their flagship product QlikView is a BI suite that runs off an in-memory copy of the data. Specifically, that copy is logically relational and physically columnar. In an important feature, QlikView is happy to import data from multiple sources at once, such as a warehouse plus an operational data store.

So the QlikTech pitch is essentially “Buy our stuff, and you can start doing BI immediately, running any queries and reports you want to. No reason to limit your queries to any kind of dimensional model. No need to prepare the data.” More precisely, QlikTech claims to do away with some kinds of data preparation; obviously, cleaning and so on might still be necessary. Indeed, they describe their classic use case as being the combination of data partly from an operational store and partly from a pre-existing warehouse. Read more
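"Logically relational, physically columnar" can be sketched as follows — my illustration of the general technique, not QlikView's internals: each column lives as its own in-memory array, and a query touches only the columns it actually needs.

```python
# Sketch of an in-memory columnar table: logically a relational table,
# physically one array per column. (Generic technique, not QlikView code.)

class ColumnStore:
    def __init__(self, column_names):
        self.columns = {name: [] for name in column_names}

    def insert(self, row):
        for name, value in row.items():
            self.columns[name].append(value)

    def select(self, out_col, where_col, predicate):
        """Scan only the two columns involved; the others stay untouched."""
        return [v for v, w in zip(self.columns[out_col],
                                  self.columns[where_col]) if predicate(w)]

# Importing from multiple sources at once is just more inserts.
sales = ColumnStore(["region", "product", "revenue"])
sales.insert({"region": "EMEA", "product": "widget", "revenue": 100})  # from a warehouse
sales.insert({"region": "APAC", "product": "widget", "revenue": 250})  # from an ODS
assert sales.select("revenue", "region", lambda r: r == "APAC") == [250]
```

Because each column is stored contiguously (and, in real products, compressed), ad hoc queries can brute-force scan in memory without any pre-built dimensional model — which is the heart of the "no data preparation" pitch.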
