Archiving and information preservation
Analysis of technologies related to database archiving and information preservation. Related subjects include:
Notes on the Oracle Database 11g Release 2 white paper
The Oracle Database 11g Release 2 white paper I cited a couple of weeks ago has evidently been edited, given that a phrase I quoted last month is no longer to be found. Anyhow, here are some quotes from and comments on what evidently is the latest version. Read more
Merv Adrian on SAND Technology
Merv Adrian blogged about SAND Technology, casting significant doubt on SAND’s business prospects. At this point, I can’t say I disagree. On the other hand, SAND does have public, audited financial statements showing it generating more revenue than a lot of other analytic DBMS or archiving vendors probably make. Columnar DBMS vendors doing better than SAND are Sybase, Vertica, maybe Infobright — and who else?
Categories: Archiving and information preservation, Columnar database management, Data warehousing, SAND Technology | 1 Comment |
The secret sauce to Clearpace’s compression
In an introduction to archiving vendor Clearpace last December, I noted that Clearpace claimed huge compression successes for its NParchive product (Clearpace likes to use a figure of 40X), but didn’t give much reason that NParchive could compress a lot more effectively than other columnar DBMS. Let me now follow up on that.
To the extent there’s a Clearpace secret sauce, it seems to lie in NParchive’s unusual data access method. NParchive doesn’t just tokenize the values in individual columns; it tokenizes multi-column fragments of rows. Which particular columns to group together in that way seems to be decided automagically; the obvious guess is that this is based on estimates of the cardinality of their Cartesian products.
Of the top of my head, examples for which this strategy might be particularly successful include:
- Denormalized databases
- Message stores with lots of header information
- Addresses
Categories: Archiving and information preservation, Columnar database management, Database compression, Rainstor | 8 Comments |
Database archiving and information preservation
Two similar companies reached out to me recently – SAND Technology and Clearpace. Their current market focus is somewhat different: Clearpace talks mainly of archiving, and sells first and foremost into the compliance market, while SAND has the most traction providing “near-line” storage for SAP databases.* But both stories boil down to pretty much the same thing: Cheap, trustworthy data storage with good-enough query capabilities. E.g., I think both companies would agree the following is a not-too-misleading first-approximation characterization of their respective products:
- Fully functional relational DBMS.
- Claims of fast query performance, but that’s not how they’re sold.
- Huge compression.
- Careful attention to time-stamping and auditability.
Categories: Archiving and information preservation, Database compression, Rainstor, SAND Technology | 3 Comments |
Introduction to Clearpace
Clearpace is a UK-based startup in a similar market to what SAND Technology has gotten into – DBMS archiving, with a strong focus on compression and general cost-effectiveness. Clearpace launched its product NParchive a couple of quarters ago, and says it now has 25 people and $1 million or so in revenue. Clearpace NParchive technical highlights include: Read more
Categories: Archiving and information preservation, Rainstor | 1 Comment |
Introduction to SAND Technology
SAND Technology has a confused history. For example:
- SAND has been around in some form or other since 1982, starting out as a Hitachi reseller in Canada.
- In 1992 SAND acquired a columnar DBMS product called Nucleus, which originally was integrated with hardware (in the form of a card). Notwithstanding what development chief Richard Grodin views as various advantages vs. Sybase IQ, SAND has only had limited success in that market.
- Thus, SAND introduced a second, similarly-named product, which could also be viewed as a columnar DBMS. (As best I can tell, both are called SAND/DNA.) But it’s actually focused on archiving, aka the clunkily named “near-line storage.” And it’s evidently not the same code line; e.g., the newer product isn’t bit-mapped, while the older one is.
- The near-line product was originally focused on the SAP market. Now it’s moving beyond.
- Canada-based SAND had offices in Germany and the UK before it did in the US. This leads to an oddity – SAND is less focused on the SAP aftermarket in Germany than it still is in the US.
SAND is publicly traded, so its numbers are on display. It turns out to be doing $7 million in annual revenue, and losing money.
OK. I just wanted to get all that out of the way. My main thoughts about the DBMS archiving market are in a separate post.