Storage
Analysis of storage technologies, especially in the context of database management.
ParAccel PADB technical notes
I posted last October about PADB (ParAccel Analytic DataBase), but held back on various topics since PADB 3.0 was still under NDA. By the time PADB 3.0 was released, I was on blogging hiatus. Let’s do a bit of ParAccel catch-up now.
One big part of PADB 3.0 was an analytics extensibility framework. If we match PADB against my recent analytic computing system checklist, Read more
Categories: Analytic technologies, Data warehousing, EMC, MapReduce, ParAccel, Parallelization, Storage | 2 Comments |
Schooner — flash-based, now software-only, and very fast
Last October I wrote about Schooner Information Technology, which made flash-based appliances, for MySQL, memcached, or persistent memcached. Schooner sold those appliances to close to 20 customers, but even so decided software-only was a better way to go.
Schooner’s core value proposition is that one Schooner box with flash does the job of a lot of MySQL or NoSQL boxes with hard drives. Highlights of the Schooner story — of which you can find more detail at the Schooner website — now include: Read more
Categories: Clustering, memcached, MySQL, OLTP, Schooner Information Technology, Solid-state memory | 4 Comments |
Architectural options for analytic database management systems
Mike Stonebraker recently kicked off some discussion about desirable architectural features of a columnar analytic DBMS. Let’s expand the conversation to cover desirable architectural characteristics of analytic DBMS in general. Read more
Introduction to Kaminario
At its core, the Kaminario story is simple:
- Throw out your disks and replace them with, not Flash, but actual DRAM.
- Your IOPS (Input/Output operations Per Second) are so high* that you get the performance you need without any further system changes.
- The whole thing is very fast to set up.
In other words, Kaminario pitches a value proposition something like (my words, not theirs) “A shortcut around your performance bottlenecks.”
*1 million or so on the smallest Kaminario K2 appliance.
Kaminario asserts that both analytics and OLTP (OnLine Transaction Processing) are represented in its user base. Even so, the use cases Kaminario mentioned seemed to be concentrated on the analytic side. I suspect there are two main reasons:
- As Kaminario points out, OLTP apps commonly are designed to perform in the face of regrettable I/O wait.
- Also, analytic performance problems tend to arise more suddenly than OLTP ones do.*
*Somebody can think up a new analytic query overnight that takes 10 times the processing of anything they’ve ever run before. Or they can get the urge to run the same queries 10 times as often as before. Both of those kinds of things happen less often in the OLTP world.
Accordingly, Kaminario likes to sell against the alternative of getting a better analytic DBMS, stressing that you can get a Kaminario K2 appliance into production a lot faster than you can move your processing to even the simplest data warehouse appliance. Kaminario is probably technically correct in saying that; even so, I suspect it would often make more sense to view Kaminario K2 appliances as a transition technology, by which I mean:
- You have an annoying performance problem.
- Kaminario K2 could solve it very quickly.
- That buys you time for a more substantive fix.*
- If you want, you can redeploy your Kaminario K2 storage to solve your next-worst performance bottleneck.
On that basis, I could see Kaminario-like devices eventually getting to the point that every sufficiently large enterprise should have some of them, whether or not that enterprise has an application it believes should run permanently against DRAM block storage. Read more
Categories: Investment research and trading, Kaminario, Solid-state memory, Storage, Telecommunications, Web analytics | 7 Comments |
Where ParAccel is at
Until recently, I was extremely critical of ParAccel’s marketing. But there was an almost-clean sweep of the relevant ParAccel executives, and the specific worst practices I was calling out have for the most part been eliminated. So I was open to talking and working with ParAccel again, and that’s now happening. On my recent California trip, I chatted with three ParAccel folks for a few hours. Based on that and other conversations, here’s the current ParAccel story as I understand it.
Notes on the EMC Greenplum Data Computing Appliance
The big confidential part of my visit last week to EMC’s Data Computing Division, née Greenplum, was of course this week’s announcement of the first EMC/Greenplum “Data Computing Appliance.” Basics include: Read more
Categories: Analytic technologies, Data warehousing, EMC, Exadata, Greenplum, Oracle, Parallelization, Storage | 1 Comment |
Quick introduction to Schooner Information Technology appliances
Back in August I talked with John Busch of Schooner Information Technology, which has a non-obvious URL. Schooner Information Technology sells Flash-based appliances that are mainly intended to run MySQL with blazing write performance.
This is one of those cases in which I warned that due to my September wave of family health issues I would cut a few blogging corners, so:
- I’m only going to write about the MySQL aspect, even though Schooner has a memcached product and claims to be able to run other NoSQL stuff as well.
- I’m not going to dig for company information beyond recalling:
  - Schooner said that it has invested $20 million in R&D.
  - Schooner’s appliances are resold by IBM.
  - Schooner also has a direct sales force.
  - One flagship customer had 30 TB of data on 17 Schooner nodes.
If Schooner wants to add some of what I’ve left out into the comments to this post, that would be great.
Schooner appliances are meant to be clustered, Read more
Categories: memcached, MySQL, OLTP, Parallelization, Schooner Information Technology, Solid-state memory | 4 Comments |
Notes and links October 3 2010
Some notes, follow-up, and links before I head out to California: Read more
Categories: GIS and geospatial, Google, HP and Neoview, Humor, Kickfire, Netezza, Solid-state memory, Teradata, Web analytics | 3 Comments |
Some thoughts on the announcement that IBM is buying Netezza
As you’ve probably read, IBM and Netezza announced a deal today for IBM to buy Netezza. I didn’t sit in on the conference call, but I’ve seen the reporting. Naturally, I have some quick thoughts, which I’ve broken up into several sections below:
- Clearing some underbrush.
- Speculation about what IBM/Netezza will do.
- Speculation about alternative acquirers for Netezza.
- Speculation about what IBM/Netezza competitors will do.
Vertica’s innovative architecture for flash, plus more about temp space than you perhaps wanted to know
Vertica is announcing:
- Technology it already has released*, but has not published any reference architectures for.
- A Barney partnership.**
In other words, Vertica has succumbed to the common delusion that it’s a good idea to put out half-baked press releases the week of TDWI conferences. But if we look past that kind of all-too-common nonsense, Vertica is highlighting an interesting technical story, about how the analytic DBMS industry can exploit solid-state memory technology.
*Upgrades to Vertica FlexStore to handle flash memory, actually released as part of Vertica 4.0.
**With Fusion-io.
To set the context, let’s recall a few points I’ve noted in the past:
- Solid-state memory’s price/throughput tradeoffs obviously make it the future of database storage.
- The flash future is coming soon, in part because flash’s propensity to wear out is overstated. This is especially true in the case of modern analytic DBMS, which tend to write to blocks all at once, and most particularly the case for append-only systems such as Vertica.
- Being able to intelligently split databases among various cost tiers of storage – e.g. flash and disk – makes a whole lot of sense.
Taken together, those points tell us:
For optimal price/performance, analytic DBMS should support databases that run part on flash, part on disk.
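To make the idea of splitting a database across cost tiers concrete, here is a minimal sketch in Python. It is not Vertica's actual placement logic; the `Segment` class, the `place_segments` function, and all sizes and read counts are hypothetical, chosen only for illustration. The sketch assumes a simple greedy policy: rank data by reads per GB, and put the hottest data on flash until the flash budget runs out.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """A chunk of the database (hypothetical illustration, not a real API)."""
    name: str
    size_gb: float
    reads_per_day: float  # assumed access-frequency estimate


def place_segments(segments, flash_capacity_gb):
    """Greedy tiering: the hottest data per GB goes to flash while it fits."""
    placement = {}
    remaining = flash_capacity_gb
    # Sort by access density (reads per GB), hottest first.
    ranked = sorted(
        segments,
        key=lambda s: s.reads_per_day / s.size_gb,
        reverse=True,
    )
    for seg in ranked:
        if seg.size_gb <= remaining:
            placement[seg.name] = "flash"
            remaining -= seg.size_gb
        else:
            placement[seg.name] = "disk"
    return placement


# Made-up example data: a small flash budget against a mostly cold database.
segments = [
    Segment("recent_facts", 200, 50_000),
    Segment("dimension_tables", 50, 20_000),
    Segment("old_facts", 2_000, 1_000),
    Segment("temp_space", 100, 30_000),
]
placement = place_segments(segments, flash_capacity_gb=400)
for name, tier in placement.items():
    print(f"{name}: {tier}")
```

With these made-up numbers, 350 GB of flash holds everything except the large, rarely read `old_facts` segment, which is the general shape of the argument: a modest amount of flash can serve most of the I/O.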
While all this is a future for some other analytic DBMS vendors, Vertica is shipping it today.* What’s more, three aspects of Vertica’s architecture make it particularly well-suited for hybrid flash/disk storage, in each case for a similar reason – you can get most of the performance benefit of all-flash for a relatively low actual investment in flash chips: Read more