Interpreting the results of data warehouse proofs-of-concept (POCs)
When enterprises buy new brands of analytic DBMS, they almost always run proofs-of-concept (POCs) in the form of private benchmarks. The results are generally confidential, but that doesn't keep a few stats from occasionally leaking out. As I noted recently, those leaks are problematic on multiple levels. For one thing, even if the results are taken as accurate and essentially not misleading, the way vendors describe them leaves a lot to be desired.
Here's a concrete example to illustrate the point. One of my vendor clients sent over the stats from a recent POC, in which its data warehousing product was compared against a name-brand incumbent. Sixteen reports were run. The new product beat the old 16 out of 16 times. The lowest margin was a 1.8X speed-up, while the best was a whopping 335.5X.
My client helpfully took the "simple average" (i.e., the mean) of the 16 factors, and described this as an average 62X drubbing. But is that really fair? The median speed-up was only 17X. And in a figure I find more meaningful than either of those, the total reduction in execution time, assuming each of the reports was run the same number of times, was "just" 12 times. Edit: As per the comments below, another option is the geometric mean speed-up. That turns out to be 19X in this case.
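To make the differences among those summaries concrete, here is a minimal Python sketch. The (old, new) runtimes are made up for illustration; they are not the actual POC figures.

```python
import statistics

# Hypothetical (old_seconds, new_seconds) runtimes per report --
# illustrative numbers only, not the actual POC data.
runtimes = [(600, 333), (900, 53), (1200, 40), (3355, 10)]

speedups = [old / new for old, new in runtimes]

print("arithmetic mean speed-up:", statistics.mean(speedups))
print("median speed-up:", statistics.median(speedups))
print("geometric mean speed-up:", statistics.geometric_mean(speedups))  # Python 3.8+

# Total reduction in execution time, assuming each report is run
# the same number of times: total old time over total new time.
total = sum(old for old, _ in runtimes) / sum(new for _, new in runtimes)
print("total execution-time reduction:", total)
```

Note how a single outlier (the 3355-to-10 report) drags the arithmetic mean far above the median and geometric mean, which is exactly the distortion at issue here.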
Now, 12X is a whopping speed-up, and this was a very successful POC for the challenger. But calling it 62X is just silly, and that was the point of my earlier post.
So how should POC numbers be weighted? Ideally, one could calculate a big weighted sum: "Our daily workload will be a lot like 2,000 copies each of Queries 1, 2, 3, 4, and 5; 300 copies each of BigQueries 6 and 7; 25 copies of MegaQuery 8; and a copy of DestructoQuery 9; all multiplied by a factor of 17."
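In code, such a projection is just a weighted sum. A sketch, where the query names, daily counts, and per-run times are all hypothetical:

```python
# Hypothetical daily workload mix: query -> (copies per day,
# seconds per run on the candidate system). Illustrative only.
workload = {
    "Query 1": (2000, 2.0),
    "Query 3": (2000, 3.5),
    "BigQuery 6": (300, 45.0),
    "MegaQuery 8": (25, 600.0),
    "DestructoQuery 9": (1, 7200.0),
}

GROWTH_FACTOR = 17  # overall multiplier from the example above

daily_seconds = sum(copies * secs for copies, secs in workload.values())
print("projected daily load:", daily_seconds * GROWTH_FACTOR, "seconds")
```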
But to come up with reasonable projections, it is not enough to look at past usage. After all, if the price of Query 3 goes down by 5X, while the cost of Query 8 goes down by a factor of 50, the relative consumption of Queries 3 and 8 is apt to change significantly. That’s just the economics of supply and demand.
Bottom line: The more accurately you can predict future data warehouse use, the more confidently you can choose the analytic database technology that’s best for you.
Comments
Just as a thought, the natural form of a "mean" for a set of values that are factors/multiples would be a *geometric* mean, computed as:
M = (f_1 × f_2 × … × f_n)^(1/n)

i.e., the nth root of the product of the n factors.
This still suffers from the typical problems of a “mean,” but at least it’s suited to the thing being measured.
The need for artificial weighting goes away; this provides a relatively unbiased measure of the “midpoint” of the multipliers.
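For what it's worth, that formula is nearly a one-liner in Python (a sketch; `factors` would be the per-query speed-up multiples, and the sample values are made up):

```python
import math

def geometric_mean(factors):
    # nth root of the product of the n factors
    return math.prod(factors) ** (1 / len(factors))  # math.prod: Python 3.8+

print(geometric_mean([2.0, 5.0, 20.0, 300.0]))  # illustrative factors only
```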
Of course the *real* improvement is to have a valuation metric tied to what you’re actually using the system for.
Thus, if you've got 15 tests, 14 of which showed immaterial effects on their runtimes, and the 15th of which is Truly Essential to run in under 24 hours (which it currently doesn't), then *really*, the evaluation of goodness properly ought to be based almost entirely on that 15th report's runtime.
Now, that’s TOTALLY context sensitive. If the DW tool spectacularly improves performance on that one report of yours, there is no reason to expect it to have any particular relevant effect on *my* workload.
What we’d like is some way to be able to generalize from performance on your workload to assert something about expected performance on my workload.
Unfortunately, I’m not sure there’s anything other than a POC that can really get at the low level factors that contribute to the actual behaviour. A geometric mean may be better than an arithmetic one by some small margin, but it seems to me you need *way* more parameters than one number/factor in order to do any generalization.
Great comment, Chris; I couldn't agree more. As a rule, Vertica uses geometric mean query time (gmqt) to calculate the "nX-times faster" summary (e.g., http://www.vertica.com/benchmarks displays that, although we drop the word "geometric" in the tables to save space).
We'll make an exception to this if a customer uses another metric (like 'average query time'); if that's what mattered to the customer/evaluator, then who are we to override them?
IT’S ALL ABOUT SLAs…
Another metric we often see, and I think this is what REALLY matters to customers, is related to the service level agreement (SLA) the database application must meet.
It’s the % of queries that run under ‘n’ seconds (or some other unit of time, usually).
In a recent POC, Vertica outperformed a competitive database by 19X (gmqt). That's solid, but not very flashy from a marketing perspective.
But the gmqt comparison didn’t matter at all to the prospect. What did matter was that 100% of the queries were answered in under 10 seconds (their performance SLA) vs. 0% for the competitor.
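That kind of SLA metric is trivial to compute from per-query timings. A sketch in Python, with made-up response times rather than the POC's actual numbers:

```python
SLA_SECONDS = 10  # the performance SLA threshold

def pct_within_sla(times, threshold=SLA_SECONDS):
    """Percentage of queries answered within the SLA threshold."""
    return 100.0 * sum(t <= threshold for t in times) / len(times)

# Hypothetical per-query response times (seconds) for two systems.
system_a_times = [1.2, 0.8, 4.5, 9.9, 3.3]
system_b_times = [15.0, 42.0, 11.0, 120.0, 30.0]

print(pct_within_sla(system_a_times))  # 100.0
print(pct_within_sla(system_b_times))  # 0.0
```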
We also see compression (Raw data volume : DB size) measured and compared quite often in POCs. It directly relates to cost of ownership over time.
To sum up, the value of the DBMS to the customer and the motivation to buy it is often based on SLAs (as it should be). Factoring in SLAs (e.g., % of queries that meet SLA) puts the POC results into a context business people will understand (and fund, hopefully).
my 2 cents…
Good comments!
I’ll update the post and spreadsheet w/ a geometric option as soon as I can.
Best,
CAM
Curt-
Are the numbers with the heading “Old running time” the numbers from the current production environment? I’m wondering if this is another case of comparing new product on new hardware with new vendor supervision/assistance to current product on old hardware w/o vendor assistance.
Are the units minutes or seconds?
Dominika,
1. I believe so.
2. Seconds.
CAM