Data warehouse appliances
Analysis of data warehouse appliances – i.e., of hardware/software bundles optimized for fast query and analysis of large volumes of (usually) relational data. Related subjects include:
- Data warehousing
- Parallelization
- Netezza
- DATAllegro
- Teradata
- Kickfire
- (in The Monash Report) Computing appliances in multiple domains
Oracle cites Exadata wins
A couple of weeks ago, Oracle put out a press release about Exadata wins. Highlights include:
- The names of 20 actual customers.
- One quote citing a competitive win (over Netezza).
- One quote citing a ~50X speedup of one query “without manual tuning”.
- One quote citing consistent 10-72X query performance speedups.
- One quote citing a speedup from “days” to “minutes”.
Unless I missed it, none of the quotes implied Exadata was actually in production, and none compared hardware between the old/slow/production and Exadata/fast/test systems.
Categories: Data warehouse appliances, Data warehousing, Exadata, Market share and customer counts, Netezza, Oracle
Correction to a recent quote
I’m quoted in a recent article about Aster’s appliance announcement as saying data warehouse appliances are more suitable for small workgroups of analysts crunching small amounts of data than they are for other uses.
But that’s not what I think at all.
I do think the ease-of-administration pitch for appliances makes them particularly well suited for users who want to scrape by without doing much database administration. This is especially appealing to departments or smaller enterprises. And the first/best scenario that comes to mind is indeed a small team of analysts, with good SQL skills but lightweight DBA experience, although Netezza has proved that many other kinds of users can find appliances appealing as well.
But that small team of analysts may maintain the largest database in the firm.
And by the way — notwithstanding the MySpace counterexample, most of Aster’s initial customers had <10 terabyte databases, and I think indeed <5 terabytes. The “frontline” pitch succeeded for Aster before (MySpace again aside) any better-big-data-crunching story did.
Categories: Analytic technologies, Aster Data, Data warehouse appliances, Data warehousing, Theory and architecture
XtremeData readies a different kind of FPGA-based data warehouse appliance
XtremeData called me to talk about its plans in the data warehouse appliance business, almost all details of which are currently embargoed. Still, a few points may be worth noting ahead of more precise information, namely:
- XtremeData’s basic idea is to take a custom board and build a data warehouse appliance around it.
- An XtremeData board looks a lot like a conventional two-socket board, but has only one four-core CPU. In addition, it sports some FPGAs (Field-Programmable Gate Arrays).
- In the XtremeData appliance, the FPGAs will be used for core SQL processing, after the data is ingested via conventional I/O. This is different from Netezza’s approach to FPGA-based data warehouse appliances, in which the FPGA sits in the place of a disk controller and touches the data first, before passing it off to a more or less conventional CPU. (A simplified sketch of the contrast appears right after this list.)
- While preparing its entry into the data warehouse appliance business, XtremeData has sold its board to 150 other outfits, many quite impressive. Buyers seem to be FPGA users who previously had to craft their own custom boards. According to XtremeData, major uses by these customers include:
- Military/intelligence/digital signal processing.
- Military/intelligence/cybersecurity (a newish area for XtremeData).
- Bioinformatics/high-throughput gene sequencing (a “handful” of customers).
- Medical imaging.
- More or less pure university research of various sorts (around 50 customers).
- … but not database management.
- XtremeData’s website has a non-obvious URL. 🙂
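To make the pipeline contrast above concrete, here is a deliberately simplified sketch in Python. Every name and data shape in it is my own illustration rather than either vendor’s actual interface; the only point it carries is where the FPGA sits relative to the CPU in each design.

```python
# A toy model of the two FPGA placements discussed above. Purely
# illustrative; neither vendor's real interfaces look like this.

def fpga_stage(rows, predicate):
    """Stand-in for work offloaded to the FPGA (here, just filtering)."""
    return [r for r in rows if predicate(r)]

def cpu_stage(rows):
    """Stand-in for work done on the conventional CPU."""
    return list(rows)

def netezza_style(disk_blocks, predicate):
    """FPGA sits in place of a disk controller: it touches the data
    first, so the CPU only ever sees rows that survive filtering."""
    for block in disk_blocks:
        yield from cpu_stage(fpga_stage(block, predicate))

def xtremedata_style(disk_blocks, predicate):
    """Conventional I/O path: the CPU ingests raw blocks first, then
    hands core SQL processing off to the FPGAs."""
    for block in disk_blocks:
        yield from fpga_stage(cpu_stage(block), predicate)

if __name__ == "__main__":
    blocks = [[1, 7, 3], [9, 2, 8]]
    keep = lambda x: x > 4
    # Same answers either way; what differs is which silicon does the
    # work, and how much raw data crosses the I/O path to the CPU.
    assert list(netezza_style(blocks, keep)) == list(xtremedata_style(blocks, keep))
```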
So far as I can tell, XtremeData’s 1.0 product will — like most other 1.0 analytic database management products — be focused on price/performance, with little or no positive differentiation in the way of features.
Categories: Data warehouse appliances, Data warehousing, Netezza, Theory and architecture, XtremeData
Aster Data enters the appliance game
Aster Data is rolling out a line of nCluster appliances today. Highlights include:
- Configurations ranging from 6.25 terabytes to 1 petabyte of user data. (Edit: Here’s the up-to-date data sheet.)
- A $50K “Express Edition” price for <1 terabyte of user data. Unfortunately, that’s the only stated price.
- The option of bundled MicroStrategy.
- “MapReduce” in the name, which suggests something about the positioning — i.e., enterprise decision support, rather than Aster’s usual web/”frontline” emphasis. (Edit: That also fits with Aster’s recent MapReduce-for-.NET announcement.) (Edit: Actual name is Aster MapReduce Data Warehouse Appliance.)
- Claims that because Aster runs effectively on cheaper, more truly “commodity” hardware than competitors, you get more hardware bang for the buck if you buy from Aster.
I don’t have a lot more to add right now, mainly because I wrote at some length about Aster’s non-appliance-specific, non-MapReduce technology and positioning a couple of weeks ago.
Categories: Analytic technologies, Aster Data, Business intelligence, Data warehouse appliances, Data warehousing, Database compression, MapReduce, Pricing
Two lessons from Dataupia’s troubles
I’ve been beating my head against the wall trying to convince startups of two well-established truisms:
- Experience consistently shows that the demand for transparency/emulation features isn’t as great as entrepreneurs hope.
- If a startup’s competitors sell directly to enterprises, an indirect sales strategy rarely succeeds.
Maybe some of those startups will learn from Dataupia’s example.
Dataupia’s troubles are now confirmed
Todd Fin pointed me yesterday to an article by Wade Roush that confirmed in detail the layoffs and other troubles at Dataupia. The article quotes Dataupia marketing VP Samantha Stone as saying Dataupia is down to 23 employees, and that some of the layoffs were in engineering. This is consistent with what I’d been hearing for a while, namely that other analytic DBMS vendors were seeing a flood of Dataupia resumes, especially technical ones.
The article goes on to discuss difficulties Dataupia has had in raising another round of financing. During Dataupia’s very long CEO search — which I kept hearing about from people who’d been approached for the job — it was obvious money wouldn’t come in until a CEO was found. But it seems that even with a new CEO, existing investors are reluctant to re-up without a new investor as well, and that new investment is slow in happening.
On the plus side, the article quotes Samantha as saying founder Foster Hinshaw is recovering well from his heart surgery.
Categories: Data warehouse appliances, Data warehousing, Dataupia, Emulation, transparency, portability
Netezza Q1 earnings call transcript
I finally read the Netezza Q1 earnings call transcript, put out by Seeking Alpha. Highlights included:
- Netezza got 14 new-name accounts and 21 follow-on deals. Average sale in both groups was right around $1 million.
- The economy is tough, deals are slipping, and nobody knows for sure what will happen.
- Netezza’s main head-to-head competitors are Oracle and Teradata. Netezza claims good but not perfect win rates against each, but concedes that those vendors (especially Oracle) of course get other deals Netezza never sees.
- Netezza characterizes Teradata as offering multiple product lines, trying to upsell many customers from cheaper to more expensive ones, and being selectively aggressive about pricing. None of this is surprising to me.
- 80% of Netezza’s Q1 revenue, and perhaps even a higher fraction of new-name accounts, was in four vertical markets: “Digital media,” telecom, government, and financial services.
- Some time over the next few months, Netezza will give at least some more clarity about future products.
One tip for the Netezza folks, by the way, from this former stock analyst — you should never use the word “certainly” about a deal you haven’t closed yet. “Almost surely” could be OK, but “certainly” — well, it certainly was not the thing to say.
The future of data marts
Greenplum is announcing today a long-term vision, under the name Enterprise Data Cloud (EDC). Key observations around the concept — mixing mine and Greenplum’s together — include:
- Data marts aren’t just for performance (or price/performance). They also exist to give individual analysts or small teams control of their analytic destiny.
- Thus, it would be really cool if business users could have their own analytic “sandboxes” — virtual or physical analytic databases that they can manipulate without breaking anything else.
- In any case, business users want to analyze data when they want to analyze it. It is often unwise to ask business users to postpone analysis until after an enterprise data model can be extended to fully incorporate the new data they want to look at.
- Whether or not you agree with that, it’s an empirical fact that enterprises have many legacy data marts (or even, especially due to M&A, multiple legacy data warehouses). Similarly, it’s an empirical fact that many business users have the clout to order up new data marts as well.
- Consolidating data marts onto one common technological platform has important benefits.
In essence, Greenplum is pitching the story:
- Thesis: Enterprise Data Warehouses (EDWs)
- Antithesis: Data Warehouse Appliances
- Synthesis: Greenplum’s Enterprise Data Cloud vision
When put that starkly, it’s overstated, not least because
Specialized Analytic DBMS != Data Warehouse Appliance
But basically it makes sense, for two main reasons:
- Analysis is performed on all sorts of novel data, from sources far beyond an enterprise’s core transactions. This data neither has to fit nor particularly benefits from being tightly fitted into the core enterprise data model. Requiring it to do so is just an unnecessary and painful bureaucratic delay.
- On the other hand, consolidation can be a good idea even when systems don’t particularly interoperate. Data marts, which commonly do in part interoperate with central data stores, have all the more reason to be consolidated onto a central technology platform/stack.
Daniel Abadi on Kickfire and related subjects
Daniel Abadi has a new blog, whose first post centers on Kickfire. The money quote is (emphasis mine):
In order for me to get excited about Kickfire, I have to ignore Mike Stonebraker’s voice in my head telling me that DBMS hardware companies have been launched many times in the past and ALWAYS fail (the main reasoning is that Moore’s law allows for commodity hardware to catch up in performance, eventually making the proprietary hardware overpriced and irrelevant). But given that Moore’s law is transforming into increased parallelism rather than increased raw speed, maybe hardware DBMS companies can succeed now where they have failed in the past.
Good point.
More generally, Abadi speculates about the market for MySQL-compatible data warehousing. My responses include:
- OF COURSE there are many MySQL users who need to move to a serious analytic DBMS.
- What’s less clear is whether there’s any big advantage to those users in remaining MySQL-compatible when they do move. I’m not sure what MySQL-specific syntax or optimizations they’d have that would be difficult to port to a non-MySQL system. (A couple of illustrative examples appear after this list.)
- It’s nice to see Abadi speaking well of Infobright and its technology.
- To say that Infobright went open source because it was “desperate” is overstated. That said, I don’t think Infobright was on track to prosper without going open source.
- While open source and MySQL go together, an appliance like Kickfire loses many (not all) of the benefits of open source.
- Calpont has indeed never disclosed a customer win. Any year now … (Just kidding, Vogel!)
- In general, seeing Abadi be so favorable toward Vertica competitors adds credibility to the recent Hadoop vs. DBMS paper.
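To illustrate the porting question raised in the second bullet above, here is a hypothetical before-and-after. Both queries are invented for this post, not drawn from any real workload, and the portable rewrite’s exact function names vary by target DBMS:

```python
# Hypothetical examples of MySQL-specific syntax that a migrating
# application might contain; invented for illustration only.

MYSQL_QUERY = """
    SELECT `region`,                           -- backtick identifier quoting
           GROUP_CONCAT(city) AS cities,       -- MySQL-only aggregate
           IFNULL(SUM(revenue), 0) AS revenue  -- MySQL spelling of COALESCE
    FROM sales
    GROUP BY `region`
    LIMIT 10, 5                                -- MySQL's "offset, count" form
"""

# A rough ANSI-leaning equivalent; string aggregation in particular is
# spelled differently by different targets (LISTAGG, STRING_AGG, etc.).
PORTABLE_QUERY = """
    SELECT region,
           STRING_AGG(city, ',') AS cities,
           COALESCE(SUM(revenue), 0) AS revenue
    FROM sales
    GROUP BY region
    OFFSET 10 ROWS FETCH NEXT 5 ROWS ONLY
"""
```

If that’s a fair sample, MySQL compatibility looks like a porting convenience rather than a moat.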
Anyhow, as previously noted, I’m a big Daniel Abadi fan. I look forward to seeing what else he posts in his blog, and am optimistic he’ll live up to or exceed its stated goals.
Categories: Calpont, Columnar database management, Data warehouse appliances, Data warehousing, DBMS product categories, Infobright, Kickfire, MySQL, Open source, Theory and architecture
Oracle’s hardware strategy
Larry Ellison stated clearly in an email interview with Reuters (links here and here) that Oracle intends to keep Sun’s hardware business and indeed intends to invest in the SPARC chip. Naturally, I have a few thoughts about this.
As Stephen O’Grady points out, Sun’s main strength lay in selling to the large enterprise market. Well, that’s Oracle’s overwhelming focus too. As I noted two years ago:
One Oracle response is to provide lots of add-on technologies for high-end customers, on the database and middle tiers alike. In app servers it’s done surprisingly well against BEA. It’s sold a lot of clustering. And it’s bought into and tried to popularize niche technologies like TimesTen and Tangosol’s.
This all makes perfect sense – it’s a great fit for Oracle’s best customers, and a way to get thousands of extra dollars per server from enterprises that may already have bought all-you-can-eat licenses to the Oracle DBMS. And being so sensible, it fits into the Clayton Christensen disruption story in two ways:
- Oracle may be helpless against mid-tier competition, but it sure has the high-end core of its market locked up.
- As one type of technology is commoditized, value is created in other parts of the technology stack.
Oracle’s ongoing acquisition spree in system software, application software, and now hardware just supports that story. MySQL, embedded Java, and so on may be welcome to Oracle as yet more opportunities to tap additional markets — but Oracle’s emphasis is and surely will remain on the large enterprise market.
The next notable point may be found in Larry’s key quote.
Categories: Data warehouse appliances, Data warehousing, Exadata, HP and Neoview, IBM and DB2, Oracle