September 22, 2006
Competitive issues in data warehouse ease of administration
The last person I spoke with at the Netezza conference on Tuesday was a customer/presenter that the company had picked out for me. One thing he said baffled me: he claimed that Netezza was a real appliance vendor, but DATAllegro wasn’t, presumably due to administrability issues. It wasn’t clear to me that he’d ever evaluated DATAllegro, so I didn’t take this too seriously. Still, the exchange brought into focus how much data warehouse products differ in the area of administration. For example:
- Netezza has no indices at all. And no caches. And the hardware is preconfigured. This all makes administration pretty simple.
- DATAllegro has almost no indices, and also has preconfigured hardware. But it has some partitioning, optionally.
- Teradata also has preconfigured hardware. It does have indices, but rather simple ones. Plus it has join indices. And it has a few more configuration options in other areas (e.g., block size) than the other appliance vendors. (Yes, I count Teradata among the appliances.)
- If you go through all the fuss of installing SAP’s applications and BI technology anyway, the incremental administration of just SAP BI Accelerator is pretty light.
- Oracle and IBM have mammothly complex indexing options, but have put large amounts of work into tools to lessen the resulting administrative burden.
- IBM offers preconfigured hardware units to simplify some installation issues.
- Come to think of it, I don’t really know how hard it is to administer columnar systems (e.g., Sybase IQ).
Categories: Data warehouse appliances, Data warehousing, DATAllegro, Greenplum, IBM and DB2, Netezza, Oracle, SAP AG, Teradata
Comments
3 Responses to “Competitive issues in data warehouse ease of administration”
This seems to be Netezza’s latest competitive response to DATAllegro – that we’re not a ‘real’ appliance. Here’s how we respond:
– Our appliance can be used in much the same way as Netezza – i.e. no indices and just date range partitioning within the nodes. Netezza doesn’t have true range partitioning, but zone maps perform the same function as single-level range partitions for data that is in date order. DBA effort will be very similar, but performance at a given price point or data capacity will generally be better on our appliance.
– If a customer wants even better performance, they can easily add multi-level partitioning with a day or two of physical design work. Some queries will run 10-50 times faster.
– VERY occasionally, it’s useful to add one or two indices for queries that just look up one or two rows. This is very easy for any experienced DBA. This is just not an option in Netezza.
– Our caching does not add ANY administration effort.
Which product is more of an appliance is pretty irrelevant IMHO. Data warehouse platforms will never be as simple as household appliances such as microwave ovens. The question is whether they are good enough to do the job. A microwave without a timer would still work, but it wouldn’t be very convenient. We feel that Netezza’s architecture is just too simplistic for most real-world projects.
Stuart
DATAllegro
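For readers unfamiliar with the zone map idea mentioned in the comment above, here is a minimal sketch of why min/max metadata behaves like single-level range partitioning when data arrives in date order. It is illustrative only: the names `Extent`, `build_extents`, and `scan` are hypothetical, the extent size is arbitrary, and nothing here reflects how Netezza or DATAllegro actually implement storage.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Tuple

Row = Tuple[date, float]  # (sale_date, amount); a made-up schema


@dataclass
class Extent:
    """A fixed-size run of rows plus the min/max of its date column."""
    rows: List[Row]
    min_date: date
    max_date: date


def build_extents(rows: List[Row], extent_size: int) -> List[Extent]:
    """Cut rows into extents in arrival order and record each extent's date range."""
    extents = []
    for start in range(0, len(rows), extent_size):
        chunk = rows[start:start + extent_size]
        dates = [d for d, _ in chunk]
        extents.append(Extent(chunk, min(dates), max(dates)))
    return extents


def scan(extents: List[Extent], lo: date, hi: date):
    """Return (matching rows, number of extents actually read)."""
    matched = []
    extents_read = 0
    for ext in extents:
        # Skip any extent whose recorded date range cannot overlap [lo, hi].
        if ext.max_date < lo or ext.min_date > hi:
            continue
        extents_read += 1
        matched.extend(r for r in ext.rows if lo <= r[0] <= hi)
    return matched, extents_read


if __name__ == "__main__":
    in_order = [(date(2006, 1, d), float(d)) for d in range(1, 31)]
    # Same rows, but interleaved so every extent spans almost the whole month.
    interleaved = in_order[::3] + in_order[1::3] + in_order[2::3]
    lo, hi = date(2006, 1, 12), date(2006, 1, 15)
    for label, rows in (("in date order", in_order), ("interleaved", interleaved)):
        matched, read = scan(build_extents(rows, 10), lo, hi)
        print(label, "rows matched:", len(matched), "extents read:", read)
```

With the 30 sample rows and an extent size of 10, the date-ordered load reads one of three extents for the four-day query, while the interleaved load has to read all three. That is the sense in which the pruning depends on the data being in date order.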
WX2 from Kognitio takes ease of use even further.
Like Netezza and DATAllegro, WX2 does not use any indices, but it also does not require any pre-partitioning of the data. Data is randomly distributed across all nodes in the system. At data load, this process is completely automatic; no input is required from the DBA.
When a query is executed that requires data to co-exist on a node other than the one it was loaded on, WX2 will automatically re-distribute only the parts of the data required to satisfy the query, into memory, in such a way that the data is now correctly co-located.
Because the re-distribution is to memory, not disk, this process is very fast and all subsequent database operations are performed “in-memory”.
WX2 uses very sophisticated data streaming techniques to ensure that queries do not run out of memory no matter how much data is involved.
Roger Gaskell
Kognitio
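The load-randomly, redistribute-at-query-time approach described in the comment above can be sketched roughly as follows. This is a toy model under stated assumptions: hashing on the join key is a common way to co-locate rows and is assumed here, not confirmed as what WX2 does, and the function names (`load_randomly`, `redistribute`, `local_join`) and the sample tables are hypothetical.

```python
import random
from collections import defaultdict


def load_randomly(rows, num_nodes, seed=0):
    """Place each row on a randomly chosen node; no DBA input needed."""
    rng = random.Random(seed)
    nodes = [[] for _ in range(num_nodes)]
    for row in rows:
        nodes[rng.randrange(num_nodes)].append(row)
    return nodes


def redistribute(nodes, key_index, keep):
    """Build in-memory partitions keyed on the join column.

    Only the join key and the columns listed in `keep` are moved, mirroring
    the idea of shipping just the data the query needs. Hash placement is
    an assumption made for this sketch.
    """
    num_nodes = len(nodes)
    in_memory = [[] for _ in range(num_nodes)]
    for node_rows in nodes:
        for row in node_rows:
            target = hash(row[key_index]) % num_nodes
            slim = (row[key_index],) + tuple(row[c] for c in keep)
            in_memory[target].append(slim)
    return in_memory


def local_join(left_parts, right_parts):
    """Join node by node; matching keys are co-located after redistribution."""
    out = []
    for lpart, rpart in zip(left_parts, right_parts):
        by_key = defaultdict(list)
        for r in rpart:
            by_key[r[0]].append(r)
        for l in lpart:
            for r in by_key.get(l[0], []):
                out.append(l + r[1:])
    return out


if __name__ == "__main__":
    orders = [(i, i % 5, 10.0 * i) for i in range(20)]      # (order_id, cust_id, amount)
    customers = [(c, "cust%d" % c, "EU") for c in range(5)]  # (cust_id, name, region)
    order_nodes = load_randomly(orders, 4, seed=1)
    customer_nodes = load_randomly(customers, 4, seed=2)
    joined = local_join(
        redistribute(order_nodes, key_index=1, keep=[2]),
        redistribute(customer_nodes, key_index=0, keep=[1]),
    )
    print(len(joined), "joined rows of (cust_id, amount, name)")
```

The on-disk placement never changes in this sketch; only slimmed-down copies are moved into per-node memory, after which each node can join its own partition without talking to the others.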
[…] folks can probably manage departmental analytic RDBMS if they need to (that was one of Netezza’s early value propositions), but a Hadoop cluster stretches them. So easy deployment and administration stories — e.g. […]