September 20, 2006
Netezza vs. conventional data warehousing RDBMS
For various reasons, I’m not going to try to give a comprehensive overview of the Netezza story. But I’d like to highlight four points that illustrate a lot of the difference between Netezza’s architecture and that of more conventional data warehousing DBMS.
- It’s all about sequential access. Netezza data is stored in “extents” 3 megabytes in size. DATAllegro does something similar.
- There is very little indexing in Netezza systems. Indeed, they say 98-99% of processing is via hash joins. Much the same is true of DATAllegro.
- Netezza’s idea of “materialized views” is much more limited than the state of the art. Netezza has something it calls a “materialized view,” but that’s only a restriction/projection of a single table. No pre-joins, no aggregates. They’re confident they can outperform conventional systems without those aids, and they want to keep their database structures SIMPLE.
- Netezza’s substitute for range partitioning is very simple. Netezza features “zone maps,” which note the minimum and maximum values of each column (if such concepts are meaningful) in each extent. This can amount to effective range partitioning over dates; if data is added over time, there’s a good chance that the data in any particular date range is clustered, and a zone map lets you pick out which extents can contain data in the desired date range, so the rest are never read. But that seems to be the primary scenario in which zone maps confer a large benefit.
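To make the zone map idea concrete, here is a minimal Python sketch. It is an illustration of the general technique, not Netezza’s implementation: the names (`Extent`, `build_extents`, `scan_date_range`), the row layout, and the tiny four-row extents are all assumptions made for the example, whereas real extents are sized in bytes (3 megabytes).

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical extent size in rows, chosen to keep the example small.
EXTENT_ROWS = 4


@dataclass
class Extent:
    rows: list       # (sale_date, amount) tuples, in load order
    min_date: date   # zone map entry: smallest date in this extent
    max_date: date   # zone map entry: largest date in this extent


def build_extents(rows, extent_rows=EXTENT_ROWS):
    """Split rows into fixed-size extents and record min/max date for each."""
    extents = []
    for start in range(0, len(rows), extent_rows):
        chunk = rows[start:start + extent_rows]
        dates = [r[0] for r in chunk]
        extents.append(Extent(chunk, min(dates), max(dates)))
    return extents


def scan_date_range(extents, lo, hi):
    """Return matching rows plus the number of extents actually read."""
    matches = []
    scanned = 0
    for ext in extents:
        # Zone map check: skip extents whose date range cannot overlap [lo, hi].
        if ext.max_date < lo or ext.min_date > hi:
            continue
        scanned += 1
        matches.extend(r for r in ext.rows if lo <= r[0] <= hi)
    return matches, scanned


# Rows arrive in date order, so each extent covers a narrow date range.
rows = [(date(2006, 1 + i // 4, 1 + i % 4), 10 * i) for i in range(12)]
extents = build_extents(rows)
matches, scanned = scan_date_range(extents, date(2006, 2, 1), date(2006, 2, 28))
```

With rows loaded in date order, the February query reads only the one extent whose zone map overlaps February. If the same rows were loaded in random order, most extents would span most of the date range and few could be skipped, which is why the benefit depends on the data being clustered.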
Categories: Data warehouse appliances, Data warehousing, DATAllegro, Netezza
Comments
6 Responses to “Netezza vs. conventional data warehousing RDBMS”
[…] In addition to its software story, Netezza of course has a rather unique chip story. Where other vendors might have standard disk controllers and high-powered microprocessors, Netezza respectively has a FPGA (Field-Programmable Gate Array) and lesser microprocessor (PowerPC). Netezza claims that the advantages of these choices are: […]
[…] Netezza has no indices at all. And no caches. And the hardware is preconfigured. This all makes administration pretty simple. […]
[…] information is stored in data pack nodes,* one per data pack. If you’re familiar with Netezza zone maps, data pack nodes sound like zone maps on steroids. They store maximum values, minimum values, and […]
[…] back in 2006, I wrote about a cool Netezza feature called the zone map, which in essence allows you to do partition elimination even in the absence of strict range […]