Netezza
Analysis of Netezza and its data warehouse appliances.
The layered messaging marketing model as applied to Netezza
I just put up a post claiming that enterprise IT marketing arguments commonly boil down to one of two layered messaging templates. Let’s test how that claim applies to one of the most innovative technology companies of this decade: Netezza.
Categories: Netezza | 2 Comments |
More data on data warehouse sizes and issues
I spoke today with Paul Barth and Randy Bean of consultancy NewVantage Partners. The core of NewVantage’s business seems to be helping large enterprises (especially financial services) with their data warehouse strategies. Takeaways — none of which should shock regular readers of DBMS2 — included:
- Administrative cost and difficulty are often the single biggest issue in selecting analytic DBMS products.
- Oracle hits a wall around 10 terabytes of user data. The one customer NewVantage can think of with an Oracle data warehouse over 10 terabytes is fleeing Oracle for Netezza.
- NewVantage says that very specialized data warehouses on Oracle could conceivably be larger than that.
- NewVantage does have a customer on DB2/UDB in the 30-40 terabyte range. That customer does a lot of careful tuning to make it work.
- About 15% of NewVantage’s customers use Netezza. Few if any use newer analytic DBMS (but I got the sense more will soon). The rest rely on “traditional” DBMS, a group that includes Teradata.
Categories: Data warehousing, IBM and DB2, Netezza, Oracle | 1 Comment |
Enterprises are buying multiple brands of analytic DBMS each
Over the past few weeks I’ve had a lot of NDA discussions about analytic DBMS vendors’ specific customers, and so I’ve been acutely aware of something I already sort of knew: just as in prior generations of database management technology, there’s huge overlap among analytic DBMS vendors’ customer bases. As they always have, enterprises are investing in multiple brands of DBMS, even in cases where those DBMS can do pretty much the same things.
For example:
- Many Teradata users are buying newer technology too. But they aren’t actually throwing out Teradata.
- The same sometimes applies to Netezza already. At least two Netezza references are also references for a rival vendor.
- One outfit is among the biggest customers for two different analytic DBMS vendors, neither of which is Teradata or Netezza.
- One corporation is using or deploying four different brands of analytic DBMS.
- TEOCO is a big user of both DATAllegro and Netezza.
My current customer list among the data warehouse specialists
One of my favorite pages on the Monash Research website is the list of many current and a few notable past customers. (Another favorite page is the one for testimonials.) For a variety of reasons, I won’t undertake to be more precise about my current customer list than that. But I don’t think it would hurt anything to list the data warehouse DBMS/appliance specialists in the group. They are:
- Aster Data
- Calpont
- DATAllegro
- Greenplum
- Infobright
- Netezza
- ParAccel
- Teradata
- Vertica
All of those are Monash Advantage members.
If you care about all this, you may also be interested in the rest of my standards and disclosures.
Categories: About this blog, Aster Data, Calpont, Data warehousing, DATAllegro, Greenplum, Infobright, Netezza, ParAccel, Teradata, Vertica Systems | 3 Comments |
Netezza update
In my usual dual role, I called Phil Francisco of Netezza late on a Friday night to lay some post-Microsoft/DATAllegro consulting on him, and then took the opportunity of being on the phone with him to get a general Netezza update. Netezza’s July quarter just ended and they’re still in their quiet period, so I didn’t press him for a lot of numerical detail. More generally, I didn’t find out a lot that wasn’t already covered in my May Netezza update. But notwithstanding all those disclaimers, it was still a pretty interesting chat.
Categories: Data warehouse appliances, Data warehousing, Greenplum, Netezza, Sybase | 3 Comments |
Data warehouse appliance power user TEOCO
If you had to name super-high-end users of data warehouse technology, your list might start with a few retailers, credit data processors, and telcos, plus the US intelligence establishment. Well, it turns out that TEOCO runs outsourced data warehouses for several of the top US telcos, making it one of the top data warehouse technology users around.
A few weeks ago, I had a fascinating chat with John Devolites of TEOCO. Highlights included:
- TEOCO runs a >200 TB DATAllegro warehouse for a major US telco. (When we hear about a big DATAllegro telco site that’s been in production for a while, that’s surely the one they’re talking about.)
- TEOCO runs around 450 TB total of DATAllegro databases across its various customers. (When Stuart Frost blogs of >400 TB “systems,” that may be what he’s talking about.)
- TEOCO likes DATAllegro better than Netezza, although the margin is now small. This is mainly for financial reasons, specifically price-per-terabyte. When TEOCO spends its own money without customer direction as to appliance brand, it buys DATAllegro.
- TEOCO runs at least one 50 TB Netezza system — originally due to an acquisition of a Netezza user — with more coming. There also is more DATAllegro coming.
- TEOCO feels 15-30 concurrent users is the current practical limit for both DATAllegro and Netezza. That’s greater than it used to be.
- Netezza is a little faster than DATAllegro on a few esoteric queries, but the difference is not important to TEOCO’s business.
- Official price lists notwithstanding, TEOCO sees prices as being in the $10K/TB range. DATAllegro’s price advantage has shrunk greatly, as others have come down to more or less match. However, since John stated his price preference for DATAllegro as being in the present tense, I presume the price match isn’t perfect.
- Teradata was never a serious consideration, for price reasons.
- In the original POC a few years ago, the incumbent Oracle — even after extensive engineering — couldn’t get an important query down under 8 hours of running time. DATAllegro and Netezza both handled it in 2-3 minutes. Similarly, Oracle couldn’t get the load time for 100 million call detail records (CDRs) below 24 hours.
- Applications sound pretty standard for telecom: Lots of CDR processing — 550 million/day on the big DATAllegro system cited above. Pricing and fraud checking. Some data staging for legal reasons (giving the NSA what it subpoenas and no more).
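To put the throughput figures above in per-second terms, here is a small back-of-the-envelope sketch. It uses only the numbers quoted in the bullets; the variable names are mine, and it assumes (perhaps too simply) that load is spread evenly over a 24-hour day and that the POC figures and the current CDR volume are comparable workloads.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Figures quoted in the bullets above.
cdrs_per_day = 550_000_000        # CDRs/day on the big DATAllegro system
oracle_load_rows = 100_000_000    # CDRs Oracle could not load in under 24 hours
oracle_query_minutes = 8 * 60     # Oracle could not get under 8 hours
appliance_query_minutes = (2, 3)  # DATAllegro and Netezza: 2-3 minutes

# Sustained rate needed just to keep up with the daily CDR volume,
# assuming an even spread over the day (an assumption, not a TEOCO figure).
required_rate = cdrs_per_day / SECONDS_PER_DAY

# Ceiling on Oracle's load rate implied by "not below 24 hours".
oracle_rate_ceiling = oracle_load_rows / SECONDS_PER_DAY

# Lower bounds on the query speedup, since 8 hours was Oracle's floor.
speedup_low = oracle_query_minutes / max(appliance_query_minutes)
speedup_high = oracle_query_minutes / min(appliance_query_minutes)

print(f"Required load rate:  {required_rate:,.0f} CDRs/sec")
print(f"Oracle load ceiling: {oracle_rate_ceiling:,.0f} CDRs/sec")
print(f"Query speedup:       at least {speedup_low:.0f}x to {speedup_high:.0f}x")
```

On those assumptions, the big system needs to absorb several thousand CDRs per second around the clock, which is more than five times the best load rate Oracle showed in the POC.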
Categories: Analytic technologies, Data mart outsourcing, Data warehouse appliances, Data warehousing, DATAllegro, Netezza, Pricing, Specific users, Telecommunications, TEOCO | 7 Comments |
Netezza on compression
Phil Francisco put up a nice post on Netezza’s company blog about a month ago, explaining the Netezza compression story. Highlights include:
- Like other row-based vendors, Netezza compresses data on a column-by-column basis, then stores the results in rows. This is obviously something of a limitation — no run-length encoding for them — but can surely accommodate several major compression techniques.
- The Netezza “Compress Engine” compresses data on a block-by-block basis. This is a disadvantage for row-based systems vs. columnar ones in the area of compression, because columnar systems have more values per block to play with, and that yields higher degrees of compression. And among row-based systems, typical block size is an indicator of compression success. Thus, DATAllegro probably does a little better at compression than Netezza, and Netezza does a lot better at compression than Teradata.
- Netezza calls its compression “compilation.” The blog post doesn’t make the reason clear. And the one reason I can recall confuses me. Netezza once said the compression extends at least somewhat to columns with calculated values. But that seems odd, as Netezza only has a very limited capability for materialized views.
- Netezza pays the processing cost of compression in the FPGA, not the microprocessor. And so Netezza spins the overhead of the Compress Engine as being zero or free. That’s actually not ridiculous, since Netezza seems to have still-unused real estate on the FPGA for new features like compression.
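The point about block size can be illustrated with a rough sketch. This is not Netezza’s Compress Engine or any vendor’s actual algorithm; it uses zlib from the Python standard library as a stand-in compressor, and the column contents, row count, and block sizes are all made up for illustration. It simply compresses the same column values in independent blocks of two different sizes and compares the totals.

```python
import random
import zlib


def compressed_size(column_bytes: bytes, block_size: int) -> int:
    """Total bytes after compressing the data in independent fixed-size blocks."""
    total = 0
    for start in range(0, len(column_bytes), block_size):
        block = column_bytes[start:start + block_size]
        total += len(zlib.compress(block))
    return total


# Hypothetical low-cardinality column: one state code per row.
rng = random.Random(42)  # fixed seed so the run is repeatable
states = ["CA", "NY", "TX", "MA", "IL"]
column = "".join(rng.choice(states) + "," for _ in range(20_000)).encode("ascii")

small_blocks = compressed_size(column, 256)    # few values per block
large_blocks = compressed_size(column, 8192)   # many values per block

print(f"raw:               {len(column):>6} bytes")
print(f"256-byte blocks:   {small_blocks:>6} bytes")
print(f"8192-byte blocks:  {large_blocks:>6} bytes")
```

Each block is compressed on its own, so a small block pays the fixed per-block overhead more often and gives the compressor fewer repeated values to exploit. That is the same effect that favors columnar systems, and row-based systems with larger blocks, in the comparison above.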
Categories: Analytic technologies, Columnar database management, Data warehouse appliances, Data warehousing, Database compression, Netezza, Theory and architecture | 2 Comments |
Netezza has an EMC deal too
Netezza has an EMC deal too. As befits a hardware vendor, Netezza has an actual OEM relationship with EMC, in which it is offering CLARiiONs built straight into NPS appliances. 5 TB of CLARiiON will be free in any Netezza system from 2 racks on upward. (A rack holds about 12.5 TB.) In addition, you’ll be able to buy 10 TB more of CLARiiON in every Netezza rack, if you want. The whole thing is supposed to ship before year-end.
Categories: Analytic technologies, Data warehouse appliances, Data warehousing, EMC, Netezza | 5 Comments |
Netezza, enterprise data warehouses, and the 100 terabyte mark
Phil Francisco of Netezza checked in tonight with some news that will be embargoed for a few hours. While I had him on the phone anyway, I asked him about large databases and/or enterprise data warehouses. Highlights included:
- Netezza has one customer with 200 TB of user data. The name is confidential (but he told me who it was).
- Netezza has sold 15 or so of its NPS 10-800s, which are rated at 100 TB capacity.
- The second-largest database in production on Netezza is probably 80 TB or so at Catalina Marketing, which has been a Netezza early adopter all along.
- Netezza’s biggest users typically have a handful (literally — off the top of his head, Phil said “4 to 6”) of applications, each with its own primary set of fact tables.
- Each application-specific set of fact tables in such big-honking-data-mart installations is usually either of cardinality one, or else a small set sharing a common hash key.
- Phil insists Netezza isn’t exaggerating when it claims to have true enterprise data warehouse installations. What he means by an EDW is something that is an enterprise’s primary data warehouse, is used by lots of departments, draws data from lots of sources, has loads going on at various points during the day, and has 100s if not 1000s of total users.
- Netezza’s biggest EDW has about 30 TB of user data. Phil wouldn’t tell me the name of that customer.
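Why would it matter that a set of fact tables shares a common hash key? In a hash-distributed MPP system, rows with the same key value land on the same node, so tables distributed on the same key can be joined node by node without moving data around. The sketch below illustrates only that general idea. The node count, table contents, and CRC32 hash are assumptions for illustration, not Netezza’s actual distribution scheme.

```python
import zlib
from collections import defaultdict

NUM_NODES = 4  # hypothetical node count, chosen for illustration


def node_for(key: str) -> int:
    """Map a distribution key to a node with a deterministic hash."""
    return zlib.crc32(key.encode("utf-8")) % NUM_NODES


def distribute(rows, key_column):
    """Place each row on the node chosen by hashing its key column."""
    placement = defaultdict(list)
    for row in rows:
        placement[node_for(row[key_column])].append(row)
    return placement


# Two hypothetical fact tables that share the hash key "account_id".
calls = [{"account_id": f"acct{i % 6}", "minutes": i} for i in range(12)]
charges = [{"account_id": f"acct{i % 6}", "amount": 10 * i} for i in range(6)]

calls_by_node = distribute(calls, "account_id")
charges_by_node = distribute(charges, "account_id")

# Join node by node: no row is ever compared with a row on another node.
local_join = [
    (call, charge)
    for node in range(NUM_NODES)
    for call in calls_by_node[node]
    for charge in charges_by_node[node]
    if call["account_id"] == charge["account_id"]
]

# Reference result: the same join computed over the undistributed tables.
global_join = [
    (call, charge)
    for call in calls
    for charge in charges
    if call["account_id"] == charge["account_id"]
]

print(len(local_join), len(global_join))
```

The per-node join finds every matching pair that the global join does, which is why one fact table, or a small set distributed on a common key, is the comfortable case for this kind of architecture.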
Outsourced data marts
Call me slow on the uptake if you like, but it’s finally dawned on me that outsourced data marts are a nontrivial segment of the analytics business. For example:
- I was just briefed by Vertica, and got the impression that data mart outsourcers may be Vertica’s #3 vertical market, after financial services and telecom. Certainly it seems like they are Vertica’s #3 market if you bundle together data mart outsourcers and more conventional OEMs.
- When Netezza started out, a bunch of its early customers were credit data-based analytics outsourcers like Acxiom.
- After nagging DATAllegro for a production reference, I finally got a good one: TEOCO. TEOCO specializes in figuring out whether inter-carrier telecom bills are correct. While there’s certainly a transactional invoice-processing aspect to this, the business seems to hinge mainly around doing calculations to figure out correct charges.
- I was talking with Pervasive about Pervasive DataRush, a beta product that lets you do super-fast analytics on data even if you never load it into a DBMS in the first place. I challenged them for use cases. One user turns out to be an insurance claims rule-checking outsourcer.
- One of Infobright’s references is a French CRM analytics outsourcer, 1024 Degres.
- 1010data has built up a client base of 50-60, including a number of financial and retail blue-chippers, with a soup-to-nuts BI/analysis/columnar database stack.
- I haven’t heard much about Verix in a while, but their niche was combining internal sales figures with external point-of-sale/prescription data to assess retail (especially pharma) microtrends.
To a first approximation, here’s what I think is going on. Read more