Application areas
Posts focusing on the use of database and analytic technologies in specific application domains. Related subjects include:
- Any subcategory
- (in Text Technologies) Specific application areas for text analytics
Vertica finally spells out its compression claims
Omer Trajman of Vertica put up a must-read blog post spelling out detailed compression numbers, based on actual field experience (which I’d guess is from a combination of production systems and POCs):
- CDR – 8:1 (87%)
- Consumer Data – 30:1 (96%)
- Marketing Analytics – 20:1 (95%)
- Network logging – 60:1 (98%)
- Switch Level SNMP – 20:1 (95%)
- Trade and Quote Exchange – 5:1 (80%)
- Trade Execution Auditing Trails – 10:1 (90%)
- Weblog and Click-stream – 10:1 (90%)
It’s clear what Omer means by most of those categories from reading the post, but I’m a little fuzzy on what “Consumer Data” or “Marketing Analytics” comprise in his taxonomy. Anyhow, Omer’s post is a huge improvement over my recent one — based on a conversation with Omer 🙂 — which featured some far less accurate or complete compression numbers.
Omer goes on to claim that trickle-feed data is harder for rival systems to compress than it is for Vertica, and generally to claim that Vertica’s compression is typically severalfold better than that of competitive row-based systems.
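As a quick sanity check on the list above, the percentages appear to follow from the ratios as 1 − 1/ratio. The sketch below is only an illustration of that arithmetic; the function name is made up here and nothing in it comes from Vertica. It suggests the posted percentages were rounded down (8:1 works out to 87.5%, 30:1 to about 96.7%).

```python
def space_savings_percent(ratio: float) -> float:
    """Percent of space saved for a compression ratio of `ratio`:1."""
    if ratio <= 0:
        raise ValueError("compression ratio must be positive")
    return (1.0 - 1.0 / ratio) * 100.0


# Ratios as quoted in the post; the category names are Omer's.
quoted_ratios = {
    "CDR": 8,
    "Consumer Data": 30,
    "Marketing Analytics": 20,
    "Network logging": 60,
    "Switch Level SNMP": 20,
    "Trade and Quote Exchange": 5,
    "Trade Execution Auditing Trails": 10,
    "Weblog and Click-stream": 10,
}

for name, ratio in quoted_ratios.items():
    print(f"{name}: {ratio}:1 -> {space_savings_percent(ratio):.1f}% saved")
```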
Categories: Database compression, Vertica Systems, Web analytics | 5 Comments |
Oracle is integrating clickstream and network analytics too
Oracle announced today the not-so-concisely-named Oracle Real User Experience Insight, which actually seems to be an official nickname for what is more properly called “Oracle Enterprise Manager Real User Experience Insight.” Try saying that 10 times straight at network speeds … but I digress.
If I’m reading things correctly, add Oracle to the already long list of vendors who see clickstream and network event analytics as being two sides of the same coin.
Categories: Analytic technologies, Oracle, Web analytics | 2 Comments |
Peter Batty on Netezza Spatial
As previously noted, I’m not up to speed on Netezza Spatial. Phil Francisco of Netezza has promised we’ll fix that ASAP. In the meantime, I found a blog by a guy named Peter Batty, who evidently:
- Knows a lot about geospatial data and its uses
- Is consulting to Netezza
- Is smart
Batty offers a lot of detail in two recent posts, intermixed with some gollygeewhiz about Netezza in general. If you’re interested in this stuff, Batty’s blog is well worth checking out. Read more
Categories: Analytic technologies, Data warehousing, GIS and geospatial, Netezza, Telecommunications | 2 Comments |
Database compression is heavily affected by the kind of data
I’ve written often of how different kinds or brands of data warehouse DBMS get very different compression figures. But I haven’t focused enough on how much compression figures can vary among different kinds of data. This was really brought home to me when Vertica told me that web analytics/clickstream data can often be compressed 60X in Vertica, while at the other extreme — some kind of floating point data, whose details I forget for now — they could only do 2.5X. Edit: Vertica has now posted much more accurate versions of those numbers. Infobright’s 30X compression reference at TradeDoubler seems to be for a clickstream-type app. Greenplum’s customer getting 7.5X — high for a row-based system — is managing clickstream data and related stuff. Bottom line:
When evaluating compression ratios — especially large ones — it is wise to inquire about the nature of the data.
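To make the point concrete, here is a rough sketch that compresses two synthetic datasets with Python's general-purpose zlib. It is not how Vertica, Infobright, or Greenplum compress data, and the log format and page names are invented for the example. It only shows that repetitive clickstream-like text tends to compress far better than effectively random floating point values.

```python
import random
import struct
import zlib

random.seed(0)  # fixed seed so repeated runs use the same synthetic data


def compression_ratio(raw: bytes) -> float:
    """Uncompressed size divided by zlib-compressed size."""
    return len(raw) / len(zlib.compress(raw, 9))


# Hypothetical clickstream-like log: few distinct values, lots of repetition.
pages = ["/home", "/search", "/cart", "/checkout"]
clickstream = "\n".join(
    f"2008-09-15 GET {random.choice(pages)} 200 Mozilla/5.0"
    for _ in range(10_000)
).encode("ascii")

# 10,000 random doubles: close to incompressible for a generic compressor.
floats = struct.pack("<10000d", *(random.random() for _ in range(10_000)))

clickstream_ratio = compression_ratio(clickstream)
float_ratio = compression_ratio(floats)

print(f"clickstream-like text: {clickstream_ratio:.1f}:1")
print(f"random doubles:        {float_ratio:.2f}:1")
```

Real floating point measurements are rarely this random, so treat the second number as a worst case rather than a prediction.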
Categories: Data warehousing, Database compression, Greenplum, Infobright, Vertica Systems, Web analytics | 4 Comments |
Web analytics — clickstream and network event data
It should surprise nobody that web analytics, and specifically clickstream data, is one of the biggest areas for high-end data warehousing. For example:
- I believe that both of the previously mentioned petabyte+ databases on Greenplum will feature clickstream data.
- Aster Data’s largest disclosed database, by almost two orders of magnitude, is at MySpace.
- Clickstream analytics is a big application area for Vertica Systems.
- Clickstream analytics is a big application area for Netezza.
- Infobright’s customer success stories appear to be concentrated in clickstream analytics.
- Coral8 tells me that CEP is also being used for clickstream data, although I suspect that a lot of Coral8’s evidence in that regard comes from a single flagship account. Edit: Actually, Coral8 has a bunch of clickstream customers.
Categories: Aleri and Coral8, Aster Data, Greenplum, Infobright, Netezza, Streaming and complex event processing (CEP), Vertica Systems, Web analytics | 2 Comments |
Netezza application areas
I’m at the Netezza “Enzee” user conference in Orlando. So one or more Netezza posts are in order.
One theme of the brief analyst meeting was Netezza’s increasing business focus on vertical markets. In particular, Netezza is hiring managers for a range of vertical markets. The commercial ones cited (at various levels of maturity) included: Read more
Categories: Application areas, Data warehouse appliances, Data warehousing, Market share and customer counts, Netezza, Telecommunications | Leave a Comment |
More mysteries regarding Oracle CDR load speed
Last spring, DATAllegro user John Devolites of TEOCO told me of troubles his firm had had loading CDRs (Call Detail Records) into Oracle, and how those had been instrumental in his eventual adoption of DATAllegro. That claim was contemptuously challenged in a couple of comment threads.
Well, tonight at the Netezza user conference, Netezza gave awards to its first customers. The very first to accept was Jim Hayden, who’d bought Netezza for a company called Vibrant Solutions, which coincidentally was later acquired by TEOCO itself. In front of hundreds of people, he talked about how, back in 2003, it had taken 23 hours to load 400 million CDRs into Oracle on Nextel’s behalf, but only 40 minutes on Netezza.
And I’ll erase the rest of what I’d drafted here, as it was dripping in sarcasm …
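For scale, the figures Jim Hayden cited can be turned into rough load rates. This is just arithmetic on the numbers as reported from the stage (400 million CDRs, 23 hours, 40 minutes); it says nothing about hardware, configuration, or how either load was run.

```python
cdrs_loaded = 400_000_000

oracle_seconds = 23 * 3600   # 23 hours, as reported
netezza_seconds = 40 * 60    # 40 minutes, as reported

oracle_rate = cdrs_loaded / oracle_seconds     # CDRs per second
netezza_rate = cdrs_loaded / netezza_seconds   # CDRs per second
speedup = oracle_seconds / netezza_seconds

print(f"Oracle:  {oracle_rate:,.0f} CDRs/sec")
print(f"Netezza: {netezza_rate:,.0f} CDRs/sec")
print(f"Ratio:   {speedup:.1f}x")
```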
Categories: Data warehousing, Netezza, Oracle, Telecommunications, TEOCO | 2 Comments |
Some Netezza customer metrics
From the conference call based on Netezza’s July, 2008 Q1, as of the end of Q1:
- There are now 191 Netezza customers.
- 18 of those were new.
- 78% of Netezza’s business was in North America and 22% was international.
- Netezza operates in 10 countries.
- “The top 4 vertical markets represented approximately 75% of our business, with those markets being telcos, retail, financial services, and the analytic service provider segment.”
- One analytic service provider was greater than 10% of revenue for the quarter, and is expected to keep buying a lot in subsequent quarters. Also, one analytic service provider standardized on Netezza. I’m guessing that’s the same customer.
- “We ended the quarter with 45 [quota] carrying teams made up of a sales rep and a systems engineer and our plan is to continue to hire direct sales teams at the pace of 3 to 5 per quarter every quarter. These direct reps accounted for 85% of the business while the indirect activity was 15% this quarter.”
Categories: Application areas, Data mart outsourcing, Data warehouse appliances, Data warehousing, Market share and customer counts, Netezza, Telecommunications | 1 Comment |
Teradata’s major vertical markets in 2007
From a May, 2008 earnings conference call transcript:
- telecommunication, media and entertainment industry is 28%;
- financial services is 24%;
- retail is 19% of our revenues last year;
- manufacturing 9%;
- government 7%;
- travel and transportation 6%;
- and healthcare 5%.
Categories: Application areas, Data warehouse appliances, Data warehousing, Telecommunications, Teradata | Leave a Comment |
A NoteWorthy win for Intersystems Cache’
A small Microsoft SQL Server-based medical application vendor called NoteWorthy Medical Systems bought a small Intersystems Cache’-based medical application vendor called Mars Medical Systems. NoteWorthy then decided to rebuild its product line on Intersystems Cache’. A press release ensued.*
*In general, my criticisms of Intersystems’ stealth marketing are beginning to be relaxed. On the other hand, if you want to be technical, I still haven’t actually talked with the company for years …
I spoke briefly with Mark Conner, founder of Mars Medical and now EVP of NoteWorthy, about why he so loves Cache’. (I asked what he disliked about the product; his response was an emphatic “Nothing”.) It basically boils down to two reasons:
- Mark thinks hierarchical data models are a great fit for medical applications. For example, the application’s UI (and local schema) look quite different depending on which particular complaints or diagnoses apply to particular patient visits.
- Cache’ just runs and runs w/o DBA intervention. Mark cited a figure of two support engineers for Mars Medical, supporting over 1,000 medical (largely group) practices, almost none of which have DBAs.
The latter feature is crucial to small ISVs selling application software to even smaller users, and is a big part of why Progress and Intersystems have large share in that market. More broadly, it’s the most important and common technical advantage that mid-range database management systems enjoy versus the market leaders. (The other big advantage, of course, is pricing.)