Teradata
Analysis of data warehousing giant Teradata.
Confusion about Teradata’s big customers
Evidently further attempts to get information on this subject would be fruitless, but anyhow:
- Teradata emailed me a couple of months ago saying, in effect, that at that point they could count 16 petabyte-level customers. In response to my repeated requests for clarification, Teradata has explicitly refused to identify the metric used in reaching that conclusion.
- At some point Teradata did something to convince Neil Raden, as per a tweet of his, that they have 20 petabyte-class users.
- That tweet was made around the time that Teradata apparently showed a slide naming big users at the Strata conference (last week).
- If Teradata is counting the way they did three years ago, that count of 16 or 20 or whatever is probably inflated compared to, say, Vertica’s figure of 7 a few months back.
- Even so, it’s obvious — and not just from the eBay example — that Teradata has one of the most scalable analytic DBMS offerings around.
Categories: Petabyte-scale data management, Teradata | 9 Comments |
Hybrid-columnar soundbites
Busy couple of days talking with reporters. A few notes on hybrid-columnar analytic DBMS, all backed up by yesterday’s post on Teradata columnar:
- Oracle does not actually offer columnar I/O; the other three systems do. But see the “I won’t be surprised” part in yesterday’s Teradata post.
- Aster does not offer columnar compression; the other three do.
- EMC Greenplum and Teradata offer different kinds of ways to mix column and row storage in the same table; each has its advantages.
- Teradata generally has a more mature and capable offering than EMC Greenplum, for most purposes, whichever way you choose to organize your tables.
Edit: The Wall Street Journal got this wrong, writing that Teradata was the first-ever hybrid columnar system. Specifically, they wrote:
“While columnar technology has been around for years, Teradata says its product is unique because it allows users to include both columns and rows in the same database.”
Googling on “Teradata To Unveil New Analytics Product To Speed Business Adoption” might get you around the paywall to see the offending piece.
Categories: Aster Data, Columnar database management, Data warehousing, Database compression, Greenplum, Teradata | 2 Comments |
Aster Database Release 5 and Teradata Aster appliance
It was obviously just a matter of time before there would be an Aster appliance from Teradata and some tuned bidirectional Teradata-Aster connectivity. These have now been announced. I didn’t notice anything particularly surprising in the details of either. About the biggest excitement is that Aster is traditionally a Red Hat shop, but for the purposes of appliance delivery has now embraced SUSE Linux.
Along with the announcements comes updated positioning such as:
- Better SQL than the MapReduce alternatives have.
- Better MapReduce than the SQL alternatives have.
- Easy(ier) way to do complex analytics on multi-structured data. (Aster has embraced that term.)
and of course
- Now also with Teradata’s beautifully engineered hardware and system management software!
Categories: Aster Data, Data warehouse appliances, Data warehousing, Predictive modeling and advanced analytics, Teradata, Workload management | Leave a Comment |
Teradata Columnar and Teradata 14 compression
Teradata is pre-announcing Teradata 14, for delivery by the end of this year, where by “Teradata 14” I mean the latest version of the DBMS that drives the classic Teradata product line. Teradata 14’s flagship feature is Teradata Columnar, a hybrid-columnar offering that follows in the footsteps of Greenplum (now part of EMC) and Aster Data (now part of Teradata).
The basic idea of Teradata Columnar is:
- Each table can be stored in Teradata in row format, column format, or a mix.
- You can do almost anything with a Teradata columnar table that you can do with a row-based one.
- If you choose column storage, you also get some new compression choices.
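To make the row/column/mix distinction concrete, here is a minimal Python sketch. It is an illustration only, not Teradata's actual storage engine or syntax: the sample table, the column names, and the helper functions are all hypothetical, and run-length encoding stands in for whatever compression choices Teradata 14 really offers.

```python
from itertools import groupby

# Hypothetical sample table; names and values are made up for illustration.
column_names = ("sale_date", "state", "amount")
rows = [
    ("2011-09-01", "CA", 120),
    ("2011-09-01", "CA", 75),
    ("2011-09-01", "NY", 75),
    ("2011-09-02", "NY", 300),
]


def to_columns(rows, names):
    """Pivot row-format tuples into one list per column (column format)."""
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}


def to_hybrid(rows, names, row_group):
    """Keep the columns in row_group together as mini-rows; store the rest column by column."""
    positions = [names.index(name) for name in row_group]
    layout = {tuple(row_group): [tuple(row[i] for i in positions) for row in rows]}
    for i, name in enumerate(names):
        if name not in row_group:
            layout[name] = [row[i] for row in rows]
    return layout


def run_length_encode(values):
    """Collapse consecutive repeats into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


columns = to_columns(rows, column_names)
hybrid = to_hybrid(rows, column_names, ["state", "amount"])
encoded_dates = run_length_encode(columns["sale_date"])
```

The point of the sketch is that once a column's values sit next to each other, repeated values become adjacent and simple schemes such as run-length encoding start to pay off, which is one reason column storage tends to come with extra compression options.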
Categories: Archiving and information preservation, Columnar database management, Data warehousing, Database compression, Oracle, Rainstor, Teradata | 7 Comments |
Aster Data business trends
Last month, I reviewed with the Aster Data folks which markets they were targeting and selling into, subsequent to acquisition by their new orange overlords. The answers aren’t what they used to be. Aster no longer focuses much on what it used to call frontline (i.e., low-latency, operational) applications; those are of course a key strength for Teradata. Rather, Aster focuses on investigative analytics — they’ve long endorsed my use of the term — and on the batch run/scoring kinds of applications that inform operational systems.
Categories: Analytic technologies, Application areas, Aster Data, Data warehousing, DataStax, RDF and graphs, Surveillance and privacy, Teradata, Web analytics | 1 Comment |
Eight kinds of analytic database (Part 1)
Analytic data management technology has blossomed, leading to many questions along the lines of “So which products should I use for which category of problem?” The old EDW/data mart dichotomy is hopelessly outdated for that purpose, and adding a third category for “big data” is little help.
Let’s try eight categories instead. While no categorization is ever perfect, these each have at least some degree of technical homogeneity. Figuring out which types of analytic database you have or need (and in most cases you’ll need several) is a great early step in your analytic technology planning.
What colleges should teach in analytics
Based on a Teradata press release calling attention to the small amount of explicit university instruction in business intelligence, I was asked:
Does BI really need a dedicated undergrad track? What sort of BI and analytics-related skills should students look to obtain now in order to be viable in the job marketplace five years out?
My answers were (slightly edited):
- Most important is a basic, intuitive understanding of statistical significance. If you’re looking at an apparent trend, is it real or just random variation?
- Also crucial are general analytic and quantitative problem-solving skills.
- One also should have a comfort level learning how to use new software tools.
- Everybody in business should have those skillsets. So should people in science, medicine, teaching, journalism, government, and most other vocations.
- The more analytically oriented should add basic programming skills, and basic knowledge of SQL. While SQL’s utter dominance is ebbing a bit, it still will be with us for a very long time.
Of course, there are more specialized skills also worth teaching, in a number of areas, starting with statistics and other predictive modeling technologies. But it’s OK to go through life not knowing those.
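As one way to build intuition for the "real or just random variation?" question, here is a small permutation test in Python. It is a sketch under stated assumptions rather than a recommended curriculum item: the function name, the parameters, and the weekly sales figures are all hypothetical, and the permutation test is simply one convenient technique for the job.

```python
import random


def permutation_p_value(before, after, trials=10_000, seed=0):
    """Estimate how often random relabeling produces a difference in means
    at least as large as the one actually observed (two-sided)."""
    rng = random.Random(seed)
    observed = sum(after) / len(after) - sum(before) / len(before)
    pooled = list(before) + list(after)
    n_before = len(before)
    n_after = len(after)
    hits = 0
    for _ in range(trials):
        rng.shuffle(pooled)
        shuffled_before = pooled[:n_before]
        shuffled_after = pooled[n_before:]
        diff = sum(shuffled_after) / n_after - sum(shuffled_before) / n_before
        if abs(diff) >= abs(observed):
            hits += 1
    return hits / trials


# Hypothetical weekly sales before and after a pricing change.
before = [102, 98, 110, 95, 101, 99, 104, 97]
after = [108, 111, 103, 115, 109, 106, 112, 107]
p_value = permutation_p_value(before, after)
```

A small p_value says that shuffling the labels rarely reproduces a gap that large, so the apparent trend is unlikely to be pure noise; a large one says random variation explains it comfortably.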
Categories: Analytic technologies, Business intelligence, Data warehousing, NoSQL, Predictive modeling and advanced analytics, Teradata | 1 Comment |
Notes and links, June 15, 2011
Five things:
In-memory, parallel, not-in-database SAS HPA does make sense after all
I talked with SAS about its new approach to parallel modeling. The two key points are:
- SAS no longer plans to go as far with in-database modeling as it previously intended.
- Rather, SAS plans to run in RAM on MPP DBMS appliances, exploiting MPI (Message Passing Interface).
The whole thing is called SAS HPA (High-Performance Analytics), in an obvious reference to HPC (High-Performance Computing). It will run initially on RAM-heavy appliances from Teradata and EMC Greenplum.
A lot of what’s going on here is that SAS found it annoyingly difficult to parallelize modeling within the framework of a massively parallel DBMS such as Teradata. Notes on that aspect include:
- SAS wasn’t exploiting the capabilities of individual DBMS to their fullest; rather, it was looking for an approach that would work across multiple brands of DBMS. Thus, for example, the fact that Aster’s analytic platform architecture is more flexible or powerful than Teradata’s didn’t help much with making SAS run within the Aster nCluster database.
- Notwithstanding everything else, SAS did make a certain set of modeling procedures run in-database.
- SAS’ previous plans to run in-database modeling in Aster and/or Netezza DBMS may never come to fruition.
Use cases for low-latency analytics
At various times I’ve noted the varying latency requirements of different analytic use cases, which can be as different as the speed of a turtle is from the speed of light. In particular, back when I wrote more about CEP (Complex Event Processing), I listed some applications for super-low-latency and not-so-low-latency CEP alike. Even better were some longish lists of “active data warehousing” use cases I got from Teradata in August, 2009, generally focused on interactive customer response (e.g. personalization, churn prevention, upsell, antifraud) or in some cases logistics.
In the slide deck for the Teradata 6680/solid-state drive announcement, however, Teradata went in a slightly different direction. In its list of “hot data use case examples”, Teradata suggested:
Categories: Data warehousing, Teradata | 2 Comments |