February 8, 2012
Comments on SAS
A reporter interviewed me via IM about how CIOs should view SAS Institute and its products. Naturally, I have edited my comments (lightly) into a blog post. They turned out to be clustered into three groups, as follows:
- SAS faces a number of challenges, not unlike those faced by other high-priced legacy technology vendors.
  - It is used by organizations that have large budgets to pay for the product and to pay people to be expert on the product’s intricacies.
  - SAS has not integrated with scale-out analytic DBMS technologies as well or quickly as had been hoped, or as earlier marketing suggested was likely.
  - SAS has not been strong in helping its users do agile predictive analytics.
- SAS’ strengths are concentrated in product breadth:
  - Lots of statistical algorithms.
  - Various vertical products that make the modeling techniques more accessible in specific application domains.
  - Various approaches to engineering for scalability; no one of those has been a table-thumping success to date, but SAS has the resources to keep trying.
  - Some level of integration with its own business intelligence and text analytics products.
- For any particular use case, the burden of proof is on SAS alternatives to show that they have enough pieces in the toolkit to meet the needs.
  - SPSS (now owned by IBM) also has legacy issues.
  - KXEN is focused on marketing use cases.
  - Mahout has been one of the less successful Hadoop-related open source projects.
  - R-based technology is still maturing.
  - The modeling capabilities (as opposed to just scoring) bundled into RDBMS and well-parallelized tend to be pretty limited. Apparent exceptions tend to just be R repackaged.
Categories: Analytic technologies, Data warehousing, Hadoop, IBM and DB2, KXEN, Predictive modeling and advanced analytics, SAS Institute
Comments
18 Responses to “Comments on SAS”
In terms of scale-out analytic DBMS integration, the SAS products available today (with more to come shortly) are SAS High Performance Analytics, SAS Scoring Accelerator and SAS Analytics Accelerator. All enable exploitation of massively parallel processing (MPP) databases. They scale to 100s of nodes, 1000s of cores. They meet the needs of any “big data” analyst.
Not sure what you mean by “Mahout has been one of the less successful Hadoop projects” or how that affects SAS.
Had things gone differently, Mahout — with its integration into an important data store/ETL engine — might be a major threat to SAS right now.
SAS currently has no reference customers for High Performance Analytics.
Market acceptance for Scoring Accelerator is also relatively low because it can only be used with SAS Enterprise Miner, while most analytic users continue to use SAS/Stat. Since SAS/Stat does not export PMML (or anything else), most firms opt to manually recode scoring jobs into something that will scale.
Several vendors compete with SAS for the use cases SAS projects for HPA. The difference is that while HPA works with structured data only, SAS’ competitors bridge traditional warehousing and Hadoop. Analytics that incorporate unstructured data outperform analytics that do not; this is settled science. SAS has not yet shipped an ACCESS engine for Hadoop, so it strikes me that they are a day late and a dollar short.
Couple of additional comments on SAS alternatives:
(1) SPSS has a “legacy” in the sense that it builds on existing technology and has a customer base. In the 1990s, SAS focused effort on client-server technology, while SPSS focused on the desktop; as a result, SAS developed a reputation for heavy number-crunching, while SPSS developed a reputation for usability. Given the direction of analytics, SAS’ legacy looks increasingly like a bug, while SPSS’ legacy is a feature.
(2) KXEN’s focus on Marketing reflects an understanding that nobody else is willing to drink their black-box “automated” Kool-Aid. There is nothing in the product that actually facilitates the kind of analysis Marketers do.
(3) Not sure it helps to dismiss R as “maturing”; show me a technology that isn’t “maturing” and I’ll show you a technology that is dead. R is rapidly penetrating commercial analytics, especially so in the health and life sciences vertical. While most organizations use R to supplement other tools, a major global life-sciences company expects to move 100% of its analytics to R by 2014.
(4) In-database analytic packages generally don’t have the breadth of algorithms featured in server-based packages, but the 80/20 rule applies to analytics: the vast majority of analytic use cases can be covered with exactly four methods. There is no question that SAS has more “stuff” than a typical in-database package; the real question is whether or not you need all that stuff and are willing to pay the price for it.
Also, in-database analytics developed separately from R and are not derived from it. This is true for all database platforms. Running R in-database supplements native capabilities.
Thomas,
What are the “four methods” that you feel cover the vast majority of use cases, and which do you feel KXEN lacks?
Curt,
That’s two separate questions:
(1) Any analytics tool needs to be able to classify, estimate, cluster and associate. Reasonable people can agree or disagree about which algorithm is best for each task, but if I had to limit the choice I would go with CART for classification and estimation, k-means for clustering and fpgrowth for association.
(2) The issue with KXEN isn’t the algorithms they have or don’t have, but the overall black-boxiness of the affair. Not that there’s anything wrong with black-box analysis per se, but when analytics vendors are unwilling to disclose their algorithms, it’s likely because they’re using the same stuff that everyone else uses.
That said, if KXEN is able to get better results than competing tools, nobody will care how they do it — they can put trained crickets inside the box. But tinkering around with analytic tooling rarely produces results that translate to significant business benefits, for reasons I’ll spell out in a future blog post.
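To make the clustering item in point (1) concrete, here is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. It is an illustration only, not any vendor's implementation: the function name `kmeans`, the toy 2-D points and the caller-supplied starting centroids are all assumptions chosen to keep the example small and deterministic.

```python
def kmeans(points, centroids, iterations=20):
    """Lloyd's algorithm on 2-D points; returns (centroids, labels).

    `centroids` holds the caller-supplied starting centroids (an assumption
    made here so the result is deterministic; real tools usually pick them).
    """
    centroids = list(centroids)
    labels = []
    for _ in range(iterations):
        # Assignment step: each point takes the index of its nearest centroid.
        labels = [
            min(range(len(centroids)),
                key=lambda c: (x - centroids[c][0]) ** 2 + (y - centroids[c][1]) ** 2)
            for x, y in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        updated = []
        for c, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                updated.append((sum(x for x, _ in members) / len(members),
                                sum(y for _, y in members) / len(members)))
            else:
                updated.append(old)  # keep an empty cluster's centroid in place
        if updated == centroids:  # no centroid moved, so we have converged
            break
        centroids = updated
    return centroids, labels


# Hypothetical toy data: two well-separated groups of three points each.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.1)]
centroids, labels = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
print(labels)  # [0, 0, 0, 1, 1, 1]
```

The whole method is two alternating steps, which is part of why it shows up in nearly every toolkit, in-database or otherwise.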
Additional thoughts on KXEN — no issue with KXEN’s product, which looks like a pretty good analytics platform if you can ignore the opaqueness.
The value proposition is puzzling, though. If you’re selling to the CMO, you need to do something the CMO cares about, like media mix optimization or audience optimization. The CMO does not care about analytics, but might be interested in a solution that happens to include analytics.
Unica figured this out fifteen years ago, when they morphed the Predictive Analytics Workbench into what is now the leading application for marketing automation.
KXEN rightly figures that they can’t sell into hard dollar analytics fields, such as risk, fraud, actuarial, health/life sciences or capital markets, where the quants don’t value “ease of use” and won’t tolerate black boxiness. That leaves Marketing, but it’s sort of a default positioning.
First, with respect to “black-boxiness”, we published white papers years ago outlining the fundamental approach we take (structured risk minimization). Furthermore, we have several blog posts explaining it for non-PhDs in mathematics (see http://www.kxen.com/blog/). The only people who seem to get nervous about the fine-level details of our implementation tend to be competitors, not our customers (which include plenty of data scientists as well as marketers).
Second, we have plenty of reference accounts outside of CRM (fraud, operations, risk, etc.), but we have focused on CRM use cases because of the required agility, productivity, and data volumes (not just rows but columns), which play to our core strengths. We do get significant performance gains over traditional approaches, as seen in the numerous customer testimonials on our web site (http://www.kxen.com/). And I would say that to state that marketers don’t care about analytics or optimizing their marketing operations is naive at best. I can count hundreds of our customers where it simply isn’t true. (Does your statement reflect Netezza positioning?)
More to come on our blog regarding the specific challenges in marketing that are in fact MORE demanding, not less, than some of the other domains.
John
Couple of points for the record:
(1) it’s structurAL risk minimization
(2) my company does not compete with KXEN
Since the great majority of citations for SRM are linked to KXEN, it doesn’t really help to give it a name; it’s still a black box.
Marketing executives do not care about analytics for the sake of analytics; they do care about optimizing marketing operations. Those are two very different things; the distinction is clear in my previous comment and requires no further elaboration.
Thomas,
Since your company sells SPSS and quasi-sells R, it’s fair to regard you as a KXEN competitor.
My company sells a lot of different things, and in some cases both competes and partners at the same time in some categories, including analytics. We do not compete with KXEN.
I disagree with the claim that SPSS does not compete with KXEN.
SPSS is a completely different business unit. Netezza competes in the data warehouse appliance space. I spend 100 percent of my time working with customers who use other tools, mostly SAS. And if our customers want to use KXEN that’s fine with us, too.
Doubtful that the folks over at SPSS think KXEN competes in the same league, either, but I don’t speak for them.
In any case, the assertion that any critique of KXEN must be competitor FUD is simply silly.
Thomas,
I don’t recall anybody making that suggestion. But “I’m not sure whether Product X meets requirement Y” can be less compelling when it comes from another vendor than when it comes from a user, or even from a prospect who’s gone through a full sales cycle and indeed cares about the requirement in question.
I think this is one of those cases. So far as I can tell, KXEN provides enough transparency into its models for most purposes (gut feel validation, intuition as to how to enhance the modeling, and so on). If there’s one transparency area in which it may fall short, that would be regulatory compliance — which to date hasn’t been much of an issue in the marketing area.
(I’m not wholly convinced that regulatory compliance will remain such a small issue in the marketing area, but that’s a subject for other threads.)
Curt,
You asked for my opinion. And you knew my employer before you asked.
TD
Thomas,
It sounded like you were saying that your comments on KXEN should be regarded as more authoritative than the company’s CEO’s, because he had vendor bias and you didn’t. That’s what got you a bit of pushback.
Everybody is biased by, if nothing else, the skews in their available information (see http://www.strategicmessaging.com/money-analyst-attention-and-implied-analyst-endorsement/2011/02/28/). If one’s conscientious, one can reduce that bias; but it’s very hard to eliminate entirely.
[…] SAS has an exceptionally broad feature set. But few parts of the SAS product line offer much in the way of […]