IBM and DB2
Analysis of IBM and various of its product lines in database management, analytics, and data integration.
- Cognos
- solidDB
- (in The Monash Report) Operational and strategic issues for IBM
- (in Text Technologies) IBM in the text analytics market
- (in Software Memories) Historical notes on IBM
- (in Software Memories) Historical notes on Informix
Some business trends in the data warehouse market
In recent conversations with various analytic DBMS vendors, a fairly consistent picture has emerged.
- Business is strong. Multiple vendors claim to be going gangbusters, with the happy sounds coming out of Vertica and Infobright being echoed by several competitors. Hearsay suggests some other companies in related businesses are doing well too. Depending on who you talk to, the business pickup dates back to Q4, give or take a quarter.
- Oracle Exadata has become a formidable competitor, on the strength of Exadata 2. Exadata 2’s positioning and perception among Oracle users seem to be pretty much in line with what Oracle portrayed to me.
- Teradata is portrayed as a weak competitor. Competitors don’t worry about Teradata nearly as much as they do about Oracle. That said, I suspect a bit of wishful thinking; Teradata is clearly still getting a lot of business the other vendors would dearly love to have.
- HP Neoview is reeling. (Almost) nobody sees Neoview competitively. The Walmart Neoview installation is said to have stayed small at best. JP Morgan Chase is said to have completely thrown Neoview out (and a bunch of HP engineers with it).
- (Almost) nobody mentions competing against DB2 either. This continues to baffle me.
Categories: Analytic technologies, Data warehousing, Exadata, HP and Neoview, IBM and DB2, JPMorgan Chase, Market share and customer counts, Oracle, Teradata | 4 Comments |
Intelligent Enterprise’s Editors’/Editor’s Choice list for 2010
As he has before, Intelligent Enterprise Editor Doug Henschen
- Personally selected annual lists of 12 “Most influential” companies and 36 “Companies to watch” in analytics- and database-related sectors.
- Made it clear that these are his personal selections.
- Nonetheless has called it an Editors’ Choice list, rather than Editor’s Choice. 🙂
(Actually, he’s really called it an “award.”)
Comments on the Gartner 2009/2010 Data Warehouse Database Management System Magic Quadrant
February, 2011 edit: I’ve now commented on Gartner’s 2010 Data Warehouse Database Management System Magic Quadrant as well.
At intervals of a little over a year, Gartner Group publishes a Data Warehouse Database Management System Magic Quadrant. Gartner’s 2009 data warehouse DBMS Magic Quadrant — actually, January 2010 — is now out.* For many reasons, including those I noted in my comments on Gartner’s 2008 Data Warehouse DBMS Magic Quadrant, the Gartner quadrant pictures are a bad use of good research. Rather than rehash that this year, I’ll merely call out some points in the surrounding commentary that I find interesting or just plain strange. Read more
Xkoto Gridscale highlights
I talked yesterday with cofounders Albert Lee and Ariff Kassam of Xkoto. Highlights included: Read more
Categories: Clustering, IBM and DB2, Market share and customer counts, Microsoft and SQL*Server, Parallelization, Pricing, Xkoto | 15 Comments |
Initial reactions to IBM acquiring SPSS
IBM is acquiring SPSS. My initial thoughts (questions by Eric Lai of Computerworld) include:
1) good buy for IBM? why or why not?
Yes. The integration of predictive analytics with other analytic or operational technologies is still ahead of us, so there was a lot of value to be gained from SPSS beyond what it had standalone. (That said, I haven’t actually looked at the numbers, so I have no comment on the price.)
By the way, SPSS coined the phrase “predictive analytics”, with the rest of the industry then coming around to use it. As with all successful marketing phrases, it’s somewhat misleading, in that it’s not wholly focused on prediction.
2) how does it position IBM vs. competitors?
IBM’s ownership immediately makes SPSS a stronger competitor to SAS. Any advantage to the rest of IBM depends on the integration roadmap and execution.
3) How does this particularly affect SAP and SAS and Oracle, IBM’s closest competitors by revenue according to IDC’s figures?
If one of Oracle or SAP had bought SPSS, it would have given them a competitive advantage against the other, in the integration of predictive analytics with packaged operational apps. That’s a missed opportunity for each.
One notable point is that SPSS is more SQL-oriented than SAS. Thus, SPSS has gotten performance benefits from Oracle’s in-database data mining technology that SAS apparently hasn’t.
IBM’s done a good job of keeping its acquired products working well with Oracle and other competitive DBMS in the past, and SPSS will surely be no exception.
Obviously, if IBM does a good job of Cognos/SPSS integration, that’s bad for competitors, starting with Oracle and SAP/Business Objects. So far business intelligence/predictive analytics integration has been pretty minor, because nobody’s figured out how to do it right, but some day that will change. Hmm — I feel another “Future of … ” post coming on.
4) Do you predict further M&A?
Always. 🙂
Related links
- Official word from SPSS and IBM
- Blog posts from Larry Dignan and James Taylor
- James Kobelius‘s post, which includes the obvious point that Oracle — unlike SAP — has pretty decent data mining of its own
- Eric Lai‘s actual article
Categories: Analytic technologies, Cognos, IBM and DB2, Oracle, SAP AG, SAS Institute | 8 Comments |
Notes on CEP performance
I’ve been talking to CEP vendors on and off for a few years. So what I hear about performance is fairly patchwork. On the other hand, maybe 1-2+ year-old figures of per-core performance are still meaningful today. After all, Moore’s Law is being reflected more in core count than per-core performance, and it seems CEP vendors’ development efforts haven’t necessarily been concentrated on raw engine speed.
So anyway, what do you guys have to add to the following observations?
- Super-low-latency financial services industry tasks are often “embarrassingly parallel.” Thus, near-linear scale-out is common.
- That said, good parallelism seems fairly new in CEP engines (of course, CEP engines are fairly new themselves — for all I know, some have been parallel since inception).
- I’ve heard claims of up to 400,000 messages/second/core for simple queries or patterns.
- I’ve heard claims of 70,000 messages/core for not-so-simple queries or patterns, and probably higher than that depending on what the meaning of “simple” is.
- IBM just disclosed >15,000 messages/core on a pretty low-powered processor.
- I’ve heard that Coral8, Apama, and StreamBase rarely lost deals due to performance or throughput problems. I’ve heard that the same is not as true of Aleri.
- StreamBase proudly says it’s been fully multithreaded since academic research-project days. For Apama multithreading is evidently a more recent feature. But does it matter much?
Categories: Aleri and Coral8, IBM and DB2, Memory-centric data management, Progress, Apama, and DataDirect, StreamBase, Streaming and complex event processing (CEP) | 13 Comments |
Followup on IBM System S/InfoSphere Streams
After posting about IBM’s System S/InfoSphere Streams CEP offering, I sent three followup questions over to Jeff Jones. It seems simplest to just post the Q&A verbatim.
1. Just how many processors or cores does it take to get those 5 million messages/sec through? A little birdie says 4,000 cores. Read more
Categories: Analytic technologies, IBM and DB2, Investment research and trading, Streaming and complex event processing (CEP) | 7 Comments |
IBM System S Streams, aka InfoSphere Streams, aka stream processing, aka “please don’t call it CEP”
IBM has hastily announced System S Streams, a product that was supposed to be called InfoSphere Streams and introduced only in 2010. Apparently, the rush is because senior management wanted to talk about it later this week, and perhaps also because it was implicitly baked into some of IBM’s advertising already. Scrambling ensued. Even so, Jeff Jones and team got to me fast, and briefed me — fairly non-technically, unfortunately, but otherwise how I like it, namely on a harmless embargo and without any NDAs. That’s more than can be said for my clients at Microsoft, who also introduced CEP this week, but I digress …
*Indeed, as I draft this post-Celtics-game, the embargo is already expired.
Marketing aside, IBM System S/InfoSphere Streams is indeed a CEP/stream processing engine + language (with an Eclipse-based development environment). Apparently, IBM’s thinks InfoSphere Streams (if that’s what it winds up being renamed to) is or will be differentiated from other CEP packages in:
- Scale-out. (That’s the one that appears to be real today. In fact, there’s a prototype running on Blue Gene.)
- Support for complex datatypes such as XML, text, voice, video, etc.
- Security and general industrial-strengthness.
Categories: Analytic technologies, Application areas, IBM and DB2, Investment research and trading, Scientific research, Streaming and complex event processing (CEP) | 3 Comments |
Oracle’s hardware strategy
Larry Ellison stated clearly in an email interview with Reuters (links here and here) that Oracle intends to keep Sun’s hardware business and indeed intends to invest in the SPARC chip. Naturally, I have a few thoughts about this.
As Stephen O’Grady points out, Sun’s main strength lay in selling to the large enterprise market. Well, that’s Oracle’s overwhelming focus too. As I noted two years ago:
One Oracle response is to provide lots of add-on technologies for high-end customers, on the database and middle tiers alike. In app servers it’s done surprisingly well against BEA. It’s sold a lot of clustering. And it’s bought into and tried to popularize niche technologies like TimesTen and Tangosol’s.
This all makes perfect sense – it’s a great fit for Oracle’s best customers, and a way to get thousands of extra dollars per server from enterprises that may already have bought all-you-can-eat licenses to the Oracle DBMS. And being so sensible, it fits into the Clayton Christensen disruption story in two ways:
Oracle may be helpless against mid-tier competition, but it sure has the high-end core of its market locked up.
- As one type of technology is commoditized, value is created in other parts of the technology stack.
Oracle’s ongoing acquisition spree in system software, application software, and now hardware just supports that story. MySQL, embedded Java, and so on may be welcome to Oracle as yet more opportunities to tap additional markets — but Oracle’s emphasis is and surely will remain on the large enterprise market.
The next notable point may be found in Larry’s key quote: Read more
Categories: Data warehouse appliances, Data warehousing, Exadata, HP and Neoview, IBM and DB2, Oracle | 8 Comments |
Some DB2 highlights
I chatted with IBM Thursday, about recent and imminent releases of DB2 (9.5 through 9.7). Highlights included:
- DB2 is getting Oracle emulation, which I posted about separately.
- IBM says that it had >50 new DB2 data warehouse customers last year. I neglected to ask how many of these had been general-purpose DB2 customers all along.
- By “data warehouse customer” I mean a user for InfoSphere Warehouse, which previously was called DB2’s DPF (Data Partitioning Feature). Apparently, this includes both logical and physical partitioning. E.g., DB2 isn’t shared-nothing without this feature.
- IBM is proud of DB2’s compression, which it claims commonly reaches 70-80%. It calls this “industry-leading” in comparison to Oracle, SQL Server, and other general-purpose relational DBMS.
- DB2 compression’s overall effect on performance stems from a trade-off between I/O (lessened) and CPU burden (increased). For OLTP workloads, this is about a wash. For data warehousing workloads, IBM says 20% performance improvement from compression is average.
- DB2 now has its version of one of my favorite Oracle security features, called Label Based Access Control. A label-control feature can make it much easier to secure data on a row-by-row, value-by-value basis. The obvious big user is national intelligence, followed by financial services. IBM says the health care industry also has interest in LBAC.
- Also in the security area, IBM reworked DB2’s audit feature for 9.5
- I think what I heard in our discussion of DB2 virtualization is:
- Increasingly, IBM is seeing production use of VMware, rather than just test/development.
- IBM believes it is a much closer partner to VMware than Oracle or Microsoft is, because it’s not pushing its own competing technology.
- Generally, virtualization is more important for OLTP workloads than data warehousing ones, because OLTP apps commonly only need part of the resources of a node while data warehousing often wants the whole node.
- AIX data warehousing is an exception. I think this is because AIX equates to big SMP boxes, and virtualization lets you spread out the data warehousing processing across more nodes, with the usual parallel I/O benefits.
- When IBM talks of new autonomic/self-tuning features in DB2, they’re used mainly for databases under 1 terabyte in size. Indeed, the self-tuning feature set doesn’t work with InfoSphere Warehouse.
- Even with the self-tuning feature it sounds as if you need at least a couple of DBA hours per instance per week, on average.
- DB2 on Linux/Unix/Windows has introduced some enhanced workload management features analogous to those long found in mainframe DB2. For example, resource allocation rules can be scheduled by time. (The point of workload management is to allocate resources such as CPU or I/O among the simultaneous queries or other tasks that contend for them.) Workload management rules can have thresholds for amounts of resources consumed, after which the priority for a task can go up (“Get it over with!”) or down (“Stop hogging my system!”).