IBM and DB2
Analysis of IBM and various of its product lines in database management, analytics, and data integration.
- Cognos
- solidDB
- (in The Monash Report) Operational and strategic issues for IBM
- (in Text Technologies) IBM in the text analytics market
- (in Software Memories) Historical notes on IBM
- (in Software Memories) Historical notes on Informix
DB2 Express-C — what IBM said
The following was received from IBM a few days before the DB2 Express Edition announcement. Due to an email glitch, it took a month for me to get permission to post it. Well, here it is, finally. The emphasis is all mine.
… IBM will announce DB2 Express-C, a free version of DB2 Express tailored for developer and partner communities. With the same core data server as DB2 Express Edition, DB2 Express-C will be free for download, development use, deployment, and redistribution. Support for DB2 Express-C will be available via a free online forum hosted and monitored by a community of DB2 experts. For traditional 24×7 IBM support, customers may acquire the appropriate license for DB2 Express Edition.
The following partners recognize the value DB2 Express-C adds to the solutions they provide their clients: ActiveGrid, AMD, Business Objects, Fourth Millenium Technologies, Intel, Mandriva, Mikropis, Nitix, Novell, Quest, Red Hat Linux, Retalon, Ubuntu Linux and Zend. Watch for news from many of these companies about distribution of DB2 Express-C with their offerings — for example Zend’s upcoming update to Zend Core for IBM.
DB2 Express-C Key Messages
Production-ready — DB2 Express-C provides development flexibility through support for a wide variety of software development environments and tools. DB2 Express-C also provides deployment and support flexibility by removing software license charges and offering free community support. In addition, DB2 Express-C places no database size limitations on developers.
DB2 Express-C also benefits from IBM investment in DB2 autonomics (self-management) technologies, performance optimization and resiliency. For example, its silent embedded installation and automatic object maintenance make DB2 Express-C well-suited for seamless integration into partner applications.
Developer community — Developers in a wide variety of development environments can draw on resources like developerWorks and alphaWorks to gain access to IBM support and emerging technologies from IBM research and development laboratories. Skills and applications developed with DB2 Express-C are directly applicable to all editions of DB2. At IBM, a DB2 Express-C community team has been formed to nurture DB2 community development and work with a variety of developer, ISV and open source community organizations.
Innovative technology — DB2 Express-C will be refreshed with the forthcoming “Viper” release of DB2 currently in beta test. DB2 “Viper” is the industry’s first hybrid data server, delivering superior performance managing both XML and relational data structures. Early adopters have cited cost savings achieved through reduced development time and improved performance of applications using XML and relational data. These values provide critical advantages for the growing number of solutions implementing a service oriented architecture (SOA) and working with XML-based vertical industry standards.
DB2 Express-C Product Specifics
DB2 Express-C supports the Windows and Linux operating systems on various 32-bit and 64-bit processor architectures. Current Linux distributions supported include Novell Open Enterprise Server 9, Red Hat Enterprise Linux 3 and 4, SUSE Linux Enterprise Server 8 and 9, Asianux 1.0, Mandriva Corporate Server 3.0, Nitix 4.2.2a, Red Flag Advanced Server 4.1, and Ubuntu 5.04. Several Linux distributors will include DB2 Express-C in their Linux distributions. Details: http://ibm.com/db2/linux/validate
As with DB2 Express Edition, DB2 Express-C may be deployed on all systems with up to 2 processor cores, and on AMD or Intel x86 systems with up to 2 dual-core chips. 4 GB of memory is the maximum supported. There is no limit to database size, however. Other editions of DB2 exist to support larger servers or clusters of servers with a seamless upgrade from DB2 Express-C.
No-charge community support for DB2 Express-C is provided via a public Web forum. For-fee support is available through purchase of a license for DB2 Express Edition, Workgroup Edition or Enterprise Server Edition.
Software development communities, environments and languages supported include PHP, Python, Perl, Rational Web Developer, .NET with Microsoft Visual Studio, Java with Eclipse, Quest Toad for DB2, Zend Studio and Zend Core for IBM.
Additional Information
For two examples of transaction processing performance with DB2 Express, visit:
On Linux: http://www.tpc.org/tpcc/results/tpcc_result_detail.asp?id=104071601
On Windows: http://www.tpc.org/tpcc/results/tpcc_result_detail.asp?id=104041401
NOTE: These benchmark results were published prior to the announcement of DB2 Express-C and were run using DB2 Express Edition. Price/performance metrics do not reflect the absence of support charges for DB2 Express-C.
For more on DB2 Express-C, including a download link, please visit http://ibm.com/db2/express . (This link will be activated at the time of our announcement.) More information on IBM Information Management Software can be found at http://ibm.com/software/data , and more information on DB2 can be found at http://ibm.com/db2 . DB2 Magazine is available at http://www.db2mag.com .
Categories: IBM and DB2
DB2 Express-C
IBM announced the freeware version of DB2 today. I’ll post links to the details later, but I want to highlight a couple of interesting implications:
1. They define the cutoff between the free and paid version not by how big a database you can manage on disk, but rather by how much RAM the software can address. This supports my thesis that effective use of RAM is crucial to DBMS performance, and its corollary: specially optimized memory-centric data management products deserve a place in most large enterprises’ product portfolios.
2. Having a free version of DB2 lets one play with whatever features DB2 may have that simply aren’t available in other DBMS, to see if they’re worth using. And the most significant such feature, in my opinion, is native XML storage. Whatever else this product does or doesn’t accomplish, it may serve to speed adoption of IBM’s native XML server technology.
Categories: IBM and DB2, Memory-centric data management, Mid-range, OLTP, Structured documents
SAP, MaxDB, and MySQL, updated
I’ve had a chance to clarify and correct my understanding of the relationship between SAP, MaxDB, and MySQL. The story is this:
- MySQL has the right to sell MaxDB, but apparently isn’t focusing much on that.
- The MySQL and MaxDB code lines are NOT merging, for technical reasons. For example, the older MaxDB does a lot of its own thread management, while MySQL relies on the operating system for that.
- When SAP thinks a DBMS is capable of running SAP’s apps, it adds the DBMS to its product catalog and resells it. Yes, even Oracle. That’s why all my discussions with SAP of MySQL’s enterprise-readiness quickly come back to an exhaustive multi-year certification process.
- My personal best guess as to when MySQL will be in SAP’s product catalog is 1 1/2 – 3 years from now.
And by the way, MaxDB’s share in SAP’s user base is about the same as DB2’s (at least DB2 for open systems). MaxDB is being aggressively supported, and nobody should get any ideas to the contrary!
Categories: IBM and DB2, MySQL, Open source, Oracle, SAP AG
Two kinds of DBMS extensibility
Microsoft took slight exception to my claim that they lack fully general DBMS extensibility. The claim is actually correct, but perhaps it could lead to confusion. And anyhow there’s a distinction here worth drawing, namely:
There are two different kinds of DBMS extensibility.
The first one, which Microsoft has introduced in SQL Server 2005 (but which other vendors have had for many years) is UDTs (User-Defined Types), sometimes in other systems called user-defined functions. These are in essence datatypes that are calculated functions of existing datatypes. You could use a UDT, for example, to make the NULLs in SQL go away, if you hate them. Or you can calculate bond interest according to the industry-standard “360 day year.” Columns of these datatypes can be treated just like other columns — one can use them in joins, one can index on them, the optimizer can be aware of them, etc.
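To make the bond example concrete, here is a minimal Python sketch of the arithmetic such a type might wrap. It assumes a simplified 30/360 day count, which is one common reading of the "360 day year"; the function names are mine, and nothing here is tied to any vendor's UDT syntax.

```python
from datetime import date
from decimal import Decimal


def days_30_360(start: date, end: date) -> int:
    """Day count under a simplified 30/360 convention (an assumption:
    real conventions add further end-of-month rules)."""
    d1 = min(start.day, 30)          # a start on the 31st counts as the 30th
    d2 = end.day
    if d2 == 31 and d1 == 30:        # an end on the 31st counts as the 30th
        d2 = 30
    return (360 * (end.year - start.year)
            + 30 * (end.month - start.month)
            + (d2 - d1))


def accrued_interest(face: Decimal, annual_rate: Decimal,
                     start: date, end: date) -> Decimal:
    """Simple interest accrued between two dates on a 360-day year."""
    return face * annual_rate * days_30_360(start, end) / 360


# Three calendar months count as exactly 90 days, whatever their real length.
days = days_30_360(date(2006, 1, 15), date(2006, 4, 15))
interest = accrued_interest(Decimal("1000"), Decimal("0.06"),
                            date(2006, 1, 15), date(2006, 4, 15))
```

Packaged as a datatype inside the DBMS, a calculation like this would be available to joins, indexes, and the optimizer rather than living only in application code.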
The second one, commonly known by the horrible name of abstract datatypes (ADTs), is found mainly in Oracle, DB2, and previously the Informix/Illustra products. Also, if my memory is accurate, Ingres has a very partial capability along those lines, and PostgreSQL is said to be implementing them too. ADTs offer a way to add totally new datatypes into a relational system, with their own data access methods (e.g., index structures). That’s how a DBMS can incorporate a full-text index, or a geospatial datatype. It can also be a way to more efficiently implement something that would also work as a UDT.
In theory, Oracle et al. expose the capability to users to create ADTs. In practice, you need to be a professional DBMS developer to write them, and they are done either by the DBMS vendors themselves, or by specialist DBMS companies. E.g., much geospatial data today is stored in ESRI add-ons to Oracle; ESRI of course offered a specialty geospatial DBMS before ADTs were on the market.
Basically, implementing a general ADT capability is a form of modularity that lets new datatypes be added more easily than if you don’t have it. But it’s not a total requirement for new datatypes. E.g., I was wrong about Microsoft’s native XML implementation; XML is actually managed in the relational system. (More on that in a subsequent post.)
Categories: Actian and Ingres, Data types, IBM and DB2, Microsoft and SQL*Server, Open source, Oracle, Theory and architecture
Native XML storage, Part 1 (technology)
IBM’s “Viper” version of DB2 is in open beta test, whatever that means, and Microsoft’s SQL Server 2005, née Yukon, is in general release. Both have native XML capabilities surpassing Oracle’s – which is interesting in its own right, because it’s rare for either of those vendors to pull ahead of Oracle in an OLTP feature, and almost unprecedented for both to do so at once.
So let’s talk about native XML support, what it is, and who might or should care about it. (Well, the apps part is actually in a separate Part 2 post.) Most of this is based on research that’s several months old, but except for a scarcity of actual user interviews, that shouldn’t matter much.
There are two main non-native ways to put XML into a SQL database such as Oracle – shredding and LOBs (BLOBs or CLOBs – i.e., Binary or Character Large OBjects). Both can perform poorly, for different reasons. Shredding takes XML documents and distributes them among a bunch of tables. So one update in XML can become many updates when shredded, and one lookup in XML can become a complex join from shredded storage. LOB storage obviates those problems, but creates another – even when you’re only looking for part of a document, you have to retrieve and handle the whole thing, and the same goes for updates.
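A toy Python sketch may make the contrast concrete. It uses SQLite and a made-up order document; the table names, column names, and XML layout are all my own assumptions for illustration, not anything Oracle or any other vendor actually does.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical document, invented for this sketch.
DOC = """<order id="42">
  <customer>Acme</customer>
  <item sku="A1" qty="2"/>
  <item sku="B7" qty="1"/>
</order>"""


def store_as_lob(conn, doc_id, xml_text):
    """LOB approach: the whole document goes into a single column."""
    conn.execute("CREATE TABLE IF NOT EXISTS order_lob "
                 "(doc_id INTEGER PRIMARY KEY, body TEXT)")
    conn.execute("INSERT INTO order_lob VALUES (?, ?)", (doc_id, xml_text))


def shred(conn, xml_text):
    """Shredding: one document is spread across several tables.
    Returns the number of row inserts that one document caused."""
    conn.execute("CREATE TABLE IF NOT EXISTS orders "
                 "(order_id INTEGER PRIMARY KEY, customer TEXT)")
    conn.execute("CREATE TABLE IF NOT EXISTS order_items "
                 "(order_id INTEGER, sku TEXT, qty INTEGER)")
    root = ET.fromstring(xml_text)
    order_id = int(root.get("id"))
    conn.execute("INSERT INTO orders VALUES (?, ?)",
                 (order_id, root.findtext("customer")))
    rows = [(order_id, item.get("sku"), int(item.get("qty")))
            for item in root.findall("item")]
    conn.executemany("INSERT INTO order_items VALUES (?, ?, ?)", rows)
    return 1 + len(rows)


def skus_from_lob(conn, doc_id):
    """Even for one small fragment, the whole document is fetched and parsed."""
    (body,) = conn.execute("SELECT body FROM order_lob WHERE doc_id = ?",
                           (doc_id,)).fetchone()
    return [item.get("sku") for item in ET.fromstring(body).findall("item")]


def skus_from_shredded(conn, order_id):
    """The same lookup against shredded storage becomes a join."""
    cur = conn.execute(
        "SELECT i.sku FROM orders o "
        "JOIN order_items i ON i.order_id = o.order_id "
        "WHERE o.order_id = ? ORDER BY i.sku", (order_id,))
    return [sku for (sku,) in cur]


conn = sqlite3.connect(":memory:")
store_as_lob(conn, 42, DOC)
inserts = shred(conn, DOC)
lob_skus = skus_from_lob(conn, 42)
shredded_skus = skus_from_shredded(conn, 42)
```

Here one small document already costs three row inserts when shredded, while the LOB lookup has to parse the full text just to read two attributes. Real documents are far larger and more deeply nested, which is where both costs start to hurt.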
So native storage can be a good thing when you can afford neither the performance hit of shredding, nor of LOB storage, nor of any available hybrid. It also could be good if getting good performance from non-native storage, while possible, would create undue burdens on application development, or if there’s some other reason one or both of the shredding and LOB approaches isn’t viable.
One nice feature is that native-XML storage has almost no downside, at least if you get it from the high-end DBMS vendors. IBM, Oracle, and Microsoft have all worked out ways to have integrated query parsing and query optimization, while letting storage be more or less separate. More precisely, Oracle actually still sticks everything into one data store (hence the lack of native XML support), but allows near-infinite flexibility in how it is accessed. Microsoft has already had separate servers for tabular data, text, and MOLAP, although like Sybase, it doesn’t have general datatype extensibility that it can expose to customers, or exploit itself to provide a great variety of datatypes. IBM has had Oracle-like extensibility all along, although it hasn’t been quite as aggressive at exploiting it; now it’s introduced a separate-server option for XML. Both Microsoft and IBM claim that their administrative tools are slick enough that the DBA has little more work with their offerings than would be present in a true single-server solution.
So how does the storage actually work? The basic idea is exactly what you’d think. Data is stored in name-value pairs, with pointers connecting parents to children. The secret sauce (and here I have less detail than I’d like) is the extra information that’s stored, either at the nodes directly, or in an overarching index. Obviously, there’s a tradeoff between update and retrieval speed. And equally obviously, I need to learn more of the particulars.
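As a rough illustration of that idea, and only that, here is a Python sketch of nodes held as name-value pairs with parent and child pointers, plus one possible "overarching index" keyed by path. The path index and all class and method names are my assumptions; I don't know that either IBM or Microsoft does it this way.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(eq=False)  # identity equality; avoids recursing through the pointers
class Node:
    name: str
    value: Optional[str] = None
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)


class NodeStore:
    """Toy node store: a pointer-linked tree plus a path-to-nodes index."""

    def __init__(self, root_name: str):
        self.root = Node(root_name)
        self.path_index: Dict[str, List[Node]] = {"/" + root_name: [self.root]}

    def path_of(self, node: Optional[Node]) -> str:
        # Follow parent pointers up to the root to rebuild the path.
        parts = []
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return "/" + "/".join(reversed(parts))

    def add_child(self, parent: Node, name: str,
                  value: Optional[str] = None) -> Node:
        child = Node(name, value, parent)
        parent.children.append(child)
        # The extra write on every insert is the update-side cost of the index.
        self.path_index.setdefault(self.path_of(child), []).append(child)
        return child

    def find(self, path: str) -> List[Node]:
        # Retrieval is a single lookup; no tree walk is needed.
        return self.path_index.get(path, [])


store = NodeStore("order")
customer = store.add_child(store.root, "customer", "Acme")
for sku in ("A1", "B7"):
    item = store.add_child(store.root, "item")
    store.add_child(item, "sku", sku)

skus = [node.value for node in store.find("/order/item/sku")]
```

The tradeoff shows up even at this scale: every insert pays to keep the index current, and in exchange a path lookup never has to walk the tree.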
And on that somewhat lame note, let me point you at Part 2 of this post, which discusses whether and how this stuff will actually be used. (Preview: It will, big time – I think.)
Categories: IBM and DB2, Microsoft and SQL*Server, Oracle, Structured documents
EII marketing soup
In the comments to another thread, the subject of EII (Enterprise Information Integration) came up. It’s a tricky one, for several reasons.
First, it’s a marketing construction, a blend of ETL (Extract, Transform, Load) and EAI (Enterprise Application Integration). It’s a legitimate category; all those things are getting smushed together as near-real-time apps become more prominent. Still, it’s also an attempt to grab marketing turf.
Second, it’s commonly associated with a marketing overreach — the claim that an EII “platform” or “suite” will do everything a DBMS does (almost), but fully and heterogeneously distributed as well. Yeah, right.
Third, two of the sharpest proponents have been acquired by behemoths that tend to obscure their acquirees’ marketing pitches: Ascential by IBM and SeeBeyond by Sun.
Fourth, some of the best grand integrated EII suites (at least the ones that started as ETL, which is the side I’m more familiar with) aren’t complete yet. So vendors didn’t want to be too clear for fear of freezing current sales. I’m referring here mainly to Ascential and Informatica. They told analysts of their grand plans, but they haven’t been so eager to openly publicize the full details.
Fifth, the area is getting integrated with development tools for composite applications. Good examples there are SeeBeyond and InterSystems’ Caché.
Sixth, no EII vendor’s plans fully work unless they have full relational and XML integration, and nobody really has been doing a great job on that, typically being strong in one area or the other.
Obviously, this is an area I have to research actively; EII is the neuromuscular system that holds DBMS2 together. But all the research in the world won’t change the fact that as of now it’s the weak spot in the story. There’s lots of great database management technology, and lots of excellent reasons to use a variety of kinds of that technology in your enterprise. But the tools to knit the resulting heterogeneous databases together are still sadly deficient.
The end of the single-server DBMS vendor
For all practical purposes, there are no DBMS vendors left advocating single-server strategies. Oracle was the last one, but it just acquired in-memory data management vendor TimesTen, which will be used as a cache in front of high-performance Oracle databases. (It will also continue to be sold for stand-alone uses, especially in the financial trading and defense/intelligence markets.)
IBM’s Viper is a server-and-a-half story, with lots of integration over a dual-server (one relational, one native XML) base. IBM also is moving aggressively in data integration/federation, with Ascential and many other acquisitions. It also sells a broad range of database products itself, including two DB2s, several Informix products, and so on.
Microsoft also has a multi-server strategy. In its case, relational, text, and MOLAP storage are more separate than in Oracle’s or even IBM’s products; again, there’s a thick layer of technology on top integrating them. An eventual move to native XML storage will, one must imagine, be handled in the same way.
Smaller vendors Sybase and Progress also offer multiple DBMS each.
Teradata is a pretty big player with only one DBMS — but it’s specialized for data warehousing. Teradata is the first to tell you you should use something else for your classical transaction processing.
The Grand Unified Integrated Database theory is, so far as I can tell, quite dead. Some people just refuse to admit that fact.
Categories: Database diversity, IBM and DB2, In-memory DBMS, Microsoft and SQL*Server, MOLAP, Oracle, Progress, Apama, and DataDirect, Sybase, Teradata, Theory and architecture
Down with database consolidation!
As with all changes in information technology, the move to DBMS2 will largely be one of evolution. But it does have a couple of revolutionary aspects.
Short-term, the biggest change is a renunciation of database and DBMS vendor consolidation. Consolidation never has worked, it never will work, and as data integration technologies keep improving it’s not that important anyway.
IBM and Oracle offer really great, brilliantly complex data warehousing technology. But if you want the most bang for the buck, forget about them, and go instead with a specialty vendor. Depending on the specifics of your situation, Teradata, Netezza, DATAllegro, WhiteCross, or SAP may offer the best choice, and that list could be even longer.
Similarly, for generic OLTP data management, cheap and/or open source options are getting ever more attractive. Microsoft is a serious contender for applications that previously only Oracle and IBM could handle, while MySQL and maybe Ingres are moving up the food chain right behind.
In many cases, these alternative technologies are lower-cost across the board: Lower purchase price, lower ongoing maintenance fees, and lower administrative costs.
So what, again, is the case for consolidation?
Categories: Actian and Ingres, Analytic technologies, Data warehouse appliances, Database diversity, IBM and DB2, Kognitio, Memory-centric data management, Microsoft and SQL*Server, MOLAP, MySQL, Netezza, Open source, Oracle, SAP AG, Theory and architecture