February 25, 2009
Partial overview of Ab Initio Software
Ab Initio is an absurdly secretive company, as per a couple of prior posts and the comment threads on same. But yesterday at TDWI I actually found civil people staffing an Ab Initio trade show booth. Based on that conversation and other tidbits, I think it’s fairly safe to say:
- Ab Initio sells high-end data integration software.
- Ab Initio commonly costs half a million dollars or so.
- Ab Initio’s core claims include:
- “It just works”
- Ab Initio has great performance, even on big tasks.
- Unlike many competitors, Ab Initio has an integrated product line written from scratch. (Hence the “Ab Initio” name.)
- Like most data integration software – Talend is an exception – Ab Initio includes an execution engine.
- Everybody agrees that Ab Initio’s software has great performance, although Talend claims to come close and Expressor claims to be faster yet. But rivals assert that besides having high license fees, Ab Initio’s software is also very consumptive of hardware resources. Certainly I’d suggest checking that aspect carefully if you ever get into an Ab Initio POC. (Perhaps that’s what Ab Initio means by saying its software uses any and all hardware resources. 😉 )
- Price isn’t the only regard in which Ab Initio is hard to do business with. Another is secretive business practices. For example, Ab Initio – confident in the quality of its software – pushes prospects toward POCs (Proofs-Of-Concept). But it wraps so many NDA requirements around these that some prospects walk away.
- Not surprisingly, Ab Initio has added lots of features over the years, especially in response to prospect or customer requests. Examples I was given include:
- IBM OS/390 support (including COBOL copybooks, etc.)
- SOAP/XML support. Associated with that is a story that boils down to “With great encapsulation, one can change a complex system of data integration processes incrementally without going crazy.”
- A compressed file system that can directly store 100s of TBs of user data, with very fast query performance. Apparently, this is not at the extremes of inflexibility, as it is realistic to have up to 5-6 keys on a table (at least). Associated with that is a story that boils down to “Hey, if you’re only getting at something via web services, you’re limited in how you can query it anyway. Worst case – your needs expand and you decide to put the data back in a true DBMS after all.”
- Similarly, Ab Initio claims that its software is easier to use than rival, cobbled-together products. While Ab Initio may not dispute the existence of products that can get data integration tasks done more simply, it argues that these products do a lot less than Ab Initio does.
Categories: Ab Initio Software, Analytic technologies, Benchmarks and POCs, Data integration and middleware, EAI, EII, ETL, ELT, ETLT, Expressor, Pricing, Talend
Comments
14 Responses to “Partial overview of Ab Initio Software”
Having used AbInitio, Talend and almost every other ETL/ELT product on the market, I would very much doubt any claim by Talend that they come close to AbInitio in performance. AbInitio has an amazing capacity to squeeze out every ounce of performance from the hardware it is running on. I do understand why they are so secretive: their approach is elegant and simple, but could be easily copied.
Ab Initio’s secrecy has been part of their marketing strategy since 1995, i.e., generate buzz in inside circles. That being said, they had some sharp technologists at the outset focusing on building a very scalable, high-performance platform bundled with professional services to ensure successful deployment.
That being said, it will be interesting to see where mapreduce-based ETL solutions will be in perhaps 5 years. I think parity is at least that far off given the lack of pipelining and the significant investment in application-layer tools for data hygiene, precedence, conflict resolution, change management, etc. But I never bet long term against the power of an open-source movement with momentum.
Clover ETL is as good as Ab Initio at a fraction of the cost. It is also very easy to use (as easy as Ab Initio, if not easier).
A book on Ab Initio for those who want a head start – http://www.lulu.com/product/paperback/ab-initio-etl-made-easy/11920613
Evidently that book on Ab Initio is gone– the Secrecy Squad must have gotten to them…
The Clover ETL guys reached out to me, then didn’t follow up.
Sometimes I think the whole ETL industry is screwy. (Just kidding … partly.)
The book is here
http://www.lulu.com/product/paperback/ab-initio-etl-made-easy/15145817
Is this Spanish software made in Mexico? And what’s up with the ‘Ab’ word? It sounds like one of those late-night infomercial advertisements.
Not Latin America. Just Latin.
Looks like the Ab Initio book is gone again from Lulu. Do they have Silencer Squads out that take out anyone who leaks the Secret Lore?
I’d love to know whether their usability really is what it’s cracked up to be– the few screenshots I have found on the web looked like nothing special, and the rest of the world is still moving… As far as squeezing all the performance out of your hardware, I can only speculate what they’re doing… maybe they’re really good at tuning integration/transformation to e.g. fit within the large caches on modern CPUs. And in the age of Hadoop, can they still justify their price point in terms of high performance on massive data volumes? So many unknowns.
Having worked multiple years with Ab Initio GDE, Co>OS, Technical Repository and their Metadata Hub up until v3.1.7.5, I recently used IBM DataStage v8.7:
First impression was: OK, this looks quite similar, components have become stages. Record formats are now specified on the links instead of in ports of components, makes more sense!
Second impression: NO WAY, DataStage, your user interface is this bad? I have to jump through hoops to write an expression containing an input field in a Transformer stage (equivalent of Reformat component)? I can only perform a lookup in a lookup stage?
There is no integrated source control in DataStage v8.7, no way to know if another developer worked on a parallel job (equivalent of graph) yesterday and is not finished editing it. Setting up a source control workflow using Git would suck because you’d have to explicitly export every job after every change going through a cumbersome slow export process…
What I’m saying is that I’ll choose Ab Initio over DataStage anytime, but I suppose tools like Pentaho or SAS DI Studio should be better than DataStage as well.
3 more enormous advantages for Ab Initio:
– Their documentation is clear, with clear, useful examples, written by the developers themselves I’d say. DataStage documentation is one of the worst I’ve seen.
– Ab Initio support is heaven on earth as well. You send them the error from your graph that can’t read data in parallel from a Teradata DB over 32 instances, and they’ll tell you Teradata only accepts 4 instances by default. Expert specialist advice about a product they just interface with.
– Debugging at every point between every 2 components (by enabling a watcher). Sounds normal until you use DataStage: put a Copy and a Watch stage in between your flow, copy over the record formats and you can look at a few samples of data… Unbelievable.
The Ab Initio book cannot be found there on Lulu. Can someone post the new URL?
A nice overview is here http://www.nimbusninetyignite.com/downloads/abinitio/abinitio_1.pdf