January 27, 2009
Introduction to Pentaho
I finally caught up with Pentaho, which along with Jaspersoft is one of the two most visible open source business intelligence companies, Actuate perhaps excepted. Highlights included:
- Much like Jaspersoft, Pentaho’s initial focus was mainly on embedded, operational BI.
- However, Pentaho now feels it has a decent end-user GUI as well, and traditional BI is a bigger part of sales.
- Also, some sales are focused on data integration, perhaps in support of more traditional BI products. Pentaho has even had an Ab Initio replacement in data integration. (Can there be any change more extreme than going from Ab Initio to open source?)
- As an example of technical breadth, Pentaho says that its Mondrian OLAP engine is used by Jaspersoft.
- Pentaho has Excel output, but not in the form of live formulas.
- Pentaho does XQuery.
- Industries with more Pentaho adoption than average include:
  - Financial services (traditionally open-source-friendly, according to Pentaho)
  - Government (ditto)
  - Web 2.0 (obviously ditto)
  - Travel/transportation (cash-strapped)
- Frontier Airlines is a Pentaho/Greenplum customer.
- TradeDoubler is a Pentaho/InfoBright customer. (Pentaho thinks that TradeDoubler reloads its warehouse every day, which if true frankly casts some doubt on InfoBright’s architecture.)
- Data mining is something of a Pentaho sideline. A university in New Zealand built the data mining capabilities in Pentaho, and some data mining research is done with them. Separately, Pentaho has been integrated with R.
- Community contributions are concentrated in the areas you’d expect — features some user or system integrator needs for a specific project, connectors, bug reports, and the like.
The briefing included one of the better slide decks I’ve seen in a while, which Pentaho gave me permission to share (in somewhat abbreviated form) here. In particular, Pentaho provided customer examples illustrating most of the use cases cited above.
Pentaho facts and figures include:
- Pentaho was founded in 2004. The first dozen or so reference customers were acquired in 2007. Before that, usage of the product was mainly downloads of a free version.
- Actually, Pentaho’s free usage is more focused on embedded libraries, while paid usage is more skewed to traditional BI.
- Pentaho’s average selling price is $24-25K for first year revenue, which is extremely close to Jaspersoft’s figure.
- There are 100,000+ downloads per month, but Pentaho cautions that’s a very misleading figure. Some users download over 100 different pieces of the product, including for example all the national language support and all the different platform-specific support pieces.
- Pentaho doesn’t offer much in the way of more realistic metrics of company size or success.
- Europe provides 35-40% of Pentaho revenue.
- Pentaho has at least one Asia/Pacific reference.
- 50% or so of Pentaho customers are on MySQL. Oracle and Postgres are in a rough tie for #2. That appears to be PostgreSQL rather than EnterpriseDB’s Postgres Plus.
Categories: Ab Initio Software, Application areas, Business intelligence, Data integration and middleware, EAI, EII, ETL, ELT, ETLT, Greenplum, Infobright, Jaspersoft, Pentaho, Pricing
Comments
7 Responses to “Introduction to Pentaho”
Another small unreasonable fact from the Pentaho PowerPoint: “On The Record – Public Wins Over Proprietary BI”, and then it appears the US Navy had replaced Oracle BI, BO and Cognos with Pentaho.
I’ve been using Pentaho Data Integration for some time now.
For an open source ETL tool it’s quite impressive. What I like most about it, is how transparent it is. Even the repository itself is modeled as a relational database that sits in our MySQL DB.
I have run into the odd bug, but the source and bug tracking are transparent, so it’s easy to establish the problem and move forward with workarounds.
After using Open Source software for a while I’ve become accustomed to this transparency and am beginning to prefer it over traditional software.
Neil,
I’m guessing based on your email address you have a small consulting/SI firm. Have you used Pentaho at multiple installations? And have you ever had to actually pay them for it, or do the free options totally suffice? 🙂
Best,
CAM
Hi Curt and Pete,
Apologies if we caused any confusion, but it’s worth clarifying a couple of points.
– The “wins over proprietary BI” are wins over proprietary BI. Some of them are replacements, but the majority are competitive wins for [new] projects, and we try to be clear about that although it’s not totally obvious in the slide. And we don’t count something as a “win” just because the customer mentions another vendor, only if they tell us that they were seriously considering selecting that vendor.
– Download numbers can frequently be misleading in a general sense in open source. It’s good to have high downloads, and to see downloads growing, especially when you’re getting started as an open source project as we were 4 years ago. But these days, we actively try to reduce “download inflation” by bundling multiple modules into a single install, making it clearer what people do and don’t need, etc. But as I said, not something that is unique to Pentaho from my conversations with others in open source.
– Data mining isn’t as mainstream as traditional Query, Reporting, and Analysis in the BI market generally, but it’s definitely an important part of our offering. The UK National Health Service is a good real-world example. That said, a lot of the use is either in academia or is considered too competitively sensitive to discuss publicly by the companies we have approached for case studies, so we don’t have as many case studies for data mining as we do for other products or our BI Suite.
Thanks again for the conversation, Curt.
-Lance
Hi Curt,
I feel the need to clarify one point here: Pentaho’s statement that “As an example of technical breadth, Pentaho says that its Mondrian OLAP engine is used by Jaspersoft.” In point of fact, Mondrian is a shared project whose copyrights are owned by many individuals and organizations, including Sherman Wood, the BI architect at Jaspersoft. Sherman has been a committer for Mondrian since before Pentaho “adopted” the project, and while they have done a great job of creating the impression they “own” the project, they do not. Julian Hyde, the Mondrian project lead, does some work for them, but spends most of his time as founder and Chief Architect at SQL*Stream. I agree that Lance does a fabulous job of marketing for the company, but on this point I take umbrage.
sorry I missed you at TDWI,
Nick