Talend update
I chatted yesterday at TDWI with Yves de Montcheuil of Talend, as a follow-up to some chats at Teradata Partners in October. This time around I got more metrics, including:
- Talend revenue grew 6-fold in 2008.
- Talend revenue is expected to grow 3-fold in 2009.
- Talend had >400 paying customers at the end of 2008.
- Talend estimates it has >200,000 active users. This is based on who gets automated updates, looks at documentation, etc.
- ~1/3 of Talend’s revenue is from large customers. 2/3 is from the mid-market.
- Talend has had ~700,000 downloads of its core product, and >3.3 million downloads in all (including documentation, upgrades, etc.).
It seems that Talend’s revenue was somewhat shy of $10 million in 2008.
Specific large paying customers Yves mentioned include: Read more
Categories: Analytic technologies, Data integration and middleware, EAI, EII, ETL, ELT, ETLT, eBay, Market share and customer counts, Specific users, Talend | 5 Comments |
Update on Aster Data Systems and nCluster
I spent a few hours at Aster Data on my West Coast swing last week. The company has now officially put out Version 3 of nCluster. Highlights included: Read more
eBay doesn’t love MapReduce
The first time I ever heard from Oliver Ratzesberger of eBay, the subject line of his email mentioned MapReduce. That was early this year. Subsequently, however, eBay seems to have become a MapReduce non-fan. The reason is simple: eBay’s parallel efficiency tests show that MapReduce leaves most processors idle most of the time. The specific figure they mentioned was parallel efficiency of 18%.
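For reference, parallel efficiency is conventionally defined as speedup divided by the number of processors used, so 18% means the cluster was delivering less than a fifth of its theoretical throughput. A minimal sketch of the calculation (the 18% is eBay’s figure; the node count and runtimes below are made-up numbers, purely for illustration):

```python
def parallel_efficiency(serial_time, parallel_time, num_processors):
    """Parallel efficiency = speedup / processor count.

    speedup = serial_time / parallel_time; an efficiency of 1.0 means
    every processor is doing useful work the whole time.
    """
    speedup = serial_time / parallel_time
    return speedup / num_processors

# Hypothetical inputs -- eBay did not publish these, only the 18% result.
# A job that takes 1,800 minutes on one node but 100 minutes on 100 nodes:
eff = parallel_efficiency(serial_time=1800, parallel_time=100, num_processors=100)
print(f"parallel efficiency = {eff:.0%}")   # 18% -- most of the cluster sits idle
```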
Categories: eBay, MapReduce, Parallelization | 7 Comments |
Teradata’s Petabyte Power Players
As previously hinted, Teradata has now announced 4 of the 5 members of its “Petabyte Power Players” club. These are enterprises with 1+ petabyte of data on Teradata equipment. As is commonly the case when Teradata discusses such figures, there’s some confusion as to how they’re actually counting. But as best I can tell, Teradata is counting: Read more
Categories: Data warehousing, eBay, Market share and customer counts, Petabyte-scale data management, Specific users, Teradata | 11 Comments |
Some of Oracle’s largest data warehouses
Googling around, I came across an Oracle presentation – given sometime this year – that lists some of Oracle’s largest data warehouses. 10 databases in total are listed at >16 TB, which is fairly consistent with Larry Ellison’s confession during the Exadata announcement that Oracle has trouble over 10 TB (something a few Oracle partisans have given me a lot of flak for pointing out … 😀 ).
However, what’s being measured is probably not the same in all cases. For example, I think the Amazon 70 TB figure is obviously for spinning disk (elsewhere in the presentation it’s stated that Amazon has 71 TB of disk). But the 16 TB British Telecom figure probably is user data — indeed, it’s the same figure Computergram cited for BT user data way back in 2001.
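To illustrate why the measurement basis matters: a raw-disk figure includes mirroring, indexes, temp/work space, and the like, so it can easily run several times the amount of user data. A back-of-the-envelope sketch, in which the overhead factors are my illustrative assumptions rather than anything from the Oracle presentation:

```python
# Illustrative only: the overhead factors are assumptions, not from Oracle's slides.
raw_disk_tb = 70          # the Amazon figure as stated in the presentation
mirroring   = 2.0         # e.g., RAID-1 style mirroring doubles the footprint
index_temp  = 1.5         # indexes, temp/work space, logs on top of base tables

user_data_tb = raw_disk_tb / (mirroring * index_temp)
print(f"~{user_data_tb:.0f} TB of user data on {raw_disk_tb} TB of spinning disk")
# Under these assumptions, 70 TB of disk holds roughly 23 TB of user data --
# which is why disk-based and user-data-based figures shouldn't be compared directly.
```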
The list is: Read more
Categories: Data warehousing, Oracle, Specific users, Telecommunications, Yahoo | 6 Comments |
More mysteries regarding Oracle CDR load speed
Last spring, DATAllegro user John Devolites of TEOCO told me of troubles his firm had had loading CDRs (Call Detail Records) into Oracle, and how those had been instrumental in his eventual adoption of DATAllegro. That claim was contemptuously challenged in a couple of comment threads.
Well, tonight at the Netezza user conference, Netezza gave awards to its first customers. The very first to accept was Jim Hayden, who’d bought Netezza for a company called Vibrant Solutions, which coincidentally was later acquired by TEOCO itself. In front of hundreds of people, he talked about how, back in 2003, it had taken 23 hours to load 400 million CDRs into Oracle on Nextel’s behalf, but only 40 minutes on Netezza.
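Taking both of those figures at face value, the arithmetic works out roughly as follows (a quick sketch; the rates are implied by the numbers Hayden cited, not separately reported):

```python
# Back-of-the-envelope comparison of the two load times Jim Hayden cited.
cdrs = 400_000_000
oracle_seconds  = 23 * 3600       # 23 hours
netezza_seconds = 40 * 60         # 40 minutes

oracle_rate  = cdrs / oracle_seconds    # ~4,800 CDRs/second
netezza_rate = cdrs / netezza_seconds   # ~167,000 CDRs/second
print(f"Oracle:  ~{oracle_rate:,.0f} CDRs/sec")
print(f"Netezza: ~{netezza_rate:,.0f} CDRs/sec")
print(f"Speedup: ~{oracle_seconds / netezza_seconds:.1f}x")   # 34.5x
```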
And I’ll erase the rest of what I’d drafted here, as it was dripping in sarcasm …
Categories: Data warehousing, Netezza, Oracle, Telecommunications, TEOCO | 2 Comments |
Teradata/Netezza/Tesco kerfuffle
Netezza evidently put out a press release bragging of a competitive replacement of Teradata at UK retailing giant Tesco. That press release can no longer be found on Netezza’s site, but it lives on elsewhere. Meanwhile, Teradata has put out a press release in which Tesco is quoted emphatically contradicting what it is quoted as saying in the Netezza press release. While I haven’t discussed this with Netezza, my guess is that somebody there got a little overenthusiastic in advance of their user conference next week and thought they’d gotten permission they really hadn’t.
Beyond that, I’d note that the Netezza quote made reference to around 25 heavy analytical users, while the Teradata quote talked of 8000 people across more than 2000 suppliers.
Categories: Data warehouse appliances, Data warehousing, Memory-centric data management, Netezza, Oracle, Specific users, Teradata | 2 Comments |
Introduction to Aster Data and nCluster
I’ve been writing a lot about Greenplum since a recent visit. But on the same trip I met with Aster Data, and have talked with them further since. Let me now redress the balance and outline some highlights of the Aster Data story.
Categories: Analytic technologies, Aster Data, Data warehousing, Parallelization, Specific users | 4 Comments |
Greenplum’s single biggest customer
Greenplum offered a bit of clarification regarding the usage figures I posted last night. Everything on the list is in production, except that:
- One Greenplum customer is at 400 terabytes now, and upgrading to >1 petabyte “as we speak.”
- Greenplum’s other soon-to-be >1 petabyte customer isn’t in production yet. (Greenplum previously told me that customer was in the process of loading data.)
Categories: Data warehousing, Fox and MySpace, Greenplum, Petabyte-scale data management, Specific users | 3 Comments |
Project Cassandra — Facebook’s open-sourced quasi-DBMS
Edit: I posted much fresher information about Cassandra in July, 2010.
Facebook has open-sourced Project Cassandra, an imitation of Google’s BigTable. Actual public information about Facebook’s Cassandra seems to reside in a few links on the Cassandra project’s Google Code page, and all the discussion I’ve seen appears to be based solely on some slides from a SIGMOD presentation. In particular, Dare Obasanjo offers an excellent overview of Cassandra. To wit: Read more
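For orientation, the early write-ups describe a BigTable-style data model: a sparse map from row keys to column families to columns, with each value carrying a timestamp. Here is a toy sketch of that structure in Python; it is illustrative only, and is not Cassandra’s actual API, class names, or storage engine:

```python
import time
from collections import defaultdict

# Toy version of a BigTable/Cassandra-style data model, for illustration only:
# column family -> row key -> column name -> (value, timestamp),
# with columns sparse per row (different rows can have different columns).
class ToyColumnFamilyStore:
    def __init__(self):
        self.keyspace = defaultdict(lambda: defaultdict(dict))

    def insert(self, column_family, row_key, column, value):
        self.keyspace[column_family][row_key][column] = (value, time.time())

    def get(self, column_family, row_key, column):
        value, _timestamp = self.keyspace[column_family][row_key][column]
        return value

store = ToyColumnFamilyStore()
store.insert("Users", "user42", "name", "Alice")
store.insert("Users", "user42", "email", "alice@example.com")
print(store.get("Users", "user42", "email"))   # alice@example.com
```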
Categories: Cassandra, Cloud computing, Data models and architecture, Facebook, NoSQL | 11 Comments |