February 11, 2008

eBay is over 5 petabytes now

Single largest database >1.4 petabytes.

From Oliver Ratzesberger’s LinkedIn profile:

Our systems process in excess of 10 billion records per day, serving thousands of users and delivering hundreds of millions of queries per month in a true global 24×7 operation with distributed teams around the globe on systems over 5 PB in size (largest single system >1.4PB).

January 25, 2008

A high write-volume MySQL user

Spinn3r crawls and indexes blogs. It says it covers 1 million blogs and 25K posts/hour, doing thousands of write transactions per second. And it does this into federated MySQL — but with a lot of software built on top. To wit: Read more

October 19, 2007

One Greenplum customer — 35 terabytes and growing fast

I was at the Business Objects conference this week, and as usual went to very few sessions. But one I did stroll into was on “Managing Rapid Growth With the Right BI Strategy.” This was by Reliance Telecommunications, an outfit in India that is adding telecom subscribers very quickly, and consequently banging 100-150 gigs of data per day into a 35 terabyte warehouse.

The beginning of the talk astonished me, as the presenter seemed to be saying they were doing all this on Oracle. Hah. Oracle is what they moved away from; instead, they got Greenplum. I couldn’t get details; indeed, as a BI guy he was far enough away from DBMS to misspeak and say that Greenplum was brought in by ‘HP’, before quickly correcting himself when prompted. Read more

October 9, 2007

Marketing versus reality on the one-petabyte barrier

Usually, I don’t engage in the kind of high-speed quick-response blogging I have over the past couple of days from the Teradata Partners conference (and more generally have for the past week or so). And I’m not sure it’s working out so well.

For example, the claim that Teradata has surpassed the one-petabyte mark comes as quite a surprise to a variety of Teradata folks, not to mention at least one reliable outside anonymous correspondent. That claim may indeed be true about raw disk space on systems sold. But the real current upper limit, according to CTO Todd Walter,* is 500-700 terabytes of user data. He thinks half a dozen or so customers are in that range. I’d guess quite strongly that three of those are Wal-Mart, eBay, and an unspecified US intelligence agency.

*Teradata seems to have quite a few CTOs. But I’ve seen things much sillier than that in the titles department, and accordingly shan’t scoff further — at least on that particular subject. 😉

On the other hand, if anybody did want to buy a 10 petabyte system, Teradata could ship them one. And by the way, the Teradata people insist Sybase’s claims in the petabyte area are quite bogus. Teradata claims to have had bigger internal systems tested earlier than the one Sybase writes about.

October 8, 2007

Teradata apparently has crossed the petabyte barrier

According to a hurried conversation I had with Chief Marketing Officer Darryl MacDonald, Teradata has customers with over 1 petabyte of user data in a single instance. He wouldn’t disclose any names, but I’d guess one is eBay, which he did confirm is a customer. The intelligence area is another one where I’d speculate there are Very Large Databases.

However, since Darryl mentioned testing systems internally up to 4 petabytes, I’d guess the upper limit of Teradata deployments is in the 1-2 petabyte range.

EDIT: I’m now guessing that Teradata’s largest classified database — which previously was the largest overall — isn’t much over a petabyte in size. And there’s a strong chance this is larger than any unclassified one.

Update: That wasn’t really 1+ petabyte of user data.


August 8, 2006

eBay’s version of DBMS2

Every sufficiently large or agile enterprise needs to follow the DBMS2 approach. The following is from an article on eBay’s version:

“eBay has built a software-based Integration Tier. This contains both a data access layer (DAL) and a services framework. The Integration Tier acts as an abstraction layer for software engineers to work with many disparate back-end data sources through a consistent set of abstractions.”

July 25, 2006

Amazon’s version of DBMS2

Last year, I pointed out that Amazon has a highly diversified DBMS strategy. Now Mike Vizard has a great interview with Werner Vogels, Amazon’s CTO, where he unearths a lot more detail. And it turns out that Amazon has been a hardcore adopter of DBMS2, since long before DBMS2 was named.

October 10, 2005

The Amazon.com bookstore is a huge, modern OLTP app. So is it relational?

I don’t know for a fact that the Amazon.com bookstore is the world’s biggest OLTP application — but if it isn’t, it’s close.

And the thing is — that’s never been an entirely relational application. Oh, the ordering part surely is. But the inventory lookup is currently driven by an OODBMS (from Progress). The personalization used to be done in Red Brick (I knew which software replaced it, but I’m forgetting at the moment — it may even be one of the relational warehouse appliance vendors). And of course the full-text search is a custom in-house system.
