Riptano, and Cassandra adoption
Tonight’s Cassandra technology post got plenty long enough on its own, so I’m separating out business and adoption issues here. For starters, known Cassandra users include:
- Facebook, which has said it has 150 or so Cassandra nodes (but see below)
- Twitter, which has said it has 45 or so Cassandra nodes
- Rackspace, which used to be Jonathan Ellis’ employer, and now is backing Cassandra company Riptano
- Digg, which along with Twitter and Rackspace was one of the three major users helping advance the Cassandra project
- OpenX, SimpleGeo, and Digital Reasoning, which Jonathan cited as production users in March
- Cloudkick, as noted and linked in my other post
- Two customers Riptano named at launch (but I’ve forgotten who they were*)
Fetlife, Meebo, and others seem to at least have a healthy interest in Cassandra, based on their level of involvement in a forthcoming Cassandra Summit. That said, the @Fetlife tweetstream features numerous yelps of pain, and I don’t mean the recreational kind.
*And I can’t easily find a launch press release, whether on the rather minimalist Riptano website or elsewhere.
Beyond that, when Riptano launched in May, the Riptano guys (mainly Jonathan Ellis) said:
- They were sure there were dozens of Cassandra user organizations, maybe even >100. But there weren’t hundreds.
- Maybe 20-40% of those Cassandra sites were in production. (But I don’t think I’d multiply that out to suggest there were, say, 35-50 production Cassandra users.)
- 4000 people were going daily to the Apache Cassandra site.
- There were 250 Cassandra downloads daily.
- Lots of startups were using Cassandra.
- Lots of other companies were looking at switching over to Cassandra.
- Many potential Cassandra users had been waiting for a Cassandra company to be available to support it.
- The median number of Cassandra (production?) nodes per user is probably 8-10; 4 would be a low-end figure.
That’s a lot of adoption for a not-even-Release-1 open source project. Even so, there’s a feeling going around that Cassandra has lost some momentum over the past couple of months. Most notably, Facebook, which created Cassandra in the first place, isn’t using it for new projects. True, I’m hearing even less evidence that any of Membase, Voldemort, VoltDB, Akiban, Clustrix, or Riak (for example) is setting the world on fire than I’m hearing for Cassandra. But the viable Cassandra alternatives are piling up. Cassandra isn’t the only or even the primary game in town, and for that matter I haven’t heard any concise description of a niche in which Cassandra is the unquestioned leader.
Edit: A/the Facebook project that continues to run on Cassandra is Inbox search.
As for Riptano itself:
- Riptano launched with two founders and immediately made an offer to a third guy. I don’t know how many folks they have now, two months later.
- Rackspace put some funding into Riptano.
- Riptano’s strategy sounds a lot like Cloudera’s, by which I mean:
  - Riptano’s business is all services, whether training, consulting, or support.
  - Riptano’s intended main business is obviously support.
  - Notwithstanding the above, Riptano intends to eventually offer proprietary software, bundled with its support services.
  - The first area of focus for that proprietary software is intended to be management tools.
  - I wouldn’t be surprised if, like Cloudera, Riptano tweaks its software focus from “stuff that lets us support you better” to “integration with stuff you pay for.” Those strategies are actually pretty similar.
Riptano seems to be starting out with support pricing around $1,000-$4,000/server/year, before quantity discounts.