March 23, 2011
DataStax introduces a Cassandra-based Hadoop distribution called Brisk
Cassandra company DataStax is introducing a Hadoop distribution called Brisk, for use cases that combine short-request and analytic processing. Brisk in essence replaces HDFS (Hadoop Distributed File System) with a Cassandra-based file system called CassandraFS. The whole thing is due to be released (Apache open source) within the next 45 days.
The core claims for Cassandra/Brisk/CassandraFS are:
- CassandraFS has the same interface as HDFS. So, in particular, you should be able to use most Hadoop add-ons with Brisk.
- CassandraFS has comparable performance to HDFS on sequential scans. That’s without predicate pushdown to Cassandra, which is Coming Soon but won’t be in the first Brisk release.
- Brisk/CassandraFS is much easier to administer than HDFS. In particular, there are no NameNodes, JobTracker single points of failure, or any other form of head node. Brisk/CassandraFS is strictly peer-to-peer.
- Cassandra is far superior to HBase for short-request use cases, specifically with 5-6X the random-access performance.
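To make the "same interface" claim concrete, here is a minimal sketch of why add-ons written against a file system interface shouldn't care what sits underneath. Everything in it (`FileSystem`, `HeadNodeFS`, `PeerToPeerFS`, `word_count`) is a hypothetical toy of mine, not Hadoop's or Brisk's actual API, and it says nothing about how CassandraFS is really built.

```python
from abc import ABC, abstractmethod
from collections import Counter
from typing import Dict, Iterator, List


class FileSystem(ABC):
    """Hypothetical stand-in for the storage interface a job codes against."""

    @abstractmethod
    def write(self, path: str, data: str) -> None: ...

    @abstractmethod
    def read_lines(self, path: str) -> Iterator[str]: ...


class HeadNodeFS(FileSystem):
    """Toy HDFS-like backend: one central table (the 'head node') owns all paths."""

    def __init__(self) -> None:
        self._head_node: Dict[str, str] = {}

    def write(self, path: str, data: str) -> None:
        self._head_node[path] = data

    def read_lines(self, path: str) -> Iterator[str]:
        return iter(self._head_node[path].splitlines())


class PeerToPeerFS(FileSystem):
    """Toy peer-to-peer backend: each path is hashed to one of several equal peers."""

    def __init__(self, peers: int = 3) -> None:
        self._peers: List[Dict[str, str]] = [dict() for _ in range(peers)]

    def _owner(self, path: str) -> Dict[str, str]:
        # Deterministic placement; no peer is special (assumption for illustration).
        return self._peers[sum(path.encode()) % len(self._peers)]

    def write(self, path: str, data: str) -> None:
        self._owner(path)[path] = data

    def read_lines(self, path: str) -> Iterator[str]:
        return iter(self._owner(path)[path].splitlines())


def word_count(fs: FileSystem, path: str) -> Counter:
    """A 'job' that only knows the interface, so it runs unchanged on either backend."""
    counts: Counter = Counter()
    for line in fs.read_lines(path):
        counts.update(line.split())
    return counts


if __name__ == "__main__":
    for fs in (HeadNodeFS(), PeerToPeerFS()):
        fs.write("/logs/a.txt", "brisk hadoop\nbrisk cassandra")
        print(type(fs).__name__, dict(word_count(fs, "/logs/a.txt")))
```

The point of the sketch is only that `word_count` never changes when the backend does, which is the property that would let existing Hadoop add-ons run on Brisk.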
There’s a pretty good white paper around all this, which also recites general Cassandra claims — [edit] and here at last is the link.
Categories: Cassandra, DataStax, Hadoop, HBase, MapReduce, Open source
Comments
3 Responses to “DataStax introduces a Cassandra-based Hadoop distribution called Brisk”
[…] DataStax thinks you should blend HDFS and Cassandra. […]
In my understanding, JobTracker is still needed, since it is part of MapReduce and not HDFS.
What would be very interesting is to understand the architecture of CassandraFS, and how Brisk ensures good data locality for processing.
Hadoop MapReduce has serious per-task overhead, so objects should be big for efficient processing. I am curious how Brisk with CassandraFS solves this, since Cassandra is built to store many small objects, not a few large ones.
[…] came out with a Hadoop-on-Cassandra offering called Brisk. For a while, it sounded as if Hadoop was as big a focus for DataStax as Cassandra […]