Rocana’s world
For starters:
- My client Rocana is the renamed ScalingData; the new name is meant to signify ROot Cause ANAlysis.
- Rocana was founded by Omer Trajman, who I’ve referenced numerous times in the past, and who I gather is a former boss of cofounder Eric Sammer.
- Rocana recently told me it had 35 people.
- Rocana has a very small number of quite large customers.
Rocana portrays itself as offering next-generation IT operations monitoring software. As you might expect, this has two main use cases:
- Actual operations — figuring out exactly what isn’t working, ASAP.
- Security.
Rocana’s differentiation claims boil down to fast and accurate anomaly detection on large amounts of log data (a sketch of a unified event record follows this list), including but not limited to:
- The sort of network data you’d generally think of — “everything” except packet-inspection stuff.
- Firewall output.
- Database server logs.
- Point-of-sale data (at a retailer).
- “Application data”, whatever that means. (Edit: See Tom Yates’ clarifying comment below.)
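To make the “keep all of it, co-located” idea concrete, here is a minimal sketch of what a unified event record might look like. To be clear, the field names and the toy parser are my invention for illustration, not Rocana’s actual schema:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class Event:
    """A hypothetical normalized record for heterogeneous log sources."""
    ts: datetime            # when the event occurred
    source: str             # e.g. "firewall", "db_server", "pos", "application"
    host: str               # machine that emitted the event
    severity: str           # normalized level: "info", "warn", "error"
    body: str               # the raw log line, kept verbatim
    attributes: dict = field(default_factory=dict)  # source-specific extras

def normalize_syslog(line: str, host: str) -> Event:
    """Toy parser: map a syslog-ish line onto the common schema."""
    level, _, message = line.partition(": ")
    return Event(
        ts=datetime.now(timezone.utc),
        source="syslog",
        host=host,
        severity=level.lower(),
        body=line,
        attributes={"message": message},
    )

print(normalize_syslog("WARN: disk 87% full", host="web-03"))
```

The point of a schema like this is that firewall output, database logs, and point-of-sale data can all land in one queryable store, rather than in per-source silos.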
In line with segment leader Splunk’s pricing, data volumes in this area tend to be described in terms of new data/day. Rocana seems to start around 3 TB/day, which not coincidentally is a range that would generally be thought of as:
- Challenging for Splunk, and for the budgets of Splunk customers.
- Not a big problem for well-implemented Hadoop.
And so part of Rocana’s pitch, familiar to followers of analytic RDBMS and Hadoop alike, is “We keep and use all your data, unlike the legacy guys who make you throw some of it away up front.”
Since Rocana wants you to keep all your data, 3 TB/day is about 1 PB/year.
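For anyone who wants to check the arithmetic, or play with other daily volumes:

```python
# Back-of-envelope retention math, using decimal units (1 PB = 1,000 TB).
daily_tb = 3                          # new data per day
yearly_pb = daily_tb * 365 / 1000     # data retained per year
print(f"{yearly_pb:.2f} PB/year")     # ~1.10 PB/year, i.e. "about 1 PB"
```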
But really, that’s just saying that Rocana is an analytic stack built on Hadoop, using Hadoop for what people correctly think it’s well-suited for, done by guys who know a lot about Hadoop.
The cooler side of Rocana, to my tastes, is the actual analytics. Truth be told, I find almost any well-thought-out event-series analytics story cool. It’s an area much less mature than relational business intelligence, and accordingly one with much more scope for innovation. On the visualization side, crucial aspects start with:
- Charting over time (duh).
- Comparing widely disparate time intervals (e.g., current vs. historical/baseline; a sketch follows below).
- Whichever good features from relational BI apply to your use case as well.
Other important elements may be more data- or application-specific — and the fact that I don’t have a long list of particulars illustrates just how immature the area really is.
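To illustrate the current-vs.-baseline bullet above, here is a minimal pandas sketch. The data is synthetic, and an hour-of-day median is just one plausible choice of baseline:

```python
import numpy as np
import pandas as pd

# Synthetic event counts per hour for 30 days (fake data for illustration).
rng = np.random.default_rng(0)
idx = pd.date_range("2015-01-01", periods=30 * 24, freq="h")
counts = pd.Series(rng.poisson(100, len(idx)), index=idx, name="events")

# Baseline: the median count for each hour of day, over all of history.
baseline = counts.groupby(counts.index.hour).median()

# Compare the most recent day against that historical baseline.
today = counts.iloc[-24:]
comparison = pd.DataFrame({
    "current": today.values,
    "baseline": baseline.reindex(today.index.hour).values,
}, index=today.index)
print(comparison.head())
```

Charting the two columns of `comparison` on one time axis is exactly the kind of display the second bullet is about.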
Even cooler is Rocana’s integration of predictive modeling and BI, about which I previously remarked:
The idea goes something like this:
- Suppose we have lots of logs about lots of things. Machine learning can help:
  - Notice what’s an anomaly.
  - Group together things that seem to be experiencing similar anomalies.
- That can inform a BI-plus interface for a human to figure out what is happening.
Makes sense to me.
So far as I can tell, predictive modeling is used to notice aberrant data (raw or derived). This is quickly used to define a subset of data to drill down to (e.g., certain kinds of information from certain machines in a certain period of time). Event-series BI/visualization then lets you see the flows that led to the aberrant result, which with any luck will allow you to find the exact place where the data first goes wrong. And that, one hopes, is something that the ops guys can quickly fix.
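Here is a hedged sketch of that loop, minus the visualization step: flag aberrant points with a robust z-score, then drill down to the offending hosts and time window. The threshold, field names, and synthetic data are my guesses at a generic shape, not Rocana’s actual method:

```python
import numpy as np
import pandas as pd

# Fake per-minute error counts for three hosts, with a fault injected late.
rng = np.random.default_rng(1)
ts = pd.date_range("2015-01-01", periods=500, freq="min")
df = pd.DataFrame({
    "ts": np.tile(ts, 3),
    "host": np.repeat(["app-01", "app-02", "app-03"], len(ts)),
    "errors": rng.poisson(2, 3 * len(ts)),
})
df.loc[(df.host == "app-02") & (df.ts > ts[400]), "errors"] += 40

# Step 1: "predictive modeling", minimally: flag points whose deviation
# from the median exceeds 5x the median absolute deviation.
med = df["errors"].median()
mad = (df["errors"] - med).abs().median()
df["anomalous"] = (df["errors"] - med).abs() > 5 * mad

# Step 2: drill down -- which hosts, and what time window, to inspect?
hits = df[df["anomalous"]]
print(hits.groupby("host").size())               # app-02 stands out
print(hits["ts"].min(), "to", hits["ts"].max())  # the window to examine
```

From there, event-series visualization over just that host and window is what would, with luck, expose the root cause.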
I think similar approaches could make sense in numerous application segments.
Related links
- Rocana’s Hadoop stack presumably includes both Kafka and Spark Streaming.
- Back when Splunk still answered my email, I wrote about its inverted-list data management architecture.
- Ursula Le Guin’s debut novel Rocannon’s World has nothing to do with this post (although it does start with a really lousy bit of temporal analysis 🙂 ). I just like making allusions to her work.
Comments
One response, from Tom Yates:
Per “application data”: mostly we’re referring to log and metric data from applications. We think it is incredibly important to co-locate this data with infrastructure logs for IT Operations Analytics. There are tons of data points out there stating that large percentages (usually 30-60%) of application errors are caused by issues with the underlying infrastructure (e.g., firewall config changes). As we’ve discussed, it’s hard to correlate these when the data is in different systems…