Predictive modeling and advanced analytics

Discussion of technologies and vendors in the overlapping areas of predictive analytics, predictive modeling, data mining, machine learning, Monte Carlo analysis, and other “advanced” analytics.

November 28, 2011

Agile predictive analytics – the heart of the matter

I’ve already suggested that several apparent issues in predictive analytic agility can be dismissed by straightforwardly applying best-of-breed technology, for example in analytic data management. At first blush, the same could be said about the actual analysis, which comprises:

Numerous statistical software vendors (or open source projects) help you with the second part; some make strong claims in the first area as well (e.g., my clients at KXEN). Even so, large enterprises typically have statistical silos, commonly featuring expensive annual SAS licenses and seemingly slow-moving SAS programmers.

As I see it, the predictive analytics workflow goes something like this:

Read more

November 28, 2011

Agile predictive analytics — the “easy” parts

I’m hearing a lot these days about agile predictive analytics, albeit rarely in those exact terms. The general idea is unassailable, in that it boils down to using data as quickly as reasonably possible. But discussing particulars is hard, for several reasons:

At least three of the generic arguments for agility apply to predictive analytics:

But the reasons to want agile predictive analytics don’t stop there.

Read more

November 12, 2011

Clarifying SAND’s customer metrics, positioning and technical story

Talking with my clients at SAND can be confusing. That said:

A few months ago, I wrote:

SAND Technology reported >600 total customers, including >100 direct.

Upon talking with the company, I need to revise that figure downward, from > 600 to 15.

Read more

November 8, 2011

Terminology: Operational analytics

It’s time for me to try to define “operational analytics”. Clues pointing me to that need include:

But as in all definitional discussions, please remember that nothing concise is ever precise.

Activities I want to call “operational analytics” include but are not limited to (and some of these overlap):

Read more

November 2, 2011

The cool aspects of Odiago WibiData

Christophe Bisciglia and Aaron Kimball have a new company.

WibiData is designed for management of, investigative analytics on, and operational analytics on consumer internet data, the main examples of which are web site traffic and personalization and their analogues for games and/or mobile devices. The core WibiData technology, built on HBase and Hadoop,* is a data management and analytic execution layer. That’s where the secret sauce resides. Also included are:

The whole thing is in beta, with about three (paying) beta customers.

*And Avro and so on.

The core ideas of WibiData include:

Read more

October 14, 2011

Commercial software for academic use

As Jacek Becla explained:

Even so, I think that academic researchers, in the natural and social sciences alike, commonly overlook the wealth of commercial software that could help them in their efforts.

I further think that the commercial software industry could do a better job of exposing its work to academics, where by “expose” I mean:

Reasons to do so include:

Read more

September 22, 2011

Aster Database Release 5 and Teradata Aster appliance

It was obviously just a matter of time before there would be an Aster appliance from Teradata and some tuned bidirectional Teradata-Aster connectivity. These have now been announced. I didn’t notice anything particularly surprising in the details of either. About the biggest excitement is that Aster is traditionally a Red Hat shop, but for the purposes of appliance delivery has now embraced SUSE Linux.

Along with the announcements comes updated positioning such as:

and of course

Read more

September 20, 2011

XLDB: The one conference I like to attend

I’m not a big fan of conferences, but I really like XLDB. Last year I got a lot out of XLDB, even though I couldn’t stay long (my elder care issues were in full swing). The year before I attended the whole thing — in Lyon, France, no less — and learned a lot more. This year’s XLDB conference is at SLAC — the organization formerly known as the Stanford Linear Accelerator Center — on Sand Hill Road in Menlo Park, October 18-19. As of right now, I plan to be there, at least on the first day. XLDB’s agenda and registration details (inexpensive) can be found on the XLDB conference website.

The only reason I wouldn’t go is if that turned out to be a lousy week for me to travel to California.

The people who go to XLDB tend to be really smart: either research scientists, hardcore database technologists, or others who can hold their own with those folks. Audience participation can be intense; the most talkative members I can recall were Mike Stonebraker, Martin Kersten, Michael McIntire, and myself. Even the vendor folks tend to be smart; past examples include Stephen Brobst, Jeff Hammerbacher, Luke Lonergan, and IBM Fellow Laura Haas. When we had a datageek bash on my last trip to the SF area, several guys said they were planning to attend XLDB as well.

XLDB stands for eXtremely Large DataBases, and those are indeed what gets talked about there.

Read more

September 11, 2011

“Big data” has jumped the shark

I frequently observe that no market categorization is ever precise and, in particular, that bad jargon drives out good. But when it comes to “big data” or “big data analytics”, matters are worse yet. The definitive shark-jumping moment may be Forrester Research’s Brian Hopkins’ claim that:

… typical data warehouse appliances, even if they are petascale and parallel, [are] NOT big data solutions.

Nonsense almost as bad can be found in other venues.

Forrester seems to claim that “big data” is characterized by Volume, Velocity, Variety, and Variability. Others, less alliteratively-inclined, might put Complexity in the mix. So far, so good; after all, much of what people call “big data” is collections of disparate data streams, all collected somewhere in a big bit bucket. But when people start defining “big data” to include Variety and/or Variability, they’ve gone too far.

Read more

July 5, 2011

Eight kinds of analytic database (Part 2)

In Part 1 of this two-part series, I outlined four variants on the traditional enterprise data warehouse/data mart dichotomy, and suggested what kinds of DBMS products you might use for each. In Part 2 I’ll cover four more kinds of analytic database. These are even newer, for the most part, with a use case/product short list match that is even less clear.

Read more
