Growth in machine-generated data
In one of my favorite posts, namely When I am a VC Overlord, I wrote:
I will not fund any entrepreneur who mentions “market projections” in other than ironic terms. Nobody who talks of market projections with a straight face should be trusted.
Even so, today I got talked into putting on the record a prediction that machine-generated data will grow at more than 40% per year for a while.
My reasons for this opinion are little more than:
- Moore’s Law suggests that the same expenditure will buy 40% or so more machine-generated data each year.
- Budgets spent on producing machine-generated data seem to be going up.
I was referring to the creation of such data, but the growth rates of new creation and of persistent storage are likely, at least at this back-of-the-envelope level, to be similar.
Anecdotal evidence actually suggests 50-60%+ growth rates, so >40% seemed like a responsible claim.
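For concreteness, the compounding behind those figures can be spelled out in a few lines of Python. This is just an illustrative sketch: the five-year horizon is arbitrary, and a 40% annual rate is roughly the Moore's Law cadence of doubling about every two years.

```python
import math

# Back-of-envelope arithmetic for the growth figures above.
# 40%/year is roughly "doubling every two years"; 50-60%/year compounds much faster.

def doubling_time(annual_rate):
    """Years needed to double at a given compound annual growth rate."""
    return math.log(2) / math.log(1 + annual_rate)

def growth_factor(annual_rate, years):
    """Total multiple after compounding for the given number of years."""
    return (1 + annual_rate) ** years

for rate in (0.40, 0.50, 0.60):
    print(f"{rate:.0%}/year: doubles every {doubling_time(rate):.1f} years, "
          f"{growth_factor(rate, 5):.1f}x in 5 years")
```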
Related links
- My recent survey of machine-generated data topics started with a list of many different kinds of the stuff.
- My 2009 post on data warehouse volume growth makes similar points, and notes that high growth rates mean we likely can never afford to keep all machine-generated data permanently (the sketch after this list makes that arithmetic concrete).
- My 2011 claim that traditional databases will migrate into RAM is sort of this argument’s flipside.
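To make the affordability argument concrete, here is a minimal Python sketch. The 50% growth and 30% per-year price-decline figures are illustrative assumptions, not numbers from the posts above, and the model naively prices all retained data at the current cost per GB. The point is only that when data creation grows faster than storage prices fall, the bill for keeping everything rises every year.

```python
# Illustrative sketch of the "can't keep it all" argument. The rates are
# assumptions for illustration, not figures from the linked posts:
# new data grows 50%/year while cost per GB falls 30%/year.

DATA_GROWTH = 0.50      # assumed annual growth in newly created data
PRICE_DECLINE = 0.30    # assumed annual decline in storage cost per GB

new_data_gb = 1_000.0   # arbitrary starting volume of new data per year
price_per_gb = 0.10     # arbitrary starting price (dollars per GB)
retained_gb = 0.0       # cumulative data kept so far

for year in range(1, 6):
    retained_gb += new_data_gb
    # Simplification: all retained data is priced at this year's cost per GB.
    annual_bill = retained_gb * price_per_gb
    print(f"year {year}: retain {retained_gb:,.0f} GB, storage bill ${annual_bill:,.0f}")
    new_data_gb *= 1 + DATA_GROWTH
    price_per_gb *= 1 - PRICE_DECLINE
```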
Comments
I am wondering if all this data is valuable enough to be stored. For example, it might be hard to justify storing temperature sensor data at one-minute resolution for more than a few weeks.
In other words, I am not sure that growth in the amount of data produced will be reflected in growth in the amount of data stored and analyzed.
David,
It is very unlikely to all be stored. We couldn't pay for storing it all today. As storage gets cheaper (Moore's Law/Kryder's Law), volumes will increase further (Moore's Law/the subject of this post). So if we can't afford to keep everything now, we also won't be able to afford to do so in the future.
That said, one-minute temperature readings aren't the best example, because those don't really take up much volume.
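To put a rough number on that, here is an illustrative calculation; the 16-byte record size is an assumption, not anything stated in the thread.

```python
# Illustrative back-of-envelope for one-minute temperature readings.
# The 16-byte record size (timestamp + sensor id + value) is an assumption.

READINGS_PER_YEAR = 60 * 24 * 365        # one reading per minute
BYTES_PER_READING = 16                   # assumed uncompressed record size

per_sensor_mb = READINGS_PER_YEAR * BYTES_PER_READING / 1_000_000
print(f"{READINGS_PER_YEAR:,} readings/year ~ {per_sensor_mb:.1f} MB per sensor per year")

# Even a fleet of 100,000 such sensors stays under a terabyte per year.
fleet_tb = per_sensor_mb * 100_000 / 1_000_000
print(f"100,000 sensors ~ {fleet_tb:.2f} TB per year")
```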
I wouldn’t doubt the 40 percent growth rate, considering the number of connected machines now generating data that was not previously collected. I agree with David’s question: if the data isn’t stored for future analysis, how are you leveraging its value? Is it immediately analyzed? Are summaries created and stored from the collected data?
Peter Fretty, IDG blogger posting on behalf of SAS
One factor is affordability. Storage gets cheaper every year, and we need to find a way to utilize it.
Peter,
If the data isn’t all being stored, then summaries, highlights and/or samples surely should be.
Event detection is one term I’ve heard used in that connection. Another is data reduction, which is a different sense of the term than “choose the most useful variables on which to base a predictive model”.
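As a purely hypothetical illustration of what such data reduction might look like in code (the jump threshold and the hourly summary granularity are made-up parameters, not anything from the post): keep a coarse summary of every reading, and keep the raw readings only around threshold-crossing events.

```python
from statistics import mean

# Hypothetical sketch of data reduction for a stream of (minute, value) readings:
# store a coarse summary of everything, and raw detail only around "events".
# The 2.0-degree jump threshold and hourly summaries are made-up parameters.

EVENT_THRESHOLD = 2.0   # assumed: a jump this large between readings is an "event"

def reduce_readings(readings):
    """Return (hourly_summaries, event_readings) for a list of (minute, value) pairs."""
    hourly, events = {}, []
    previous = None
    for minute, value in readings:
        hourly.setdefault(minute // 60, []).append(value)
        if previous is not None and abs(value - previous) >= EVENT_THRESHOLD:
            events.append((minute, value))          # keep the raw reading
        previous = value
    summaries = {hour: (min(vals), mean(vals), max(vals)) for hour, vals in hourly.items()}
    return summaries, events

# Usage: two hours of readings with one sudden jump at minute 90.
readings = [(m, 20.0 + (3.0 if m == 90 else 0.0)) for m in range(120)]
summaries, events = reduce_readings(readings)
print(events)        # [(90, 23.0), (91, 20.0)]: the jump up and back down
print(summaries[1])  # (min, mean, max) for the second hour
```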
It would be nice to be able to define a “value per GB” measure for different types of data. Graphing such values together with storage prices would let us predict what types of data will be stored in the future.