Interesting trends in database and analytic technology
My project for the day is blogging based on my “Database and analytic technology: State of the union” talk of a few days ago. (I called it that because of when it was given, because it mixed prescriptive and descriptive elements, and because I wanted to call attention to the fact that I cover the union of database and analytic technologies – the intersection of those two sectors is an area of particular focus, but is far from the whole of my coverage.)
One section covered recent/ongoing/near-future trends that I thought were particularly interesting, including:
Simpler database technology, by which I mean DBMS that are:
- Easier to administer than market-leading systems …
- … even if at the cost of being special-purpose
- E.g.,
  - MySQL and older mid-tier RDBMS such as Progress
  - Many analytic DBMS and appliances, most notably Netezza’s
For general purpose or OLTP uses, I’m not a big fan of MySQL (not enough progress in making it industrial-strength), PostgreSQL (no good company behind it – I’m a non-fan of EnterpriseDB), or Ingres (open source or not, it’s an antiquated system that hasn’t been invested in as much as Oracle, DB2 or SQL Server).
But I get the impression there are a lot of contenders among small startups, featuring very new architectures for OLTP or general-purpose database management. VoltDB comes to mind. NimbusDB is finally within range of getting funded. Dan Weinreb told me Friday he knows of a bunch of others as well. And that’s all before we even get into the NoSQL kind of alternative.
Flexible storage architectures. That’s starting out with an emphasis on hybrid columnar, as in the examples of Vertica and Greenplum. Oracle (to whom I’m under no NDA obligation) and other vendors (to whom I am) are going that way as well.
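To make "hybrid columnar" a little more concrete, here is a minimal sketch of one common reading of the term: rows are split into groups, and each group is stored column by column. This is only an illustration, not a description of how Vertica, Greenplum, or Oracle actually lay out data; the function names, the `group_size` parameter, and the sample records are all hypothetical.

```python
from typing import Any

Row = dict[str, Any]
ColumnGroup = dict[str, list[Any]]


def to_row_groups(rows: list[Row], group_size: int) -> list[ColumnGroup]:
    """Split rows into fixed-size groups and store each group column-wise."""
    groups: list[ColumnGroup] = []
    for start in range(0, len(rows), group_size):
        chunk = rows[start:start + group_size]
        # Within a group, all values of one column sit together.
        groups.append({name: [r[name] for r in chunk] for name in chunk[0]})
    return groups


def sum_column(groups: list[ColumnGroup], name: str) -> int:
    """Analytic-style scan: touches only the one column it needs."""
    return sum(sum(group[name]) for group in groups)


def fetch_row(groups: list[ColumnGroup], group_size: int, index: int) -> Row:
    """Row-style lookup: all columns of a row live in the same group."""
    group = groups[index // group_size]
    offset = index % group_size
    return {name: values[offset] for name, values in group.items()}


# Hypothetical sample data.
rows = [
    {"id": i, "region": "east" if i % 2 else "west", "amount": i * 10}
    for i in range(6)
]
groups = to_row_groups(rows, group_size=4)

print(sum_column(groups, "amount"))
print(fetch_row(groups, 4, 5))
```

The trade-off the sketch tries to show: a column scan reads one list per group, while a single-row fetch still finds everything it needs inside one group.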
Multi-tier database architectures, by which I mean at least two things:
- The database tier/server tier split of Exadata
- Hybrid RAM/disk architectures, examples of which include
  - Vertica’s RAM-based write-optimized store
  - Sensage’s CEP-in-the-DBMS
  - This in-memory analytics stuff we keep hearing about from the BI vendors
  - Any true in-memory/disk hybrid, such as the regrettably sidelined solidDB
  - Smart thinking by numerous DBMS vendors about optimizing the use of RAM and/or Level 2 cache
Netezza is particularly interesting to watch in this regard because it:
- Had a pretty strict storage/other processing split in prior product generations and …
- … ditched that in its latest generation …
- … which however is focused on optimizing the use of RAM cache
Also noteworthy is Petascan, the stealth-mode – and therefore harder to watch right now 🙂 – company I keep teasing about, which makes a strong case for carrying the database/storage tier split into the flash/solid-state memory technology generation. Calpont also has a server/storage tier split, but that’s of mainly theoretical interest unless and until Calpont actually ships an MPP version of InfiniDB.
Cheaper parts, which have of course been a huge trend for decades. Solid-state memory will soon conquer the world. Meanwhile, cheaper sensors drive that machine-generated data I keep talking about.
An ever-better understanding of scale-out technology, in several respects, including:
- Query, notably data movement for MPP DBMS
- Update, especially minimalistic DBMS approaches, be they sharded MySQL or more NoSQLish
- Number-crunching, especially via MapReduce and/or parallel analytic libraries integrated into DBMS
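As a rough illustration of the update and number-crunching points above, here is a toy sketch of hash-based sharding for writes, plus a MapReduce-style aggregation over the shards. It is a single-process stand-in under stated assumptions, not how sharded MySQL or any particular NoSQL or MPP system works; `NUM_SHARDS`, the key format, and the record fields are all hypothetical.

```python
import zlib
from collections import defaultdict

NUM_SHARDS = 4  # hypothetical shard count

# Each shard is just an in-memory dict standing in for a separate node.
shards = [dict() for _ in range(NUM_SHARDS)]


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Route a key to a shard with a stable hash (CRC32 here)."""
    return zlib.crc32(key.encode("utf-8")) % num_shards


def put(key: str, record: dict) -> None:
    """Update path: each write touches exactly one shard."""
    shards[shard_for(key)][key] = record


def map_phase(shard: dict) -> dict:
    """Per-shard partial aggregate, computed where the data lives."""
    partial = defaultdict(int)
    for record in shard.values():
        partial[record["region"]] += record["amount"]
    return dict(partial)


def reduce_phase(partials) -> dict:
    """Merge the small per-shard partials into one final result."""
    totals = defaultdict(int)
    for partial in partials:
        for region, amount in partial.items():
            totals[region] += amount
    return dict(totals)


# Hypothetical sample data.
for i in range(10):
    put(f"order-{i}", {"region": "east" if i % 2 else "west", "amount": i})

totals = reduce_phase(map_phase(shard) for shard in shards)
print(totals)
```

Only the small per-region partials cross shard boundaries in this sketch, which is the same data-movement concern the query bullet raises for MPP DBMS.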
Cool trends I touched on more briefly include:
- More data being available for analysis. This was a core theme of my Enzee Universe keynote speeches; there are also some notes on it in my post based on my Boston Big Data Summit talk.
- More users being served by analytics. Ditto.
- Data exploration/visualization, à la QlikView, Spotfire, or Tableau, and also the faceted stuff.
- The democratization of data mining. But I’m not as sure of that one as of the others…
One area I flat-out forgot to mention is easy data mart spin-out.
Other posts based on my January, 2010 New England Database Summit keynote address
- Data-based snooping — a huge threat to liberty that we’re all helping make worse
- Flash, other solid-state memory, and disk
- Open issues in database and analytic technology
Comments
9 Responses to “Interesting trends in database and analytic technology”
Awwww … what do you mean by “no good company behind PostgreSQL”?
There are *a lot* of good companies behind it, not just a single huge good/bad/whatever multinational.
And, of course, a massive open source community 🙂
Regarding new OLTP architectures, is Akiba one of the new startups you refer to? I have heard of them, but not much info out there about them…
@DR,
Yes. Me too. 🙂
NTT, the largest telco company in Japan, is using PostgreSQL. They have a dedicated development team for applications using PostgreSQL. I mean, they are a real telco enterprise using an open source DBMS, and it works.
Farid,
I’m sure they can do many things in PostgreSQL, and I’m glad they are.
However, I suspect there are quite a few that they aren’t and can’t.
Thoughts?