Clarifying the state of MPP in-database SAS
I am routinely briefed way in advance of products’ introductions. For that reason and others, it can be hard for me to keep straight what’s been officially announced, introduced for test, introduced for general availability, vaguely planned for the indefinite future, and so on. Perhaps nothing has confused me more in that regard than SAS Institute’s multi-year effort to get SAS integrated into various MPP DBMS, specifically Teradata, Netezza TwinFin(i), and Aster Data nCluster.
However, I chatted briefly Thursday with Michelle Wilkie, the SAS product manager overseeing all this (and also some other stuff, like SAS running on grids without being integrated into a DBMS). As best I understood it, the story is: Read more
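Pending the details behind that link, the general idea of in-database analytics is easy to illustrate: instead of extracting rows to a SAS server, a model’s scoring function is compiled into something the DBMS can execute in parallel, next to the data. Here’s a minimal sketch of that idea (sqlite3 standing in for an MPP DBMS, with an invented linear model; nothing here reflects SAS’s actual implementation):

```python
# Sketch of the in-database scoring idea: push the model to the data,
# rather than pulling the data to the analytics engine.
# sqlite3 stands in for an MPP DBMS; the model and column names are invented.
import sqlite3

# A toy "trained model": intercept plus per-column coefficients.
model = {"intercept": 0.5, "age": 0.02, "balance": 0.0001}

def scoring_sql(model, table):
    """Compile the model into a SQL expression the DBMS executes in place."""
    terms = " + ".join(f"{coef} * {col}" for col, coef in model.items()
                       if col != "intercept")
    return f"SELECT id, {model['intercept']} + {terms} AS score FROM {table}"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, age REAL, balance REAL)")
conn.executemany("INSERT INTO customers VALUES (?, ?, ?)",
                 [(1, 35, 12000.0), (2, 52, 300.0)])

for row in conn.execute(scoring_sql(model, "customers")):
    print(row)  # ~ (1, 2.4) and (2, 1.57): scores computed inside the database
```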
Categories: Aster Data, Data warehouse appliances, MapReduce, Netezza, Parallelization, Predictive modeling and advanced analytics, SAS Institute, Specific users, Teradata | 11 Comments |
Vertica update
Last month, Vertica’s CEO Ralph Breslauer quit,* and Vertica made it sound as if there would be a new CEO late in April. And indeed, as of April 29, there was: Chris Lynch, a guy I’d never heard of before, but apparently quite the sales-machine builder. The most substance I’ve found is a pair of Mass High Tech articles — the latter exceedingly typo-ridden — to the general effect that:
- Vertica plans to build a massive, world-conquering sales force.
- If Vertica dips back into negative cash flow to do that and has to raise more venture capital, so be it.
- “Triple-digit” revenue growth is expected for this year.
Greenplum Chorus and Greenplum 4.0
Greenplum is making two product announcements this morning. Greenplum 4.0 is a revision of the core Greenplum database technology. In addition, Greenplum is announcing Greenplum Chorus, which is the first product release instantiating last year’s EDC (Enterprise Data Cloud) vision statement and marketing campaign.
Greenplum 4.0 highlights and related observations include: Read more
Information found in public-facing social networks
Here are some examples illustrating two recent themes of mine, namely:
- Easily-available information reveals all sorts of things about us.
- Graph-based analysis is on the rise.
Pete Warden scraped all of Facebook’s social graph (at least for the United States) and put up a really interesting-looking visualization of same. Facebook’s lawyers came down on him, and he quickly agreed to destroy the data he’d scraped, but he also published ideas on how other people could duplicate his work.
Warden has since given an interview in which he outlines some of the things researchers hoped to do with this data: Read more
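For flavor, here’s the kind of analysis a scraped social graph invites. A minimal sketch using networkx, with invented names and friendships:

```python
# A taste of graph-based analysis over a tiny, invented social graph.
# Assumes the networkx library is installed (pip install networkx).
import networkx as nx

friendships = [("alice", "bob"), ("bob", "carol"), ("carol", "alice"),
               ("dave", "erin")]  # two separate communities

g = nx.Graph()
g.add_edges_from(friendships)

# Who is most connected? Degree centrality is the simplest measure.
print(nx.degree_centrality(g))

# How does the population cluster? Connected components are a crude first cut;
# researchers typically go on to community detection, homophily measures, etc.
for component in nx.connected_components(g):
    print(sorted(component))
```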
Categories: Analytic technologies, Facebook, RDF and graphs, Surveillance and privacy | 1 Comment |
Quick news, links, comments, etc.
Some notes based on what I’ve been reading recently: Read more
Some business trends in the data warehouse market
In recent conversations with various analytic DBMS vendors, a fairly consistent picture has emerged.
- Business is strong. Multiple vendors claim to be going gangbusters, with the happy sounds coming out of Vertica and Infobright being echoed by several competitors. Hearsay suggests some other companies in related businesses are doing well too. Depending on who you talk to, the business pickup dates back to Q4, give or take a quarter.
- Oracle Exadata has become a formidable competitor, on the strength of Exadata 2. Exadata 2’s positioning and perception among Oracle users seem to be pretty much in line with what Oracle portrayed to me.
- Teradata is portrayed as a weak competitor. Competitors don’t worry about Teradata nearly as much as they do about Oracle. That said, I suspect a bit of wishful thinking; Teradata is clearly still getting a lot of business the other vendors would dearly love to have.
- HP Neoview is reeling. (Almost) nobody sees Neoview competitively. The Walmart Neoview installation is said to have stayed small at best. JPMorgan Chase is said to have completely thrown Neoview out (and a bunch of HP engineers with it).
- (Almost) nobody mentions competing against DB2 either. This continues to baffle me.
Categories: Analytic technologies, Data warehousing, Exadata, HP and Neoview, IBM and DB2, JPMorgan Chase, Market share and customer counts, Oracle, Teradata | 4 Comments |
Cassandra and the NoSQL scalable OLTP argument
Todd Hoff put up a provocative post on High Scalability called “MySQL and Memcached: End of an Era?” The post itself focuses on observations like:
- Facebook invented and is adopting Cassandra.
- Twitter is adopting Cassandra.
- Digg is adopting Cassandra.
- LinkedIn invented and is adopting Voldemort.
- Gee, it seems as if the super-scalable website biz has moved beyond MySQL/Memcached.
But in addition, he provides a lot of useful links, which DBMS-oriented folks such as myself might have previously overlooked. Read more
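For readers who haven’t lived it, the “MySQL/Memcached era” pattern under discussion is cache-aside: check the cache, fall back to the database on a miss, and repopulate the cache on the way out. A minimal sketch, with a dict standing in for memcached and a stub standing in for MySQL:

```python
# The classic MySQL/Memcached cache-aside pattern, in miniature.
# A dict stands in for memcached; query_mysql is a stub for the real database.
import time

cache = {}            # key -> (value, expires_at)
TTL_SECONDS = 60

def query_mysql(user_id):
    # Stand-in for SELECT ... FROM users WHERE id = ?
    return {"id": user_id, "name": f"user{user_id}"}

def get_user(user_id):
    key = f"user:{user_id}"
    hit = cache.get(key)
    if hit and hit[1] > time.time():      # cache hit, still fresh
        return hit[0]
    value = query_mysql(user_id)          # cache miss: go to the database
    cache[key] = (value, time.time() + TTL_SECONDS)
    return value

print(get_user(42))  # miss -> hits "MySQL", populates the cache
print(get_user(42))  # hit  -> served from the cache
```

The complaints the Cassandra and Voldemort adopters voice are, roughly, that this pattern leaves all writes on a single (or hand-sharded) MySQL tier and creates cache-invalidation headaches, whereas the newer stores distribute writes natively.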
Categories: Cassandra, Data models and architecture, NoSQL, OLTP, Open source, Parallelization, Specific users, Theory and architecture | 16 Comments |
Teradata’s nebulous cloud strategy
As the pun goes, Teradata’s cloud strategy is – well, it’s somewhat nebulous. More precisely, for the foreseeable future, Teradata’s cloud strategy is a collection of rather disjointed parts, including:
- What Teradata calls the Teradata Agile Analytics Cloud, which is a combination of previously existing technology plus one new portlet called the Teradata Elastic Mart(s) Builder. (Teradata’s Elastic Mart(s) Builder Viewpoint portlet is available for download from Teradata’s Developer Exchange.)
- Teradata Data Mover 2.0, coming “Soon”, which will ease copying (ETL without any significant “T”) from one Teradata system to another.
- Teradata Express DBMS crippleware (1 terabyte only, no production use), now available on Amazon EC2 and VMware. (I don’t see where this has much connection to the rest of Teradata’s cloud strategy, except insofar as it serves to fill out a slide.)
- Unannounced (and so far as I can tell largely undesigned) future products.
Teradata openly admits that its direction is heavily influenced by Oliver Ratzesberger at eBay. Like Teradata, Oliver and eBay favor virtual data marts over physical ones. That is, Oliver and eBay believe the ideal scenario is that every piece of data is stored only once, in an integrated Teradata warehouse. But eBay believes, and Teradata increasingly agrees, that users need a great deal of control over their use of this data, including the ability to import additional data into private sandboxes and join it to the warehouse data already there. Read more
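To make the sandbox idea concrete, here’s a minimal sketch (sqlite3 standing in for Teradata; all table and column names invented): the analyst loads private data into her own sandbox and joins it, inside the database, to the single shared copy of the warehouse data.

```python
# Sketch of a "private sandbox joined to the warehouse" -- sqlite3 stands in
# for Teradata, and all table/column names are invented.
import sqlite3

conn = sqlite3.connect(":memory:")

# The shared, integrated warehouse table (one copy, centrally managed).
conn.execute("CREATE TABLE warehouse_sales (customer_id INTEGER, revenue REAL)")
conn.executemany("INSERT INTO warehouse_sales VALUES (?, ?)",
                 [(1, 100.0), (2, 250.0)])

# The analyst's private sandbox: extra data she imported herself.
conn.execute("CREATE TABLE sandbox_targets (customer_id INTEGER, segment TEXT)")
conn.executemany("INSERT INTO sandbox_targets VALUES (?, ?)",
                 [(1, "churn-risk"), (2, "upsell")])

# The join happens inside the database; the warehouse data is never copied out.
query = """SELECT s.segment, SUM(w.revenue)
           FROM warehouse_sales w JOIN sandbox_targets s
             ON w.customer_id = s.customer_id
           GROUP BY s.segment"""
for row in conn.execute(query):
    print(row)
```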
Categories: Analytic technologies, Cloud computing, Data integration and middleware, Data warehousing, EAI, EII, ETL, ELT, ETLT, eBay, Teradata, Theory and architecture | 5 Comments |
General introduction to Splunk
I dropped by log analysis software vendor Splunk a few weeks ago for a chat with Marketing VP Steve Sommer (whom some of you may know from Cognos and/or Informix), Product Management VP Christina Noren, and above all co-founder/CTO Erik Swan. Splunk turns out to be a pretty interesting company, from both business and technical standpoints. For one thing, Splunk seems highly regarded by most people I mention it to.
Splunk’s technical stories include:
- Text search over log files.
- Business intelligence over text search. (That part sounds a lot like Attivio.)
- MapReduce with schema flexibility and smart multi-stage execution plans. (That part sounds a lot like Aster Data.)
More on those in a separate post.
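In the meantime, here’s a rough illustration of the third point. “Schema flexibility” in this context generally means schema-on-read: fields are extracted from raw log lines at query time, and the aggregation is a map/reduce over the extracted fields. A minimal sketch, with an invented log format (my illustration, not Splunk’s actual engine):

```python
# Schema-on-read over raw log lines, then a map/reduce-style aggregation.
# Invented log format; illustrative only -- not Splunk's actual engine.
import re
from collections import Counter

raw_logs = [
    "2010-05-06 12:00:01 status=200 user=alice",
    "2010-05-06 12:00:02 status=500 user=bob",
    "2010-05-06 12:00:03 status=200 user=alice",
]

FIELD = re.compile(r"(\w+)=(\S+)")

def extract(line):
    """Map step: impose structure at read time (no schema at load time)."""
    return dict(FIELD.findall(line))

def count_by(field, lines):
    """Reduce step: aggregate the extracted field across all events.
    A real engine runs the map phase in parallel over shards of the logs."""
    return Counter(extract(line).get(field) for line in lines)

print(count_by("status", raw_logs))  # Counter({'200': 2, '500': 1})
```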
Less technical Splunk highlights include: Read more
Categories: Analytic technologies, Fox and MySpace, Investment research and trading, Log analysis, Splunk, Telecommunications, Text, Web analytics | 1 Comment |
Issues in scientific data management
In the opinion of the leaders of the XLDB and SciDB efforts, key requirements for scientific data management include:
- A data model based on multidimensional arrays, not sets of tuples
- A storage model based on versions and not update in place
- Built-in support for provenance (lineage), workflows, and uncertainty
- Scalability to hundreds of petabytes and thousands of nodes, with a high degree of fault tolerance
- Support for “external” data objects so that data sets can be queried and manipulated without ever having to be loaded into the database
- Open source, in order to foster a community of contributors and to ensure that data is never “locked up” — a critical requirement for scientists
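The first two requirements are easy to sketch in miniature: cells are addressed by array coordinates rather than found by scanning tuples, and every write creates a new queryable version instead of overwriting the old one. A toy model using numpy (my illustration; a real engine would store per-chunk deltas, not full copies):

```python
# Toy model of two SciDB-ish ideas: an array data model, and versioned
# (no-update-in-place) storage. Illustrative only -- not SciDB's engine.
import numpy as np

class VersionedArray:
    def __init__(self, initial):
        self.versions = [np.array(initial)]   # every write appends a version

    def write(self, index, value):
        new = self.versions[-1].copy()        # never update in place
        new[index] = value                    # (a real engine stores deltas)
        self.versions.append(new)

    def read(self, version=-1):
        return self.versions[version]         # old versions stay queryable

# A 2-D cell is addressed by coordinates, not found by scanning tuples.
sky = VersionedArray(np.zeros((3, 3)))
sky.write((1, 2), 42.0)                       # a recalibrated pixel, say

print(sky.read()[1, 2])           # 42.0 -- latest version
print(sky.read(version=0)[1, 2])  # 0.0  -- provenance: the original survives
```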
However: Read more