Notes on the EMC Greenplum Data Computing Appliance
The big confidential part of my visit last week to EMC’s Data Computing Division, née Greenplum, was of course this week’s announcement of the first EMC/Greenplum “Data Computing Appliance.” Basics include:
- The EMC Greenplum Data Computing Appliance is a Type 2 appliance. Indeed, the specifics of the Data Computing Appliance’s hardware configuration are so unexceptional that they consumed about 1-2 minutes of my multi-hour visit. I didn’t master them in that short time, and I certainly don’t know now what they are.
- The EMC Greenplum Data Computing Appliance just runs Greenplum 4.0. While EMC’s Data Computing Division is signaling that in the future it may be concerned with more than just data warehousing/analytics, the first release of the EMC Greenplum Data Computing Appliance is just as focused on data warehousing and analytics as Greenplum has always been.
- The EMC Greenplum Data Computing Appliance is being rolled out featuring two key integrations with other EMC or EMC-like technology.
  - One is with generic storage area networks (SANs), obviously including EMC’s. This relies on the block-level replication that Greenplum introduced in Version 4.0.
  - The other is with the Data Domain backup/deduplication/near-line technology EMC acquired a couple of years back. This relies on the parallel load/export/scatter-gather technology Greenplum has had for a while.
The core ideas of Greenplum’s new approach to SAN integration are:
- It’s always necessary to mirror data in a database at least once, for redundancy. At least, that’s what Greenplum does.
- In the Greenplum Data Computing Appliance, the primary copy of the data lives on the appliance itself, but the mirror copies can optionally be sent to a SAN.
- One benefit is that if you have an over-provisioned SAN and believe using a bit more of its capacity is “free,” then this could look to you like a low-cost alternative.
- Another benefit is that if you have a general SAN-based backup/recovery/disaster recovery/etc. strategy, the EMC Greenplum Data Computing Appliance fits right into that.
- By way of contrast, EMC/Greenplum argues that Oracle Exadata does not fit well into a SAN-based backup/recovery/disaster recovery/etc. strategy.
- EMC/Greenplum kept reminding me that this doesn’t just work with EMC SANs but also with, for example, Hitachi SANs.
- However, the EMC Greenplum Data Computing Appliance’s integration with Data Domain is indeed specific to the EMC Data Domain product line, on the theory that EMC Data Domain is overwhelmingly the best product out there.
Obviously, it is not coincidental that this is the kind of story EMC’s sales force can reasonably be expected to do a good job of telling.
The fascinating part about all this is the claim that EMC is in line with CIOs’ strategic technical direction but Oracle isn’t. On the one hand, that’s preposterous, given how central Oracle is to many enterprises’ technology stacks. On the other hand, EMC is right that Oracle is pushing for a much greater degree of overall integration and account control than many enterprises are or should be comfortable with.