IBM is buying parallelization expert Platform Computing
IBM is acquiring Platform Computing, a company with which I had one briefing, last August. Quick background includes:
- Platform Computing started ~20 years ago.
- Platform Computing claimed close to $100 million in revenue and >500 people.
- (This is Platform Computing’s most famous splash to date.) Platform Computing technology underlies SAS Institute’s preferred method of parallelization, which may variously be called:
  - SAS Grid Manager (the more or less official brand name).
  - SAS HPA (High Performance Analytics), sort of an alternate brand name.
  - MPI (Message Passing Interface), the industry’s name for the underlying semantics/syntax/API.
- Platform Computing’s original business was scientific grid computing.
- Platform Computing’s second major business was its “Symphony” product line. According to Platform Computing, Symphony:
  - Debuted 6-7 years ago.
  - Is more commercially oriented.
  - Is what supports SAS HPA.
  - SAS aside, has been sold to Wall Street and so on.
  - Is sometimes used in conjunction with CEP/streaming, mainly for backtesting.
  - Can be used to build global (parallel) persistent memory for R.
- (This is probably why IBM is buying Platform Computing.) Platform Computing has a new MapReduce offering that:
  - Is based on Symphony.
  - Shipped last July, except that early access was a couple of months before that.
  - Is focused on:
    - Lowering the latency of MapReduce.
    - Consolidating multiple MapReduce use cases into one high(er)-utilization cluster.
    - Offering workload management in support of those goals.
    - Reliability, availability, predictability, puppies, kittens, and apple pie.
  - Is most specifically a MapReduce run-time engine, with other stuff beyond that.
Unfortunately, I’m not clear on how tightly this offering is tied to Hadoop, but using it with Hadoop is at least the base case. Platform Computing did say:
- It can support multiple virtual Hadoop clusters, which can be grown or shrunk at will.
- Non-Hadoop workloads can be mixed in.
Platform Computing said that key technical benefits of this offering included:
- 1-3 seconds to start a job, vs. 40-50 in generic Hadoop.
- Automatic recovery of JobTracker nodes.
- Failover for NameNodes.
- Workload management that:
  - Manages all of CPU, I/O, and RAM (this is quickly becoming an industry standard level of capability, although I’m judging more by the standards of the analytic DBMS world).
  - Monitors but doesn’t actively manage network resources.
  - Can reprioritize jobs that are in flight. (Also an industry-standard capability.)
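To put the job-startup figures in perspective, here is a rough back-of-the-envelope sketch in Python. The startup ranges are the ones Platform Computing quoted; the 60-second job length is purely a hypothetical value I picked for illustration, and the function name is made up.

```python
# Startup times as quoted by Platform Computing (seconds).
GENERIC_HADOOP_STARTUP_S = (40, 50)
PLATFORM_STARTUP_S = (1, 3)

# Hypothetical assumption: a short job that does 60 seconds of actual work.
WORK_S = 60


def startup_share(startup_s, work_s):
    """Fraction of total wall-clock time spent just starting the job."""
    return startup_s / (startup_s + work_s)


for label, (low, high) in [
    ("generic Hadoop", GENERIC_HADOOP_STARTUP_S),
    ("Platform MapReduce", PLATFORM_STARTUP_S),
]:
    low_share = startup_share(low, WORK_S)
    high_share = startup_share(high, WORK_S)
    print(f"{label}: {low_share:.1%} to {high_share:.1%} of elapsed time is startup")
```

Under that assumption, startup goes from a large fraction of a short job's elapsed time to a few percent. For long batch jobs the difference matters much less, which fits the stated focus on lowering MapReduce latency.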
This conflation of scientific, commercial analytic, streaming, and MapReduce is right in IBM’s philosophical wheelhouse. I base that comment on, among other factors:
- How IBM positions “Big Insights”.
- IBM’s “smart consolidation” picture/pitch (which I really should get around to posting).
- The fuss IBM makes about Watson, Blue Gene, and so on.
The IBM acquisition probably obviates a lot of Platform Computing’s previous business comments, but at the time they included:
- POCs (Proofs of Concept):
  - Mainly in financial services, government, and telecom.
  - At both existing customers and new prospects.
  - Typically running 30-50 nodes, 2-50 terabytes.* The smallest databases evidently tended to be at financial services firms.
- Pricing that was starting out:
  - Perpetual license: $3450/server, 21% annual maintenance after the first year.
  - Subscription: $2070/server annually, or $3070 with HDFS support bundled in.
*1 terabyte or less per node is probably the lowest data-per-node figure I’ve heard for anything Hadoop-like — even below Hadapt, and well below what Cloudera and Hortonworks usually see.
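For anyone comparing the two pricing options quoted above, here is a minimal Python sketch of the per-server arithmetic. It assumes that maintenance is 21% of the perpetual license price, charged in every year after the first, and that the fair comparison is against the $2070 subscription without HDFS support. Platform Computing did not spell out either point, so treat both as my assumptions; the function names are made up for illustration.

```python
PERPETUAL_LICENSE = 3450.0   # dollars per server, one-time
MAINTENANCE_RATE = 0.21      # assumed: fraction of license price, per year after year 1
SUBSCRIPTION = 2070.0        # dollars per server per year
SUBSCRIPTION_HDFS = 3070.0   # same, with HDFS support bundled in


def perpetual_cost(years):
    """Cumulative per-server cost of the perpetual license after `years` years."""
    maintenance_years = max(years - 1, 0)
    return PERPETUAL_LICENSE + PERPETUAL_LICENSE * MAINTENANCE_RATE * maintenance_years


def subscription_cost(years, with_hdfs=False):
    """Cumulative per-server cost of the subscription after `years` years."""
    annual = SUBSCRIPTION_HDFS if with_hdfs else SUBSCRIPTION
    return annual * years


def first_year_perpetual_is_cheaper(max_years=50):
    """First year in which the perpetual license has cost less in total, or None."""
    for year in range(1, max_years + 1):
        if perpetual_cost(year) < subscription_cost(year):
            return year
    return None


for year in range(1, 5):
    print(year, perpetual_cost(year), subscription_cost(year))
```

Under those assumptions the two options are nearly even after two years ($4174.50 perpetual vs. $4140 subscription), and the perpetual license is cheaper from the third year on.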
Comments
5 Responses to “IBM is buying parallelization expert Platform Computing”
Hi Curt; Symphony is not related to SAS’s HPA.
We do use (and enjoy) the LSF and related components to enable the SAS Grid Manager strategy where we use grid/divide-and-conquer techniques at the SAS 4GL level.
SAS HPA is where we use Massively Parallel coding and an MPI connection fabric to use divide-and-conquer techniques in the underlying implementation of the 4GL components.
Regards,
Paul
Thanks for the clarification, Paul!
Except … I’m unclear on what “LSF” stands for. 🙂
LSF is another of the products in the Platform Portfolio. http://www.platform.com/workload-management
LSF is your more traditional grid workload management, with queues, resources, and priority-based scheduling (and Platform has pretty nice data-aware scheduling too).
Symphony I’m not so familiar with. I think it’s for scheduling and workload management of finer-grained units of work.
Symphony is scheduler + middleware for SOA applications. It has two levels of scheduling functions: (1) within application: schedules messages from client to servers (2) among applications: scheduling services.