Amazon SimpleDB – when less is, supposedly, enough
I’ve posted several times about Amazon as an innovative, super-high-end user: doing transactional object caching with ObjectStore, building an in-house less-than-DBMS called Dynamo, or just generally adopting a very DBMS2-like approach to data management. Now Amazon is bringing the Dynamo idea to the public, via a SaaS offering called SimpleDB. (Hat tip to Tim Anderson.)
SimpleDB is obviously meant to be a data server for online applications. There are no joins, and queries don’t run over 5 seconds, so serious analytics are out of the question. Domains are limited to 10GB for now, so extreme media file serving also isn’t what’s intended; indeed, Amazon encourages one to use SimpleDB to store pointers to larger objects stored as files in Amazon S3.
On the other hand, if you think of SimpleDB as an OLTP DBMS, your head might explode. There’s no sense of transaction, no mechanisms to help with integrity, no way to do arithmetic, and indeed no assurance that writes will be immediately reflected in reads. Here’s the skinny:
To use Amazon SimpleDB you:
* CREATE a new domain to house your unique set of structured data.
* GET, PUT or DELETE items in your domain, along with the attribute-value pairs that you associate with each item. Amazon SimpleDB automatically indexes data as it is added to your domain so that it can be quickly retrieved; there is no need to pre-define a schema or change a schema if new data is added later. Each item can have up to 256 attribute values. Each attribute value can range from 1 to 1,024 bytes.
* QUERY your data set using this simple set of operators: =, !=, <, >, <=, >=, STARTS-WITH, AND, OR, NOT, INTERSECTION and UNION. Query execution time is currently limited to 5 seconds. Amazon SimpleDB is designed for real-time applications and is optimized for those use cases.
* Pay only for the resources that you consume.
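To make that data model concrete, here is a minimal in-memory sketch in Python. It is emphatically not Amazon's API: `ToyDomain`, `eq`, `starts_with`, `both` and the `s3://example-bucket/...` pointer are all hypothetical names made up for illustration. The only things taken from the description above are the shape (items carrying attribute-value pairs, no schema) and the two quoted limits. The sketch also assumes every value is a string.

```python
# Toy model of the data shape described above. NOT Amazon's API;
# every name here is hypothetical.
MAX_VALUES_PER_ITEM = 256   # "up to 256 attribute values" per item
MAX_VALUE_BYTES = 1024      # each value is 1 to 1,024 bytes


class ToyDomain:
    """Item name -> {attribute -> set of string values}; no schema."""

    def __init__(self):
        self.items = {}

    def put(self, item_name, pairs):
        """Add (attribute, value) pairs to an item, enforcing the limits."""
        existing = self.items.get(item_name, {})
        # Work on a copy so a rejected put leaves the item unchanged.
        merged = {attr: set(vals) for attr, vals in existing.items()}
        for attr, value in pairs:
            size = len(value.encode("utf-8"))
            if not 1 <= size <= MAX_VALUE_BYTES:
                raise ValueError(f"value for {attr!r} is {size} bytes")
            merged.setdefault(attr, set()).add(value)
        if sum(len(vals) for vals in merged.values()) > MAX_VALUES_PER_ITEM:
            raise ValueError(f"too many attribute values on {item_name!r}")
        self.items[item_name] = merged

    def get(self, item_name):
        """Return the item's attributes, or an empty dict if it is absent."""
        attrs = self.items.get(item_name, {})
        return {attr: sorted(vals) for attr, vals in attrs.items()}

    def delete(self, item_name):
        self.items.pop(item_name, None)

    def query(self, predicate):
        """Return the sorted names of items whose attributes match."""
        return sorted(n for n, attrs in self.items.items() if predicate(attrs))


def eq(attr, value):
    return lambda attrs: value in attrs.get(attr, ())


def starts_with(attr, prefix):
    return lambda attrs: any(v.startswith(prefix) for v in attrs.get(attr, ()))


def both(p, q):
    """AND of two predicates."""
    return lambda attrs: p(attrs) and q(attrs)


catalog = ToyDomain()
catalog.put("item-001", [
    ("category", "book"),
    ("title", "Dune"),
    # Pointer to a larger object kept elsewhere (hypothetical location).
    ("image", "s3://example-bucket/covers/dune.jpg"),
])
catalog.put("item-002", [("category", "cd"), ("title", "Duets")])

print(catalog.query(both(eq("category", "book"), starts_with("title", "Du"))))
```

Note what the sketch leaves out because the service description does too: no joins across domains, no types other than strings, and nothing that ties two puts together.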
I am, to put it mildly, not a purist about insisting that every traditional feature of a transactional DBMS be available to support every OLTP application. But I don’t see how SimpleDB would be useful for something that involves, say, buying and selling, unless it’s a staging area for data on its way into or out of a real DBMS. The examples provided in the docs, however, are all about merchandise. Weird.
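Here is a toy illustration of why buying and selling worries me. It is a sketch under loud assumptions: `LaggingStore` and `try_buy` are hypothetical, and the "writes become visible only after `sync()`" behavior is a deliberately crude stand-in for "no assurance that writes will be immediately reflected in reads." It says nothing about how SimpleDB actually propagates writes.

```python
class LaggingStore:
    """Hypothetical store: writes are queued and only become
    visible to reads after sync() is called."""

    def __init__(self):
        self.visible = {}
        self.pending = []

    def put(self, key, value):
        self.pending.append((key, value))

    def get(self, key):
        return self.visible.get(key)

    def sync(self):
        for key, value in self.pending:
            self.visible[key] = value
        self.pending.clear()


def try_buy(store, key):
    """Read-check-write with no transaction around it."""
    stock = int(store.get(key))   # arithmetic has to happen in the client
    if stock > 0:
        store.put(key, str(stock - 1))
        return True
    return False


store = LaggingStore()
store.put("stock:widget", "1")
store.sync()                      # one widget in stock, visible to everyone

first = try_buy(store, "stock:widget")
second = try_buy(store, "stock:widget")   # write not yet visible: still reads "1"
store.sync()

print(first, second, store.get("stock:widget"))
```

Both purchases succeed against a single unit of stock, and the stored count ends at "0" rather than flagging the oversell. With a transactional DBMS, the second buyer would either wait or fail.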
Edit: Techcrunch is frothing with glee over SimpleDB. And in the comment thread, the author makes it clear that he is NOT being facetious.
Further edit: High Scalability has a long list of links discussing SimpleDB. One, er, simple and informative one comes from Sriram Krishnan.
Comments
6 Responses to “Amazon SimpleDB – when less is, supposedly, enough”
Curt,
This is aimed at web-centric developers who don’t use the features of the DBMS in any case, e.g. the Web 2.0 crowd who are using toolsets like Ruby on Rails and PHP.
I’m not saying it’s a good idea, but a lot of the current technologies encourage the developer to treat the database as a glorified card store. Which is exactly what SimpleDB is.
I often see tips on ways to remove or avoid “inconvenient” or “slow” features of the DB like referential integrity and transactions!
Joe
Joe,
Fair enough. But can you think of any nontrivial application categories where this approach actually makes sense?
I’m all in favor of schema flexibility, and of doing integrity checks programmatically rather than declaratively when the tradeoffs justify it. But in SimpleDB I don’t see an easy way to code the integrity checks, I don’t see an easy way to assure transaction integrity, and I don’t see an easy way to query the data.
What am I missing?
CAM
I don’t think SimpleDB is as much of a “database” as it is a distributed cache. The reason it’s so simple is that it’s hard to run a DBMS on a large distributed cluster. The disadvantage is you don’t get the richness of a traditional database, but you do gain performance and scalability. Try having 1000 clients (web servers) connected to a traditional database.
They are certainly missing some things, and transactions will be vital… But the ability to store sparse data sets in a scalable, backed-up store, even if that does mean doing some work on the client once you get the data back, can be worth it in many scenarios.
[…] a post earlier tonight about Amazon’s new SimpleDB, I suggested that SimpleDB’s main use might be as the database engine behind other S3/EC2 […]
Can anyone tell me what DBMS Amazon uses for checkout? By that I mean for processing the actual financial transaction, for example, of buying some books or CDs?
Thanks
[…] Looking to the future, Frank Buytendijk uses some creativity to predict The Future of IT: Search. Kaj Arno announces that the 2008 MySQL User Conference Registration Opened. And if you Want to use MySQL 5.1? A Bug List Made For You… by Jay Pipes. If you are still stuck in the past, you can read about Keith Murphy‘s experience Upgrading from 4.1 to 5.0. Or, if you’re already living in the future, you may want to read Rajender Singh‘s article on How to find a trace file Oracle 11g. Curt Monash does not think the future of DBMS is Amazon SimpleDB. […]