MongoDB is growing up
I caught up with my clients at MongoDB to discuss the recent MongoDB 2.6, along with some new statements of direction. The biggest takeaway is that the MongoDB product, along with the associated MMS (MongoDB Management Service), is growing up. Aspects include:
- An actual automation and management user interface, as opposed to the current management style, which is almost entirely via scripts (except for the monitoring UI).
  - That’s scheduled for public beta in May, and general availability later this year.
  - It will include some kind of integrated provisioning with VMware, OpenStack, et al.
  - One goal is to let you apply database changes, software upgrades, etc. without taking the cluster down.
- A reasonable backup strategy.
  - A snapshot copy is made of the database.
  - A copy of the log is streamed somewhere.
  - Periodically (the default seems to be 6 hours), the log is applied to create a new current snapshot.
  - For point-in-time recovery, you take the last snapshot prior to the point, and roll forward to the desired point.
- A reasonable locking strategy!
  - Document-level locking is all-but-promised for MongoDB 2.8.
  - That means what it sounds like. (I mention this because sometimes an XML database winds up being one big document, which leads to confusing conversations about what’s going on.)
- Security. My eyes glaze over at the details, but several major buzzwords have been checked off.
- A general code rewrite to allow for (more) rapid addition of future features.
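The backup scheme in the list above (snapshot, streamed log, roll forward) can be illustrated with a toy model. This is only a sketch under my own assumptions: the function name `restore_to`, the in-memory dict standing in for a database, and the `(timestamp, key, value)` log format are all hypothetical and say nothing about how MMS actually stores snapshots or log entries.

```python
from bisect import bisect_right


def restore_to(snapshots, log, target_ts):
    """Toy point-in-time recovery.

    snapshots: list of (timestamp, state_dict), sorted by timestamp.
    log: list of (timestamp, key, value), sorted by timestamp;
         value None means "delete this key" (an assumption of this sketch).
    """
    # Find the last snapshot taken at or before the target time.
    times = [ts for ts, _ in snapshots]
    i = bisect_right(times, target_ts) - 1
    if i < 0:
        raise ValueError("no snapshot at or before the target time")
    snap_ts, state = snapshots[i]

    # Roll forward: replay log entries after the snapshot, up to the target.
    restored = dict(state)
    for ts, key, value in log:
        if snap_ts < ts <= target_ts:
            if value is None:
                restored.pop(key, None)
            else:
                restored[key] = value
    return restored


# Usage: snapshots at t=0 and t=6, recover the state as of t=8.
snapshots = [(0, {"a": 1}), (6, {"a": 1, "b": 2})]
log = [(2, "b", 2), (7, "a", 5), (9, "c", 3)]
state_at_8 = restore_to(snapshots, log, 8)
```

The periodic step of applying the log to create a new current snapshot just keeps the replay in the loop short; the older the most recent snapshot, the more log there is to roll forward.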
Of course, when a DBMS vendor rewrites its code, that’s a multi-year process. (I think of it at Oracle as spanning 6 years and 2 main-number releases.) With that caveat, the MongoDB rewrite story is something like:
- Updating has been reworked. Most of the benefits are coming later.
- Query optimization and execution have been reworked. Most of the benefits are coming later, except that …
  - … you can now directly filter on multiple indexes in one query; previously you could only simulate doing that by pre-building a compound index.
  - One of those future benefits is more index types, for example R-trees or inverted lists.
- Concurrency improvements are down the road.
- So are rewrites of the storage layer, including the introduction of compression.
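The idea of filtering on multiple indexes in one query, mentioned in the list above, is easiest to see in miniature. The following is a conceptual sketch only, not MongoDB's implementation: each single-field index maps a value to the set of matching document ids, and a query on two fields intersects the two sets. The helper names and the sample documents are hypothetical.

```python
from collections import defaultdict


def build_index(docs, field):
    """Build a single-field index: field value -> set of document ids."""
    index = defaultdict(set)
    for doc_id, doc in docs.items():
        if field in doc:
            index[doc[field]].add(doc_id)
    return index


def find_with_intersection(indexes, criteria):
    """Answer an equality query on several fields by intersecting
    the id sets from each field's own index."""
    candidate_ids = None
    for field, value in criteria.items():
        ids = indexes[field].get(value, set())
        candidate_ids = ids if candidate_ids is None else candidate_ids & ids
    return sorted(candidate_ids or [])


docs = {
    1: {"city": "NYC", "status": "A"},
    2: {"city": "NYC", "status": "B"},
    3: {"city": "SF", "status": "A"},
}
indexes = {field: build_index(docs, field) for field in ("city", "status")}
matches = find_with_intersection(indexes, {"city": "NYC", "status": "A"})
```

The compound-index alternative would instead keep one index keyed on the (city, status) pair, which answers this query directly but has to be planned and built in advance.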
Also, you can now straightforwardly transform data in a MongoDB database and write it into new datasets, something that evidently wasn’t easy to do before.
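I wasn't told which feature this is, but it appears to correspond to the `$out` aggregation stage added in MongoDB 2.6, which writes a pipeline's results into a collection. Assuming so, a pipeline might look like the sketch below. The field names (`status`, `customer_id`, `amount`) and the collection names are hypothetical, and the function only builds the pipeline document; running it needs a live server and a driver.

```python
def build_rollup_pipeline(source_status, target_collection):
    """Build an aggregation pipeline that filters, groups, and then
    writes the result into another collection via $out."""
    return [
        # Keep only the documents of interest (hypothetical field).
        {"$match": {"status": source_status}},
        # Transform: total the amounts per customer (hypothetical fields).
        {"$group": {"_id": "$customer_id", "total": {"$sum": "$amount"}}},
        # Write the transformed documents into the target collection.
        {"$out": target_collection},
    ]


pipeline = build_rollup_pipeline("shipped", "customer_totals")
# With PyMongo and a running server, this would be passed as:
#   db.orders.aggregate(pipeline)   # "orders" is a hypothetical collection
```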
One thing that MongoDB is not doing is offering any ODBC/JDBC or other SQL interfaces. Rather, there’s some other API (I don’t know the details) whereby business intelligence tools or other systems can extract views, and a few BI vendors evidently are doing just that. In particular, MicroStrategy and QlikView were named, as well as a couple of open source usual-suspects.
As of 2.6, MongoDB seems to have a basic integrated text search capability — which however does not rise to the search functionality level that was in Oracle 7.3.2. In particular:
- 15 Western languages are supported with stopwords, tokenization, etc.
- Search predicates can be mixed into MongoDB queries.
- The search language isn’t very rich; for example, it lacks WHERE NEAR semantics.
- You can’t tweak the lexicon yourself.
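As an illustration of search predicates being mixed into MongoDB queries, here is a minimal sketch using the `$text` query operator from 2.6. The helper name and the `status` field are hypothetical, the collection is assumed to already have a text index, and the function only builds the query document rather than running it.

```python
def text_query(search_terms, extra_filter=None):
    """Combine a $text search predicate with ordinary query predicates."""
    query = {"$text": {"$search": search_terms}}
    if extra_filter:
        # Ordinary field predicates sit alongside the text predicate.
        query.update(extra_filter)
    return query


query = text_query("coffee shop", {"status": "published"})
# With PyMongo and a running server, this would be passed as:
#   db.articles.find(query)   # "articles" is a hypothetical collection
```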
Some business and pricing notes:
- Two big aspects of the paid-versus-free version of MongoDB (the product line) are:
  - Security.
  - Management tools.
    - Well, actually, you can get the management tools for free, but only on a SaaS basis from MongoDB (the company).
    - If you want them on premises or in your part of the cloud, you need to pay.
- If you want MongoDB (the company) to maintain your backups for you, you need to pay.
- Customer counts include:
  - At least 1000 or so subscribers (counting by organization).
  - Over 500 (additional?) customers for remote backup.
  - 30 of the Fortune 100.
And finally, MongoDB did something many companies should, which is aggregate user success stories for which they may not be allowed to publish full details. Tidbits include:
- Over 100 organizations run clusters with more than 100 nodes. Some clusters exceed 1,000 nodes.
- Many clusters deliver hundreds of thousands of operations per second (combined read and write).
- MongoDB clusters routinely store hundreds of terabytes, and some store multiple petabytes of data. Over 150 clusters exceed 1 billion documents in size. Many manage more than 100 billion documents.
Comments
16 Responses to “MongoDB is growing up”
Anything on their relationship with TokuMX?
Pity they didn’t think about a caching mechanism.
Last I heard, MongoDB (the company) wasn’t friendly to TokuMX (the product). I think that was late last year, but I’m not aware of anything changing.
What is MongoDB? How does it relate to other database management systems? Please let me know.
MongoDB is by most measures the most successful NoSQL data manager, with a data model based on the JSON data interchange standard. It’s open source, but developed primarily by one company, which recently changed its name from 10gen to MongoDB.
Curt,
Not to belabor the point, but which security buzzwords have you checked off?
I have read about the new Mongo “field-level redaction”, and the implementation is interesting, to say the least:
http://docs.mongodb.org/manual/reference/operator/aggregation/redact/#pipe._S_redact
It’s a step in the right direction, for sure. But I would also say that it’s far from filling in the box on an Enterprise security checklist.
I don’t work for MongoDB, except as an analyst/consultant, so I haven’t personally checked off much of anything. And perhaps I nodded my head a bit too quickly at seeing mention of FIPS.
But they’ve certainly taken some substantive security steps.
Still can’t believe these guys keep shipping the default install listening on *all* network interfaces. A major security flaw imho. 🙁
What about the dreaded global write lock?
Is it still there or finally gone?
See the first point in my post under “A reasonable locking strategy” 🙂
@norbert – mongodb reads and writes data to memory-mapped files. many deployments use mongodb for persistence and cache.
@joe – here’s a new paper that covers features, integrations and best practices related to security: http://info.mongodb.com/rs/mongodb/images/MongoDB_Security_Architecture_WP.pdf
@tim – official .deb and .rpm packages have the bind_ip configuration set to 127.0.0.1 by default. more here: http://docs.mongodb.org/manual/administration/configuration/#configure-the-database