February 10, 2014
MemSQL 3.0
Memory-centric data management is confusing. And so I’m going to clarify a couple of things about MemSQL 3.0 even though I don’t yet have a lot of details.* They are:
- MemSQL has historically been an in-memory row store, which as of last year scales out.
- It turns out that the MemSQL row store actually has two table types. One is scaled out. The other — called “reference” — is replicated on every node.
- MemSQL has now added a third table type, which is columnar and which resides in flash memory.
- If you want to keep data in, for example, both the scale-out row store and the column store, you’d have to copy/replicate it within MemSQL. And if you wanted to access data from both versions at once (e.g. because different copies cover different time periods), you’d likely have to do a UNION or something like that.
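To make the last point concrete, here is a minimal sketch of the "UNION or something like that" pattern. It is not MemSQL code: SQLite (via Python's standard `sqlite3` module) stands in for the database, both tables are ordinary SQLite tables, and the names `events_row` and `events_col` and the sample data are hypothetical. It only illustrates how a query might combine two copies that cover different time periods.

```python
import sqlite3

# Assumption: SQLite stands in for MemSQL purely for illustration.
# The table names events_row / events_col and all rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events_row (event_day TEXT, amount INTEGER);  -- recent data
    CREATE TABLE events_col (event_day TEXT, amount INTEGER);  -- older data
""")
conn.executemany("INSERT INTO events_row VALUES (?, ?)",
                 [("2014-02-09", 10), ("2014-02-10", 20)])
conn.executemany("INSERT INTO events_col VALUES (?, ?)",
                 [("2013-12-31", 5), ("2014-01-15", 7)])

# The two tables hold non-overlapping time periods, so UNION ALL is enough;
# plain UNION would add a de-duplication step that is not needed here.
combined = conn.execute("""
    SELECT event_day, amount FROM events_col
    UNION ALL
    SELECT event_day, amount FROM events_row
    ORDER BY event_day
""").fetchall()

total = sum(amount for _, amount in combined)
print(combined)
print(total)
```

If the two copies could overlap in time, the query would need either plain `UNION` or an explicit date cutoff in each branch to avoid double counting.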
*MemSQL’s first columnar offering sounds pretty basic; for example, there’s no columnar compression yet. (Edit: Oops, that’s not accurate. See comment below.) But at least they actually have one, which puts them ahead of many other row-based RDBMS vendors that come to mind.
And to hammer home the contrast:
- IBM, Oracle and Microsoft, which all sell row-based DBMS meant to run on disk or other persistent storage, have added or will add columnar options that run in RAM.
- MemSQL, which sells a row-based DBMS that runs in RAM, has added a columnar option that runs in persistent solid-state storage.
Categories: Columnar database management, Database compression, In-memory DBMS, MemSQL, Solid-state memory
Comments
12 Responses to “MemSQL 3.0”
MemSQL’s column store most certainly has compression.
Check out some of the features of MemSQL 3.0 here: http://www.memsql.com/technology/
-Eric Frenkiel, CEO MemSQL
Eric,
Thanks!
Actually, I never said MemSQL didn’t have compression. I just made accurate reference to what was said in my last phone call with your guys, then completely forgot that you’d sent an email correcting the error. My bad! 🙁
Link less to your own articles and more to that actual item of interest (MemSQL). Shouldn’t need a link in the comments to actually get there.
A,
I’m terribly sorry that this blog — which I provide entirely for free and at my own expense — doesn’t meet your requirements. Perhaps some other one would suit you better.
Curt, it sounds like MemSQL is a row store in memory that then converts into columnar storage on disk? If that’s the case, what’s the query engine optimized for: columnar or row?
I don’t think any one vendor has been able to build a single query engine for both row and column just yet. If you just store data in columnar form and have a row-based query engine, then MemSQL is probably a wannabe columnar.
John,
Be a little careful when you say that a single query engine can’t handle both row and column operations. When it comes to query planning (including optimization), I think the problems are generally surmountable. Ditto ancillary features such as backup and so on.
Caching is a tougher one, but MemSQL is a special case in that regard as 2 of its 3 table types are fully in-memory anyway.
All that said, MemSQL’s columnar features are new and immature. Any stereotypical doubts you have around maturity I’d be likely to share.
Curt, I would humbly disagree with you on this one. Greenplum adding columnar storage to their row-based database didn’t really make them columnar compared to Vertica/ParAccel, for example. I understand the immaturity of MemSQL’s columnar offering, but row-store query execution on columnar storage doesn’t yield the best results. In any case, as usual you have done an impressive job of providing perspective on such technologies, and I appreciate that.
John,
What you’re describing as Greenplum weaknesses falls within the scope of what I was suggesting.
Even an elementary column store has reduced I/O — compression aside — plus the tendency to compress better than a row store (assuming the same compression algorithms, especially columnar compression algorithms). Saying “Yes, but Greenplum still gets smoked by Vertica and ParAccel” doesn’t contradict anything I’ve said.
Curt, John,
Regarding a “single query engine for both row & column store”: DB2 BLU has a single engine that stores both row-store and column-store tables and supports queries on them, even joins. See the paper at: http://researcher.watson.ibm.com/researcher/files/us-ipandis/vldb13db2blu.pdf
Even in Informix with Warehouse Accelerator, the presence of two engines is transparent to the application. Acceleration and query compensation happen automatically and transparently.
Keshav,
The question isn’t transparency; MemSQL has that, and so do Greenplum, Aster et al. It’s whether each part is as good as one would expect a dedicated product to be.
My take on BLU was http://www.dbms2.com/2013/05/27/ibm-blu/