OLTP

Analysis of database management systems designed with a focus on OTLP (OnLine Transaction Processing) uses.

June 18, 2012

Introduction to MemSQL

I talked with MemSQL shortly before today’s launch. MemSQL technology basics are:

In-memory relational DBMS.
Being released single-box only. Transparent sharding is under development for release in the fall. Basic replication is under development too.
Subset of SQL-92.
MySQL wire-compatible (SQL coverage issues excepted).

MemSQL’s performance claims include:

Read performance 10% or so worse than memcached.
Write performance 20% or so better than memcached.
1.2 million inserts/second on a 64-core, 1/2 TB of RAM machine.
Similarly, 1/2 billion records loaded in under 20 minutes.

MemSQL company basics include: Read more

Categories: Database compression, In-memory DBMS, Investment research and trading, Market share and customer counts, memcached, MemSQL, OLTP, Pricing, Web analytics

3 Comments

February 26, 2012

SAP HANA today

SAP HANA has gotten much attention, mainly for its potential. I finally got briefed on HANA a few weeks ago. While we didn’t have time for all that much detail, it still might be interesting to talk about where SAP HANA stands today.

The HANA section of SAP’s website is a confusing and sometimes inaccurate mess. But an IBM whitepaper on SAP HANA gives some helpful background.

SAP HANA is positioned as an “appliance”. So far as I can tell, that really means it’s a software product for which there are a variety of emphatically-recommended hardware configurations — Intel-only, from what right now are eight usual-suspect hardware partners. Anyhow, the core of SAP HANA is an in-memory DBMS. Particulars include:

Mainly, HANA is an in-memory columnar DBMS, based on SAP’s confusingly-renamed BI Accelerator/BW Accelerator. Analytics and most OLTP (OnLine Transaction Processing) go against the columnar part of HANA.
The HANA DBMS also has an in-memory row storage option, used to store metadata, small tables, and so on.
SAP HANA talks both SQL and MDX.
The HANA DBMS is shared-nothing across blades or rack servers. I imagine that within an individual blade it’s shared everything. The usual-suspect data distribution or partitioning strategies are available — hash, range, round-robin.
SAP HANA has what sounds like a natural disk-based persistence strategy — logs, snapshots, and so on. SAP says that this is synchronous enough to give ACID compliance. For some hardware partners, those “disks” are actually Fusion I/O cards.
HANA is fault-tolerant “across servers”.
Text support is “coming soon”, which makes sense, given that BI Accelerator was based on the TREX search engine in the first place. Inxight is also in the HANA text mix.
You can put data into SAP HANA in a variety of obvious ways:
- Writing it directly.
- Trigger-based replication (perhaps from the DBMS that runs your SAP apps).
- Log-based replication (based on Sybase Replication Server).
- SAP Business Objects’ ETL tool.

SAP says that the row-store part is based both on P*Time, an acquisition from Korea some time ago, and also on SAP’s own MaxDB. The IBM white paper mentions only the MaxDB aspect. (Edit: Actually, see the comment thread below.) Based on a variety of clues, I conjecture that this was an aspect of SAP HANA development that did not go entirely smoothly.

Other SAP HANA components include: Read more

Categories: Analytic technologies, Business Objects, Data warehouse appliances, Data warehousing, In-memory DBMS, Investment research and trading, OLTP, Predictive modeling and advanced analytics, SAP AG, Software as a Service (SaaS), Text

12 Comments

February 15, 2012

Quick notes on MySQL Cluster

According to the MySQL Cluster home page, today’s MySQL Cluster release has — give or take terminology details 🙂 — added transparent sharding (Edit: Actually, please see the first comment below) and a memcached interface. My quick comments on all this to a reporter a couple of days ago were:

Persistent memcached is a useful thing. Couchbase’s sales illustrate that point: http://www.dbms2.com/2012/02/01/couchbase-update/
MySQL has always given good performance when used just as a key-value store, e.g. http://www.dbms2.com/2010/08/22/workday-technology-stack/ . So it’s reasonable to hope the memcached interface will have good performance out of the box.
MySQL’s clustering capabilities have long been weak, providing a window of opportunity for companies and products such as Schooner Information and dbShards. The gold standard for clustering is:
- Efficient transparent sharding: http://www.dbms2.com/2011/02/24/transparent-sharding/
- Synchronous replication at much better than two-phase-commit speeds. http://www.dbms2.com/2011/10/23/schooner-pivots-further/

I don’t really know enough about MySQL Cluster right now to comment in more detail.

Categories: Clustering, memcached, MySQL, NoSQL, OLTP, Open source

2 Comments

October 23, 2011

Transparent relational OLTP scale-out

There’s a perception that, if you want (relatively) worry-free database scale-out, you need a non-relational/NoSQL strategy. That perception is false. In the analytic case it’s completely ridiculous, as has been demonstrated by Teradata, Vertica, Netezza, and various other MPP (Massively Parallel Processing) analytic DBMS vendors. And now it’s false for short-request/OLTP (OnLine Transaction Processing) use cases as well.

My favorite relational OLTP scale-out choice these days is the SchoonerSQL/dbShards partnership. Schooner Information Technology (SchoonerSQL) and Code Futures (dbShards) are young, small companies, but I’m not too concerned about that, because the APIs they want you to write to are just MySQL’s. The main scenarios in which I can see them failing are ones in which they are competitively leapfrogged, either by other small competitors – e.g. ScaleBase, Akiban, TokuDB, or ScaleDB — or by Oracle/MySQL itself. While that could suck for my clients Schooner and Code Futures, it would still provide users relying on MySQL scale-out with one or more good product alternatives.

Relying on non-MySQL NewSQL startups, by way of contrast, would leave me somewhat more concerned. (However, if their code is open sourced. you have at least some vendor-failure protection.) And big-vendor scale-out offerings, such as Oracle RAC or DB2 pureScale, may be more complex to deploy and administer than the MySQL and NewSQL alternatives.

Categories: Clustering, dbShards and CodeFutures, IBM and DB2, MySQL, NewSQL, NoSQL, OLTP, Open source, Oracle, Parallelization, Schooner Information Technology, Transparent sharding

2 Comments

October 23, 2011

Schooner pivots further

Schooner Information Technology started out as a complete-system MySQL appliance vendor. Then Schooner went software-only, but continued to brag about great performance in configurations with solid-state drives. Now Schooner has pivoted further, and is emphasizing high availability, clustered performance, and other hardware-agnostic OLTP (OnLine Transaction Processing) features. Fortunately, Schooner has some interesting stuff in those areas to talk about.

The short form of the SchoonerSQL (as Schooner’s product is now called) story goes roughly like this:

SchoonerSQL replicates data — synchronously if the replication target is local, asynchronously if it is remote.
Local synchronous replication provides high availability; remote asynchronous replication provides disaster recovery.
SchoonerSQL’s local synchronous replication also provides read scale-out.
Schooner has a partnership with Code Futures/dbShards to provide write scale-out via transparent sharding.
SchoonerSQL has some secret sauce in replication performance. This has the effect of significantly increasing write performance (assuming you were going to replicate anyway), because otherwise you might have to slow down the master server’s write performance so that the slaves can keep up with it.
Schooner believes it still has some single-server performance advantages as well.

Categories: Clustering, dbShards and CodeFutures, MySQL, OLTP, Oracle, Parallelization, Schooner Information Technology

3 Comments

September 30, 2011

Oracle NoSQL is unlikely to be a big deal

Alex Williams noticed that there will be a NoSQL session at Oracle OpenWorld next week, and is wondering whether this will be a big deal. I think it won’t be.

There really are three major points to NoSQL.

Dynamic schemas. This is the only one of the three that truly depends on NoSQL.
Scale-out short-request processing. If you want to scale out efficiently at high request volumes, you’re best off not using all the flexibility SQL/relational DBMS offer. (In particular, you don’t want to do cross-node joins). Not coincidentally, a number of the best scale-out offerings were built to be NoSQL.
Open source. Doing a relational DBMS is a big project. It seems easier to build NoSQL ones.

Oracle can address the latter two points as aggressively as it wishes via MySQL. It so happens I would generally recommend MySQL enhanced by dbShards, Schooner, and/or dbShards/Schooner, rather than Oracle-only MySQL … but that’s a detail. In some form or other, Oracle’s MySQL is a huge player in the scale-out, open source, short-request database management market.

So that leaves us with dynamic schemas. Oracle has at least four different sets of technology in that area:

As Workday noticed years ago, MySQL can be used as a functional, basic key-value store.
Oracle also has XML-based Berkeley DB/SleepyCat kicking around.*
The XML extensions to Oracle’s core DBMS could be alleged to have a dynamic schema/NoSQL flavor. (Blech.)
A dynamic schema argument could also be made for object-oriented DBMS technology. While Oracle doesn’t to my knowledge exactly sell that, it does have the Tangosol Coherence line of technology, with a potentially similar programming model.

If Oracle is now refreshing and rebranding one or more of these as “NoSQL”, there’s no reason to view that as a big deal at all.

*That’s Mike Olson’s former company, if you’re keeping score at home.

Categories: MySQL, NoSQL, Object, OLTP, Open source, Oracle, Parallelization, Schooner Information Technology, Structured documents

13 Comments

September 19, 2011

Are there any remaining reasons to put new OLTP applications on disk?

Once again, I’m working with an OLTP SaaS vendor client on the architecture for their next-generation system. Parameters include:

100s of gigabytes of data at first, growing to >1 terabyte over time.
High peak loads.
Public cloud portability (but they have private data centers they can use today).
Simple database design — not a lot of tables, not a lot of columns, not a lot of joins, and everything can be distributed on the same customer_ID key.
Stream the data to a data warehouse, that will grow to a few terabytes. (Keeping only one year of OLTP data online actually makes sense in this application, but of course everything should go into the DW.)

So I’m leaning to saying: Read more

Categories: Analytic technologies, Cloud computing, Clustering, Data warehousing, dbShards and CodeFutures, Facebook, Infobright, MySQL, OLTP, Open source, Parallelization, Software as a Service (SaaS), Solid-state memory

13 Comments

September 15, 2011

The database architecture of salesforce.com, force.com, and database.com

salesforce.com, force.com, and database.com use exactly the same database infrastructure and architecture. That’s the good news. The bad news is that salesforce.com is somewhat obscure about technical details, for reasons such as:

A long-ago marketing decision to not give infrastructure details, so as to convey a “Don’t worry; we’ll take care of everything” message.
Even so, a long-ago and perhaps now-regretted marketing decision to disclose and even exaggerate salesforce.com’s reliance on Oracle, as part of an early-days attempt to prove salesforce was using enterprise-class technology.
A desire to hide the recipe for salesforce.com’s secret sauce.
Force of habit — I’m not sure salesforce even knows how to tell its technical story with any clarity.

Actually, salesforce.com has moved some kinds of data out of Oracle that previously used to be stored there. Besides Oracle, salesforce uses at least a file system and a RAM-based data store about which I have no details. Even so, much of salesforce.com’s data is stored in Oracle — a single instance of Oracle, which it believes may be the largest instance of Oracle in the world.

Categories: Data models and architecture, Market share and customer counts, Memory-centric data management, Object, OLTP, Oracle, salesforce.com, Software as a Service (SaaS)

19 Comments

September 14, 2011

Kaminario goes (mainly) flash

Kaminario, which used to be in the business of solid state storage via DRAM, now is emphasizing hybrid DRAM/flash storage appliances instead. The reason is evidently price. Per terabyte of primary storage (before mirroring onto disk and so on):

A Kaminario K2 DRAM-only appliance costs $100K.
A Kaminario K2 flash-only appliance costs $30K (but nobody buys that configuration).
A typical Kaminario K2 hybrid DRAM/flash appliance might cost $35K (which tells us that there’s a lot more flash than DRAM).

Kaminario positions DRAM as where you focus your most write-intensive/ bottlenecking loads, such as logging or temp space, with the primary benefit being performance and a secondary benefit being slowing the wear on your flash.

Categories: Kaminario, OLTP, Pricing, Solid-state memory

6 Comments

July 5, 2011

Eight kinds of analytic database (Part 1)

Analytic data management technology has blossomed, leading to many questions along the lines of “So which products should I use for which category of problem?” The old EDW/data mart dichotomy is hopelessly outdated for that purpose, and adding a third category for “big data” is little help.

Let’s try eight categories instead. While no categorization is ever perfect, these each have at least some degree of technical homogeneity. Figuring out which types of analytic database you have or need — and in most cases you’ll need several — is a great early step in your analytic technology planning. Read more

Categories: Analytic technologies, Aster Data, Benchmarks and POCs, Business intelligence, Buying processes, Columnar database management, Data warehouse appliances, Data warehousing, Database compression, Database diversity, Exadata, Greenplum, IBM and DB2, Infobright, Investment research and trading, Log analysis, Microsoft and SQL*Server, MOLAP, Netezza, OLTP, Oracle, ParAccel, Parallelization, Petabyte-scale data management, Predictive modeling and advanced analytics, Pricing, QlikTech and QlikView, SAND Technology, Scientific research, Sybase, Teradata, Vertica Systems, Web analytics, Workload management

7 Comments

← Previous Page — Next Page →

Search our blogs and white papers

Monash Research blogs

DBMS 2 covers database management, analytics, and related technologies.
Text Technologies covers text mining, search, and social software.
Strategic Messaging analyzes marketing and messaging strategy.
The Monash Report examines technology and public policy issues.
Software Memories recounts the history of the software industry.

User consulting

Building a short list? Refining your strategic plan? We can help.

Vendor advisory

We tell vendors what's happening -- and, more important, what they should do about it.

Monash Research highlights

Learn about white papers, webcasts, and blog highlights, by RSS or email.

Links
- Monash Research
- White Papers
Admin
- Log in