Quick introduction to Schooner Information Technology appliances
Back in August I talked with John Busch of Schooner Information Technology, which has a non-obvious URL. Schooner Information Technology sells Flash-based appliances that are mainly intended to run MySQL with blazing write performance.
This is one of those cases in which I warned that due to my September wave of family health issues I would cut a few blogging corners, so:
- I’m only going to write about the MySQL aspect, even though Schooner has a memcached product and claims to be able to run other NoSQL stuff as well.
- I’m not going to dig for company information beyond recalling:
  - Schooner said that it has invested $20 million in R&D.
  - Schooner’s appliances are resold by IBM.
  - Schooner also has a direct sales force.
  - One flagship customer had 30 TB of data on 17 Schooner nodes.
If Schooner wants to add some of what I’ve left out into the comments to this post, that would be great.
Schooner appliances are meant to be clustered, and “linear” scalability is claimed. Updates go to RAM cache, and are immediately sent to other RAM caches in the cluster as well. Relying on the safety provided by synchronous replication, Schooner appliances gain performance by writing to Flash only in a lazy, block-at-a-time manner. Beyond that, everything I picked up about the Schooner architecture was specific to individual nodes, including:
- John cited two of the tough points of Schooner’s design as being:
  - Enough parallelism to drive the Flash IOPS (Input/Output Operations Per Second)
  - Concurrency/non-blocking threads/affinity management
- As you’ve probably already picked up above, Schooner batches all interrupts.
- In particular, Schooner batches writes into 4 KB to 64 KB blocks, even if the application thinks they’re only 64 bytes or whatever. This mapping is done in DRAM.
- Schooner neither clusters nor sorts anything by key value. (That leaves a huge open question as to how anything ever gets retrieved with any efficiency.)
- Schooner’s block-writing strategies include:
  - Obviously, batch-at-a-time writes are ideal for wear-leveling.
  - Since it takes half a second or so to erase a Flash block, this is done before the block is needed. Of course, when Schooner erases a block it rewrites its contents elsewhere.
  - Eventually, Schooner will erase and rewrite all blocks, even relatively clean ones. This is its compaction process.
- Transaction logs are written to disk.
- All of the above (especially the block placement and concurrency) required a lot of InnoDB hacking.
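To make the write-batching point concrete, here is a minimal Python sketch of coalescing small application writes into fixed-size blocks, with a DRAM-resident map recording where each record landed. It is an illustration under stated assumptions, not Schooner's code: the class name `WriteBatcher`, the Python list standing in for Flash, the dict used as the map, and the fixed 4 KB block size (the low end of the range above) are all hypothetical.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 4 * 1024  # assumption: low end of the 4 KB to 64 KB range


@dataclass
class WriteBatcher:
    """Hypothetical sketch: coalesce small writes into whole blocks."""
    block_size: int = BLOCK_SIZE
    buffer: bytearray = field(default_factory=bytearray)  # pending block, in DRAM
    index: dict = field(default_factory=dict)  # key -> (block_no, offset, length), in DRAM
    flash: list = field(default_factory=list)  # stand-in for Flash: list of full blocks

    def write(self, key: str, value: bytes) -> None:
        if len(value) > self.block_size:
            raise ValueError("value larger than one block")
        # If the record does not fit, write out the pending block first.
        if len(self.buffer) + len(value) > self.block_size:
            self.flush()
        offset = len(self.buffer)
        self.buffer += value
        # The pending block will become block number len(self.flash) once flushed.
        self.index[key] = (len(self.flash), offset, len(value))

    def flush(self) -> None:
        """Write the pending buffer to 'Flash' as one padded block."""
        if not self.buffer:
            return
        self.flash.append(bytes(self.buffer).ljust(self.block_size, b"\0"))
        self.buffer = bytearray()

    def read(self, key: str) -> bytes:
        block_no, offset, length = self.index[key]
        if block_no == len(self.flash):  # record is still in the DRAM buffer
            return bytes(self.buffer[offset:offset + length])
        return self.flash[block_no][offset:offset + length]


if __name__ == "__main__":
    batcher = WriteBatcher()
    for i in range(100):
        batcher.write(f"k{i}", bytes([i]) * 64)  # 64-byte application writes
    print(len(batcher.flash), "block(s) written,", len(batcher.buffer), "bytes pending")
```

With 64-byte records and a 4 KB block, 64 records fill one block, so 100 such writes cost a single block write plus a partly filled buffer. The dict lookup in `read` is only this sketch's way of finding records; it says nothing about how Schooner actually retrieves data.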
Comments
4 Responses to “Quick introduction to Schooner Information Technology appliances”
Curt, Schooner uses standard off-the-shelf flash drives, and if I remember correctly, Intel-brand ones in particular. Those come with embedded flash-management firmware that already does compaction and all the rest of the flash-management work you have attributed to Schooner software. For example, how could a “block-placement strategy” be done on top of an SSD?
Camuel,
That’s a great question for somebody at Schooner such as John Busch. 🙂
[…] Schooner, Kaminario makes no exceptions for transaction logs and the like. Kaminario K2 is just a block […]
As Monash points out, the key to effectively utilizing flash memory is to have sufficient parallelism and granular concurrency control in the application software and operating environment to exploit the abundant IOPS that flash affords. This is how we overcome the I/O bottleneck, which enables us to fully utilize multi-core processors. The net result is a balanced system, which yields order-of-magnitude improvements in throughput, power and space for the data center data access tier (MySQL in particular).
The new generation of flash technology, for example SSDs and PCI-E cards based on SandForce flash controllers, provides excellent block management, write coalescing, compression, and IOPS at a very reasonable $/GB. This mitigates the need for host-level flash space optimization, so optimizations of this kind have second-order impact.
Please take a look at my blog at http://www.schoonerinfotech.com/blog/ where we present Schooner Labs research results. In that blog we show component and system level measurements which highlight the key effects of software algorithms and first vs second generation flash technologies for database systems.
Looking forward to your feedback.
John