XtremeData announces its DBx data warehouse appliance
XtremeData is announcing its DBx data warehouse appliance today. Highlights include:
- XtremeData is announcing a single pricing metric — $20,000 per terabyte of user data.
- DBx currently has no compression — so when XtremeData adds compression to DBx, price/TB will naturally go down further.
- XtremeData’s DBx node hardware is based on a board that combines an Intel-compatible CPU with some FPGAs, called the XtremeData In-Socket Accelerator (ISA). In addition there is a head node and a data loading node. Failover/high availability for the head node seem to mainly be futures.
- DBx software is based on PostgreSQL. XtremeData says it kept PostgreSQL’s front end and replaced the execution engine. I haven’t checked exactly which PostgreSQL features are in or out of DBx.
- (This subject edited, after Dave DeWitt pointed out how unclear the first version was.) XtremeData’s DBx of course does complete parallel redistribution of data after every intermediate result set. The basic idea of DBx’s data redistribution is that after each intermediate result set, DBx recalculates histograms and redistributes data — and hence work — approximately evenly accordingly (a toy sketch of the general histogram idea appears right after this list). This recalculation is done in the FPGAs. XtremeData claims that this constant re-setting of the execution plan is more extensive, and results in more even data distribution, than rival vendors’ strategies. (Based outside Chicago, XtremeData was founded by high-performance computing (HPC) guys, and its design priorities reflect that. XtremeData’s database management guys, by the way, are in Bangalore.)
- XtremeData’s DBx uses Infiniband to support the resulting large amount of data movement.
- At least in theory, XtremeData’s DBx scales up to 1024 nodes, the limit at which its Infiniband switches can support full bandwidth.
- XtremeData’s smallest DBx product is a half-rack system with 8 nodes, rated at 30 TB of user data.
- Because of its data redistribution strategy, XtremeData says DBx doesn’t much care about the physical distribution of data. Hash distribution is the default, but it has less benefit than in other MPP analytic DBMS systems.
- XtremeData also claims that DBx is particularly good at being schema-agnostic, in that competing MPP analytic DBMS products shine most for schemas that don’t lead to a lot of data redistribution (e.g., one big fact table, N small dimension tables that can be replicated at each node). However, I’m skeptical about that point. E.g., Oracle Exadata also uses Infiniband, and features no more data redistribution per query than DBx does, so where exactly is the bottleneck Exadata faces but DBx doesn’t?
- XtremeData says it has tested DBx up to 10-15 concurrent queries. There seem to be no workload management features in the first DBx release, but naturally there’s a technical roadmap in that direction.
- XtremeData claims a variety of DBx beta tests, successful proofs-of-concept (POCs), and even customers intending to buy, but no actual sales to date.
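To make the redistribution bullet above concrete, here is a toy sketch of the general idea: build an equi-depth histogram on the redistribution key of an intermediate result set, then assign key ranges to nodes so each node gets a roughly equal share of rows. This is purely illustrative Python of my own; XtremeData’s actual implementation runs in the FPGAs and is surely quite different.

```python
from bisect import bisect_right

def equi_depth_boundaries(keys, num_nodes):
    """Pick num_nodes - 1 split points so each key range holds ~len(keys)/num_nodes rows."""
    ordered = sorted(keys)
    step = len(ordered) / num_nodes
    return [ordered[min(int(round(step * i)), len(ordered) - 1)]
            for i in range(1, num_nodes)]

def redistribute(rows, key_fn, num_nodes):
    """Assign each row to a node based on where its key falls among the boundaries."""
    boundaries = equi_depth_boundaries([key_fn(r) for r in rows], num_nodes)
    buckets = [[] for _ in range(num_nodes)]
    for row in rows:
        buckets[bisect_right(boundaries, key_fn(row))].append(row)
    return buckets

# Value-skewed (but distinct) keys still split into equal-sized buckets:
rows = [{"k": i * i} for i in range(100)]
print([len(b) for b in redistribute(rows, lambda r: r["k"], 4)])  # [25, 25, 25, 25]
```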
XtremeData has kindly permitted me to post its DBx launch slide deck. Three specific POC/prospect price/performance comments may be found on Slide 9.
XtremeData says that the clock speed on the FPGAs it uses is 200 megahertz, clearly much less than an Intel-compatible CPU’s. However, XtremeData also says 100s or 1000s of steps can be done at once on an FPGA. The reason for this seems to be “pipelining” much more than on-chip parallel streams. XtremeData’s explanation seemed to focus on the point that many rows of data could be processed independently of each other, and hence at once. I’m not wholly convinced that this is a standard use of the word “pipelining”. The point may be moot anyway, in that XtremeData’s reported performance advantages are nowhere near what one would get by naively assuming DBx can do ~1000 times as many steps per clock cycle at 1/12th – 1/16th of a normal clock speed.
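For what it’s worth, here is that naive ceiling spelled out. The ~1,000 steps-per-cycle figure is XtremeData’s; the 2.4-3.2 GHz CPU clocks are my assumption (i.e. 12x-16x the FPGA’s 200 MHz), and crediting the CPU with one step per cycle is deliberately naive.

```python
fpga_hz = 200e6                # XtremeData's stated FPGA clock
fpga_steps_per_cycle = 1000    # XtremeData's "1000s of steps at once", taken at face value
for cpu_hz in (2.4e9, 3.2e9):  # assumed CPU clocks, 12x-16x the FPGA clock
    naive_advantage = (fpga_hz * fpga_steps_per_cycle) / cpu_hz  # CPU credited 1 step/cycle
    print(f"CPU at {cpu_hz / 1e9:.1f} GHz -> naive FPGA advantage ~{naive_advantage:.0f}x")
# Prints ~83x and ~63x -- far above the advantages XtremeData actually reports.
```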
Related link
- XtremeData now has the obvious URL, but at the time of this posting it’s a work in progress.
Comments
Sorry, you lost me after “no compression”. Honestly, I would be pretty skeptical about their price/performance claims if they are operating over uncompressed data.
The price is what it is, but you can go to their site to look at specs to check whether you find it at all plausible.
The performance claims on the slide I called out aren’t really that dramatic.
Dear Curt,
Good overview; interesting product at a compelling price point. Re. the pricing: you talk about ‘user data’ where the slide deck talks about ‘usable data’. In case of the latter, the price/TB would indeed drop with compression, otherwise it won’t I think. Competitors like Vertica price per TB of user (input) data, not stored/usable data. Maybe good to clarify the difference (if there is any)
regards, Jos
The price is “per TB of user data” – input. dbX 1008 has 96 disk drives, 1TB each or 96TB of raw disk. 30TB of that is for “user data”. Priced at $600K, this is exactly $20K/TB of User Data. I hope that helps.
Thanks, Geno.
Jos — XtremeData and I were talking about user data, so I knew that was what they meant. And “usable data” isn’t a common phrase; perhaps you were mixing it up with “usable space”.
An FPGA can achieve pipeline parallelism that is much more effective than the pipeline parallelism that you see on traditional stored program CPUs.
All traditional CPUs are subject to the von Neumann bottleneck because they are instruction flow processors. In order to attempt to limit the effect of the bottleneck, smart instruction flow processors set up a short pipeline, which is usually, at most, a few processor instructions long. The CPU does “branch prediction” to figure out the “most likely” pipeline, but things get really expensive when this guess is wrong.
A dataflow processor on the other hand doesn’t have registers and it doesn’t do branch prediction. FPGAs are hardware reconfigurable at run time. The chip may be reprogrammed on the fly to handle the data flow requirements.
Because the pipeline is fixed, there is no branch prediction and no copying of data in and out of registers. One low-megahertz FPGA can do the work of dozens or even hundreds of commodity CPUs, because the work is done in a pipeline, in parallel, like an assembly line.
The analogy to an assembly line is very good actually.
In a normal CPU, one worker repeatedly goes to the parts bin for the next part, returns to the workbench, attaches his part, and repeats until he has a finished product. In an FPGA, the worker sits in place and waits for the work-in-progress and his part-to-add to both arrive. He adds his part, and the work-in-progress then flows to the next worker, who does the same. In the same amount of time it took to build one item in the normal CPU, dozens or even hundreds of items are built on the assembly line.
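To make the assembly-line analogy concrete in software terms, here is a toy Python pipeline of my own: each stage does one small fixed operation and hands its result to the next stage. (On a CPU the stages still execute one at a time; on an FPGA each stage is separate hardware, so all stages work on different rows in the same clock cycle.)

```python
def parse(rows):
    # Stage 1: split raw text rows into fields.
    for raw in rows:
        yield raw.split(",")

def filter_stage(rows):
    # Stage 2: apply one fixed predicate (no data-dependent branching to predict).
    for fields in rows:
        if int(fields[1]) > 100:
            yield fields

def project(rows):
    # Stage 3: keep only the column we care about.
    for fields in rows:
        yield fields[0]

raw_rows = ["alice,150", "bob,90", "carol,300"]
print(list(project(filter_stage(parse(raw_rows)))))  # ['alice', 'carol']
```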
Not another MPP offering!
Looks like it has the same low per TB pricing as Dataupia (~$20K/TB), but starts at 8 nodes and 30TB = $600k entry price.
Postgres and FPGAs like Netezza…could be interesting.
Does this happen even when not logically required e.g. for colocated fact:fact joins:
“XtremeData’s DBx of course does complete parallel redistribution of data after every intermediate result set.”
@Paul: Not another MPP offering!
I’m sorry I had to LOL at that one 🙂 Tell you what Paul, you want an SMP offering for 25K up to 1TB (real data, no indexing, uncompressed)? Pull it off our website! (We’re not in Bangalore either btw — sorry but that doesn’t give me a warm feeling if I’m a US customer!)
There is no need for all this h/w acceleration – it’s a software problem, not a silicon one! That’s why no one has been able to come up with anything more interesting than columns, grids or chips in the past decades. These are all one-trick ponies, to rehash my latest posting on http://jeromepineau.blogspot.com
We believe it is both a software and a hardware problem. In fact, an FPGA can process “real SQL” at 10X the rate and 1/3rd the power of a current CPU. CPUs are still needed for many things, but when used properly the FPGA “In Socket Accelerator” is never the bottleneck in the system. dbX is a “balanced system” and can process SQL, move, and load balance data at 1GB/s per node (up to 1024 nodes). This is unheard of in the industry.
Re: Paul J. – if data is already co-located, of course dbX doesn’t move it around needlessly. However, when required, we load balance and redistribute without user intervention.
“load balance data at 1GB/s per node…This is unheard of in the industry.”
So, what’s the difference with Infiniband or 1GbE connecting any other fabric?
Thanks.
Or is it comparable to 10GbE (you use big-B)?
@Geno: “dbX 1008 has 96 disk drives, 1TB each or 96TB of raw disk. 30TB of that is for “user data”
Hang on, you’re saying 66TB of storage is used for non-user data? What is non-user data? Thanks.
I’m going to suggest people watch the “ChalkTalks” on our website for the full technical details: http://www.xtremedata.com. Our CTO, Faisal Shah, will be explaining a lot of the above questions in detail. We have one up there now: “SQL in Silicon” (FPGAs vs CPUs), and the others will be posted in the next 24-48 hours.
In short: we use IB. Our software and hardware know how to utilize this connection to its full extent. Our experience is that CPUs cannot build statistics, move data, and perform “real SQL” at the same time at 1GB/s. Thus, we feel the CPU becomes the bottleneck in every solution. This is likely why everyone focuses on compression, or on algos that “restrict data movement” (for them it is too expensive a task, time-wise).
We have the “SQL In Silicon” In-Socket Accelerator that does this process at full rate. In short, data redistribution is basically “free” in our system. We are not afraid of moving data and do it under the hood so you don’t have to.
IB = 2GBytes/Sec in and out. Very low overhead.
1GE = 1Gbit/Sec… lots of overhead… I think around 80MB/Sec is about the real bandwidth.
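Spelled out with rough numbers (the GigE overhead estimate is Geno’s; the rest is simple arithmetic):

```python
ib_gbytes_per_sec = 2.0                # Geno's InfiniBand figure, per direction
gige_theoretical_mb = 1e9 / 8 / 1e6    # 1 Gbit/s line rate = 125 MB/s before overhead
gige_effective_mb = 80                 # Geno's real-world GigE estimate
print(f"GigE: {gige_theoretical_mb:.0f} MB/s theoretical, ~{gige_effective_mb} MB/s effective")
print(f"IB advantage over effective GigE: ~{ib_gbytes_per_sec * 1000 / gige_effective_mb:.0f}x")
```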
Also: 66TB is for dbX to use. Temp space, RAID copies, other execution engine “database housekeeping”.
back of the napkin:
96 1TB SATA disks =
48TB RAID 10
30TB user data
16TB swap space
8TB OS volume
I say “RAID 10” meaning all data is mirrored to at least one other copy in the cluster.
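A quick sanity check of those napkin numbers against the figures quoted earlier in the thread (96 drives, 30TB of user data, $600K); this is purely illustrative, and the exact RAID layout is something XtremeData says it will discuss with customers:

```python
drives, drive_tb = 96, 1
raw_tb = drives * drive_tb             # 96 TB of raw disk
mirrored_tb = raw_tb / 2               # ~48 TB effective after RAID-10-style mirroring
user_tb = 30                           # rated user data
non_user_tb = raw_tb - user_tb         # the 66 TB of temp space, mirrors, swap, OS, etc.
price_per_user_tb = 600_000 / user_tb  # $20,000 per TB of user data
print(mirrored_tb, non_user_tb, price_per_user_tb)   # 48.0 66 20000.0
```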
The drives are SAS, and we’d be happy to discuss the detailed break down, exact RAID used, why, what features, and even roadmap with customers if they want/need to know.
Ok so, are those puppies yelping in the background during the ChalkTalk lecture?!?
Ok so I watched the ChalkTalk lecture — I’m even more convinced now that hardware is not needed so clearly I totally missed the boat here — it seems to me everything you do on your chip we do in software (how many ops do you burn in silicon?). It’s rare when I feel so retarded 🙂
This raises an interesting question. Are MPP databases by greenplum/netezza/et.al. generally either I/O bound or CPU bound?
It seems that the most dramatic improvements in this field have resulted from increased data throughput.
The point of MPP is to relax an I/O bottleneck that otherwise would be present. Some systems are still I/O bound. Other vendors claim to have gone so far the other way that they’re now CPU-bound. Bandwidth-bound is also a possibility.
The ideal, of course, is to have everything crumble at once, like the one-horse shay of the poem.
My take: Netezza would be I/O bound. Uses 1GE as links, built from a “tree” of Ethernet switches, head node involved in data redistribution across SPUs. If data is not co-located for a given query, data needs to be moved, and the system gets choked by the Ethernet switches and probably the head.
GP: Head involved in all data redistribution – thus CPU bound (Head is the choke point) if you build it with IB.
XtremeData: Data Nodes are peers in the execution. The head is not involved in redistribution. The system has no choke point and is “balanced”, which Curt mentions as the ideal above. At runtime, we do this balancing and redistribution at every step of a query at 1GByte/s/node.
Give us 9 JOINS with 9 different keys. The dbX system will do all the data movement work for you and with almost no impact to query performance.
Again, we are going after the ad hoc analytic market, where this is important: the hardest queries (ad hoc) on the largest data (5TB to 5PB).
(JP: Those aren’t puppies, that is the marker squeaking on the whiteboard)
Geno,
I’m pretty sure that the amount of head-node involvement in data distribution at both Netezza and Greenplum — in currently shipping releases — is either 0, or else so small as not to matter significantly.
Meanwhile, Netezza repeatedly stresses that the FPGA — which handles I/O — is if anything underutilized.
So I’d guess they’re CPU bound.
There certainly are stories of Netezza systems being implemented with lots of underutilized disks, in the more spindles = better performance equation. So in some cases they are indeed I/O-bound. But in others I’d expect CPU is the limit.
I don’t really believe in the system that is perfectly balanced in all parts for all queries. Different queries stress different parts of the system.
Netezza does not use the head node for data redistribution. Data is moved directly between SPUs via the Ethernet switch. In most cases the head node is hardly used at all and is only rarely the bottleneck for any operation.
As far as Netezza being i/o, cpu, or network bound, as a user I cannot directly tell what the limiting factor is, but from what I have been told by engineers within Netezza the most common bottleneck is actual disk i/o. With compression in version 4.5 the increased effective i/o rate has more often moved the limiting factor to be the FPGA itself or in some cases the CPU, especially when the data compresses really well.
I would say that if one of the components of the system is always the bottleneck the system is not well designed. The resource requirements for any given query can vary greatly depending on what the query is doing.
In extreme cases one query might utilize i/o 100% but barely use CPU at all whereas another query hardly uses i/o but 100% utilizes the CPU. Queries which require that a huge fact table be redistributed would be limited by the network. It is all going to depend on what the query is doing. I would expect to see that behavior on any database whether it is Teradata, Netzza, Greenplum, Oracle, or whoever.
The goal should be to design a system so that it achieves the greatest average throughput for a large variety of workloads. The ‘average’ workload however is going to vary greatly from one customer to the next which means that the system will most likely be more often limited by a different component for different customers (or different workloads for the same customer).
@Shawn: “The goal should be to design a system so that it achieves the greatest average throughput for a large variety of workloads. The ‘average’ workload however is going to vary greatly from one customer to the next which means that the system will most likely be more often limited by a different component for different customers (or different workloads for the same customer).”
Sorry for reposting a long quote but this is absolutely key and central to what XSPRADA has been engineering/preaching for years! This is why it’s imperative to have a technology that can support all modalities no matter what the workload/query patterns are! One-trick ponies handling specific workloads for specific use cases are not adequate. You need to have the ability to apply the right technique for any question for any data at any time. Anything less than that will not long be tolerated in the BI world IMHO.
We posted the next “ChalkTalk” today, titled “Unconstrained Data Exploration”. It is fitting and answers some of the open questions related to data redistribution, how we do it, etc.
http://www.xtremedata.com/unconstrained.php
More ChalkTalks coming soon.
If anyone has ideas for future chalk talks.. send me an email: gvalente (at) xtremedata (dot) com
@Geno: So the thing to take away from this interesting virtual WB session is
1. all data movement is at 1GB/sec
2. you’re nodal partition scheme agnostic
3. you’re “model agnostic”
Right?
[…] XtremeData just launched in the new Netezza price range. […]
@Jerome – Yes, I think you got it. Add to #1: not just data movement, but also “real SQL processing” while the data is being moved at that rate.
In summary: this gives 16 nodes (1 tower) the ability to hold 60TB of user data and process SQL at 1TB/min regardless of data partition / data key.
I note that 1 terabyte/minute on 16 cores is a lot like the 1 gigabyte/second/core VectorWise talks about, e.g. http://www.dbms2.com/2009/08/04/vectorwise-ingres-and-monetdb/#comment-133810
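(Spelling that arithmetic out, using Geno’s 16 nodes and 1TB/min figures:)

```python
tb_per_minute, nodes = 1.0, 16
print(f"{tb_per_minute * 1000 / 60 / nodes:.2f} GB/s per node")   # ~1.04 GB/s
```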
They don’t think FPGAs are needed in addition to the cores, however. 😉
Best,
CAM
Having been in the FPGA industry almost since its infancy, I can say that people who need the “BEST” size, weight, and power (SWaP) use FPGAs, especially when it comes to bit, byte, character, string manipulations, or packet processing. Cisco, Nortel, EMC, Moto, Alcatel, GE, and about 20K other customers publicly say that they rely on them for many things that CPUs just can’t do fast enough or inside the SWaP envelope. If you pick “one thing”, then yes, CPUs can approach being the same, but let’s take a full data pipeline example.
CPUs can do JOINs, they can do compression, they can do encryption, and they can do decryption, but how is their performance at all four on streaming data where the CACHE is basically useless? Not very good, to say the least.
An FPGA pipeline could do DECOMPRESS (GZIP-9), DECRYPT (AES-XTS 256-bit, by the way, or elliptic curve), then JOIN, then RECRYPT, then RECOMPRESS at the same throughput, because of the power of pipelining and parallelization of custom reprogrammable silicon. Oh yeah… AND it gives a new result every clock cycle. A large FPGA in this case is a MEN vs BOYS discussion.
We have been doing a webinar series called “Acceleration Academy” for all of 2009. I can say that Intel, SGI, HP, Altera, and many others have been on it to support these claims. Large Tier 1 companies have “Accelerator Strategies” (namely XDI FPGA accelerators and nVidia GPGPUs). In addition, Intel and AMD both created accelerator programs, called QuickAssist and Torrenza, that included our patented Accelerator Technology as one of, if not THE, first members. People who doubt the power of FPGAs should watch this series: http://www.xtremedata.com/accelerationacademy (click on past webinars) or read the hundreds of peer-reviewed papers on the power of FPGA technology.
Additionally, we have over 200 customers that use the same hardware platform as DBX to make their own accelerated appliances (Financial, Genomics, Military Radar, etc.) that all say “X86 + FPGAs” are the way to go… and yesterday TwinFin made the same move that we’ve been advocating since 2004. We have the only Tier 1 approved In-Socket FPGA solution in the world (which is patented), which tightly couples these two technologies together better than any other way. XtremeData Accelerators are approved as an official HP accelerator (see www.hp.com/go/accelerators for our name, white papers, and the HP/Xtreme joint podcast). This status makes our platform mainstream, available with Tier 1 support, and customer-proven time and time again to do its job faster, with less, and in less space than CPU + software alone.
The power of FPGAs: This is one thing that the folks at Netezza and I agree on.
[…] A recent discussion of the use of FPGAs for SQL operations in a post and comment thread around XtremeData’s product launch […]
[…] to imply. But in fact Kickfire just relies on standard chips, even if — like Netezza and XtremeData — Kickfire does rely on less programmer-friendly FPGAs to do some of what most rival vendors […]
Geno:
Do you have any TPC-H benchmarks to substantiate your throughput claims? It’s always good to see a reputable independent source that performs consistent tests across multiple competing platforms. Without it, anybody can make any claim (e.g. Oracle).
Huh?
TPC-H is a joke, and Oracle is one of the chief perpetrators of same.
CAM