Teradata hardware strategy and tactics
In my opinion, the most important takeaways about Teradata’s hardware strategy from the Teradata Partners conference last week are:
- Teradata’s future lies in solid-state memory. That’s in line with what Carson Schmidt told me six months ago.
- To Teradata’s surprise, the solid-state future is imminent. Teradata is 6-9 months further along with solid-state drives (SSD) than it thought a year ago it would be at this point.
- Short-term, Teradata is going to increase the number of appliance kinds it sells. I didn’t actually get details on anything but the new SSD-based Blurr, but it seems there will be others as well.
- Teradata’s eventual future is to mix and match parts (especially different kinds of storage) in a more modular product line. Teradata Virtual Storage is of pretty limited value otherwise. I probably believe in this modular future more strongly than Teradata itself does, because I think it will meet users’ needs more effectively than relying strictly on fixed appliance configurations.
In addition, some non-SSD componentry tidbits from Carson Schmidt include:
- Teradata really likes Intel’s Nehalem CPUs, with special reference to multi-threading, the QuickPath interconnect, and the integrated memory controller. Obviously, Nehalem-based Teradata boxes should be expected in the not-too-distant future.
- Teradata really likes Nehalem’s successor Westmere too, and expects to be pretty fast to market with it (faster than with Nehalem) because Nehalem and Westmere are plug-compatible in motherboards.
- Teradata will go to 10-gigabit Ethernet for external connectivity on all its equipment, which should improve load performance.
- Teradata will also go to 10-gigabit Ethernet to play the Bynet role on appliances. Tests are indicating this improves query performance.
- What’s more, Teradata believes there will be no practical scale-out limitations with 10-gigabit Ethernet.
- Teradata hasn’t decided yet what to do about 2.5” SFF (Small Form Factor) disk drives, but is leaning toward adopting them. Benefits would include lower power consumption and smaller cabinets.
- Also on Carson’s list of “exciting” future technologies is SAS 2.0, which at 6 gigabits/second doubles the I/O bandwidth of SAS 1.0.
- Carson is even excited about removing universal power supplies from the cabinets, increasing space for other components.
- Teradata picked Intel’s Host Bus Adapters for 10-gigabit Ethernet. The switch supplier hasn’t been determined yet.
Let’s get back now to SSDs, because over the next few years they’re the potential game-changer. The big news on SSDs is that after last year’s Teradata Partners conference, a stealth supplier* introduced itself and convinced Teradata it offers really great SSD technology. For example, not a single SSD it has provided Teradata has ever failed. (In hardware, that is. There have of course been firmware bugs, suitably squashed.) I think SSD performance is also exceeding Teradata’s expectations. This supplier is where the 6-9 month time-to-market gain comes from.
*Based on how often the concept of “stealth” and “name is NDAed” came up, I do not believe this is the SSD company another vendor told me about that is going around claiming it has a Teradata relationship.
Teradata SSD highlights include:
- I/O speeds on “random medium blocks” are 520 megabytes/second, vs. 15 MB/second on Teradata’s fastest disks. And that figure is limited by SAS 1.0, load-balanced across two devices, not by the SSD hardware itself. (2 x 300+ MB/sec turns out to be 520 MB/sec in this case; see the back-of-the-envelope check after this list.) No wonder Carson is excited about SAS 2.0.
- Teradata is using SAS interfaces for its SSDs, and believes that’s unusual, in that other companies are using SATA or Fibre Channel.
- Never having had a part fail, Teradata has no real basis to make MTTF (Mean Time To Failure) estimates for its SSDs.
- Teradata’s SSD appliance design includes no array controllers. The biggest reason is that right now array controllers can’t keep up with the SSDs’ speed.
- In its SSD appliance, Teradata has abandoned RAID, doing mirroring instead via a DBMS feature called Fallback that’s been around since Teradata’s earliest days. (However, unlike Oracle in Exadata, Teradata continues to use RAID for disks.)
- Useful life for Teradata’s SSDs is estimated at 5-7 years.
- Teradata’s SSDs are SLC (Single-Level Cell), as opposed to MLC (Multi-Level Cell).
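For the curious, here’s a quick back-of-the-envelope check, in Python, of the interface math in the I/O bullet above. The SAS line rates and 8b/10b encoding overhead are standard published figures I’m assuming here, not anything Teradata shared; only the 520 MB/s and 15 MB/s numbers come from the briefing.

```python
# Back-of-the-envelope check on the SAS 1.0 bottleneck described above.
# SAS line rates and 8b/10b encoding efficiency are assumed standard figures;
# only the 520 MB/s and 15 MB/s numbers come from the post itself.

SAS1_LINE_RATE_GBPS = 3.0      # SAS 1.0 line rate per link, gigabits/second
SAS2_LINE_RATE_GBPS = 6.0      # SAS 2.0 doubles the line rate
ENCODING_EFFICIENCY = 8 / 10   # 8b/10b encoding: 8 payload bits per 10 wire bits

def usable_mb_per_sec(line_rate_gbps: float) -> float:
    """Approximate usable payload bandwidth of one SAS link, in MB/s."""
    return line_rate_gbps * 1e9 * ENCODING_EFFICIENCY / 8 / 1e6

sas1_link = usable_mb_per_sec(SAS1_LINE_RATE_GBPS)   # ~300 MB/s
sas2_link = usable_mb_per_sec(SAS2_LINE_RATE_GBPS)   # ~600 MB/s

observed_ssd_mb = 520   # MB/s on "random medium blocks", per the post
fastest_disk_mb = 15    # MB/s on Teradata's fastest disks, per the post

print(f"SAS 1.0, one link:           ~{sas1_link:.0f} MB/s usable")
print(f"SAS 1.0, two balanced links: ~{2 * sas1_link:.0f} MB/s ceiling")
print(f"Observed SSD throughput:      {observed_ssd_mb} MB/s "
      f"(~{observed_ssd_mb / fastest_disk_mb:.0f}x the fastest disks)")
print(f"SAS 2.0, two balanced links: ~{2 * sas2_link:.0f} MB/s ceiling")
```

The point of the arithmetic: two load-balanced SAS 1.0 links top out around 600 MB/s of payload, so the observed 520 MB/s is close to the interface ceiling, and doubling the line rate with SAS 2.0 roughly doubles the headroom.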
Comments
Curt,
Great insights into the recent TD announcements. I haven’t seen any of this mentioned elsewhere, as usual.
Two questions occur to me:
First, if 10Gb ethernet improves query time versus Bynet, why on earth have they persisted with Bynet for all this time? Given the amount of data that TD shuffles between AMPs I would think this is their top priority. I wonder if the previous disk subsystems were the limiter, as opposed to the interconnect fabric.
Second, if the SSD I/O is limited by SAS 1.0 speeds, why are they even using SAS at all? The flash cards in Exadata 2 are connected via PCI Express (AFAIK), which delivers 4Gb/s+. I’m out of my depth here a little bit but, simplistically, if Oracle can do it, why can’t Teradata?
Joe
Joe,
For Teradata, there’s a lag between “We think X is the right technology to use” and “We’re sure enough to use X.” Perhaps not coincidentally, few Teradata users are ever dissatisfied about much except price or, sometimes, complexity.
Open question:
Doesn’t SSD technology greatly expand the size that SMP DW servers can scale to? Especially for low- to mid-range data warehouses? And with SSD prices dropping, won’t it soon make sense to spend dollars on an SSD SAN versus MPP servers?
For example: how would an 8-CPU/6-core server (48 cores) running SMP database software (take your pick: Oracle, DB2, etc.) with 256 GB of memory (total price ~$80K @ HP.com) and a 5TB SSD SAN (~$200K RAMSAN) compete with MPP players like Teradata or Netezza? Remember the lower energy costs of SSD and the OLTP functionality of SMP DB software, which gives it more ODS flexibility. Again, for the low to mid-range DW it seems like a viable alternative?
1. Kickfire claims to give great DW performance on any schema up to at least 5 TB of data on a single box for <$200K. And that 5 TB will go up a lot as they get compression ironed out. My point: there are lots of good alternatives.
2. You can't just add sufficiently well-performing long-running queries into a good OLTP DBMS and say you have a good mixed-use system. E.g., workload management gets a lot harder. A DW should ideally have more optimistic locking than an OLTP system. Etc.
3. In theory, you can break tables into sufficiently small pieces and store them in sufficiently different ways that you can do everything with great performance and reliability in one system. That's theory.
4. The cost of hardware is often a small factor when compared with the cost of DBMS licenses. Just look at Oracle's software cost per core (license and maintenance). Whether you're paying $2,500/core for the hardware or (more likely) a lot less is, by comparison, pretty irrelevant. (A rough sketch of the arithmetic follows below.)
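To put point 4 in rough numbers, here is a small sketch using the hardware figures from the question above. The per-core license figure is purely an illustrative placeholder, not a quoted price from any vendor.

```python
# Rough illustration of point 4: per-core hardware cost vs. per-core DBMS
# license cost. Hardware figures come from the question above; the license
# figure is a hypothetical placeholder, not a quoted price.

cores = 48                    # 8 sockets x 6 cores, per the question
server_cost = 80_000          # ~$80K server, per the question
ssd_san_cost = 200_000        # ~$200K SSD SAN (RAMSAN), per the question

hardware_per_core = (server_cost + ssd_san_cost) / cores

license_per_core = 25_000     # illustrative commercial DBMS license + maintenance

print(f"Hardware (server + SSD SAN) per core: ~${hardware_per_core:,.0f}")
print(f"DBMS license per core (illustrative): ~${license_per_core:,.0f}")
print(f"License-to-hardware ratio:            ~{license_per_core / hardware_per_core:.1f}x")
```

Even with generous hardware numbers, the license line dominates, which is the point being made above.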
FYI…
Just following up on my second point regarding the use of a SAS disk interface versus PCIe, a new post from James Hamilton (one of the genii behind AWS) appears to concur.
“Expect flash to stay strong and relevant for the near term and expect it to be PCIe connected rather than SATA attached.”
http://perspectives.mvdirona.com/2009/10/26/AndyBechtolsheimAtHPTS2009.aspx
Admittedly this is in reference to a presentation given by a Sun employee. Nevertheless, disk interfaces are already too slow before TD even releases its flash appliance, and that seems like a problem to me.
Joe
@Joe Harris
If you are reading between the lines here, you will see that Teradata’s I/O layer is very inefficient for a data warehouse. If you look around, almost every DW DBMS vendor is focusing on maximizing sequential I/O so that they can get the most disk bandwidth out of each HDD. The quote of “15 MB/second on their fastest disks” is unfortunate, as those drives are physically capable of doing well over 100MB/s of sequential I/O. This likely has to do with the small random read pattern that is common with Teradata. Michael McIntire commented on this back in April. This is also the reason that Teradata’s systems have a very large number of HDDs in them.

For instance, the 5555 series generally consists of 3/5 cliques, which is 3 dual-socket quad-core Intel Harpertown hosts (plus one spare host) and 5 fibre channel arrays each with 60 HDDs, so the ratio of HDDs to hosts is 100:1. The capacity of these drives has been 146GB for quite some time, and I hear they are starting to now offer the 300GB drives. Compare this to other vendors (say, Oracle Exadata) where the drive capacities are 600GB SAS (or 2TB Midline SAS) and the scan rates per disk are around 125MB/s – over 100MB/s more per HDD than Teradata.

In comparison, a single Sun Oracle DB Machine has a physical scan capacity of 21GB/s with 168 SAS HDDs, and a Teradata 12-node, 1200-drive 5555 system has a physical scan capacity of 18GB/s (1200 * 15MB/s), so what Exadata can do in 1 rack takes Teradata on the order of 8 or so racks of equipment. Knowing this, it now makes sense why Teradata is so “excited” about using SSD. By doing so, it removes the “latency penalty” for random I/O operations and has the opportunity to drastically shrink its data center footprint, though of course at the increased cost of SSD vs HDD.
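For what it’s worth, here is a short script that simply reproduces the scan-bandwidth arithmetic in the comment above; every input is a figure quoted in that comment, not an independently verified spec.

```python
# Reproduce the scan-bandwidth arithmetic from the comment above. Every
# input below is a figure quoted in the comment, not a verified spec sheet.

# Teradata 5555 clique, per the comment
clique_hosts = 3                 # active hosts per clique (spare excluded)
clique_drives = 5 * 60           # five fibre channel arrays of 60 HDDs

# Full-system comparison, per the comment
td_drives = 1200
td_mb_per_drive = 15             # MB/s effective scan rate per Teradata drive
exadata_drives = 168
exadata_mb_per_drive = 125       # MB/s scan rate per Exadata SAS drive

td_scan_gb = td_drives * td_mb_per_drive / 1000
exadata_scan_gb = exadata_drives * exadata_mb_per_drive / 1000

print(f"HDD:host ratio per clique:  {clique_drives // clique_hosts}:1")
print(f"Teradata 5555, {td_drives} drives: ~{td_scan_gb:.0f} GB/s physical scan")
print(f"Exadata rack, {exadata_drives} drives:   ~{exadata_scan_gb:.0f} GB/s physical scan")
print(f"Teradata drives needed to match one Exadata rack: "
      f"~{exadata_scan_gb * 1000 / td_mb_per_drive:.0f}")
```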
[…] summary of some of the issues and opportunities in an exceptionally useful post from Curt Monash here. But one hopes Teradata rationalizes the bewildering story, the wild mix of OSs, interconnects […]
[…] Teradata, Oracle have both signaled moving to more modularity […]
[…] in its own software. (Other examples of this strategy would be Vertica, Oracle Exadata/ASM, and Teradata Fallback.) Prior to nCluster 4.0, this caused a problem, in that the block sizes for mirroring were so large […]
[…] Teradata has made large strides in making solid-state memory useful […]
Can I use Teradata for an OLTP system?
Please give me some pros and cons of OLTP on Teradata.
Teradata is not optimized for cost-effective OLTP performance. And third-party OLTP apps generally don’t run on Teradata. On the other hand, formally it has everything you’d need to do OLTP.
Bottom line: If you want to do a little OLTP on a Teradata box you buy for other reasons, fine. But if your main goal is OLTP, it’s the wrong thing to get.
I should add that Teradata’s architecture is somewhat more OLTP-like than newer analytic DBMS. E.g., it is designed for random more than sequential reads, it assumes you’ll probably normalize your data, you can buy Teradata boxes with lots of small, fast disks, and so on.
[…] and Teradata’s beliefs about the importance of solid-state […]