Hardware and storage notes
My California trip last week focused mainly on software — duh! — but I had some interesting hardware/storage/architecture discussions as well, especially in the areas of:
- Rack- or data-center-scale systems.
- The real or imagined demise of Moore’s Law.
- Flash.
I also got updated as to typical Hadoop hardware.
If systems are designed at the whole-rack level or higher, then there can be much more flexibility and efficiency in terms of mixing and connecting CPU, RAM and storage. The Google/Facebook/Amazon cool kids are widely understood to be following this approach, so others are naturally considering it as well. The most interesting of several mentions of that point came when I got the chance to talk with Berkeley computer architecture guru Dave Patterson, who’s working on plans for 100-petabyte/terabit-networking kinds of systems, for usage after 2020 or so. (If you’re interested, you might want to contact him; I’m sure he’d love more commercial sponsorship.)
One of Dave’s design assumptions is that Moore’s Law really will end soon (or at least greatly slow down), if by Moore’s Law you mean that every 18 months or so one can get twice as many transistors onto a chip of the same area and cost as one could before. However, while he thinks that applies to CPU and RAM, Dave thinks flash is an exception. I gathered that he thinks the power/heat reasons for Moore’s Law to end will be much harder to defeat than the other ones; note that flash, because of what it’s used for, has vastly less power running through it than CPU or RAM do.
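For concreteness, here’s a back-of-the-envelope sketch of what that doubling cadence implies; the 18-month period is the one cited above, and the six-year horizon is just an illustrative choice of mine.

```python
# Sketch of the "twice the transistors every ~18 months" reading of Moore's Law.
# Purely illustrative arithmetic; the 6-year horizon is an arbitrary example.

def moores_law_factor(years: float, doubling_period_years: float = 1.5) -> float:
    """Multiplicative growth in transistor count per unit area/cost after `years`."""
    return 2 ** (years / doubling_period_years)

# If the cadence held for six more years, a same-cost chip would have roughly:
print(f"{moores_law_factor(6):.0f}x the transistors")  # ~16x
```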
Otherwise, I didn’t gain much new insight into actual flash uptake. Everybody thinks flash is or soon will be very important; but in many segments, folks are trading off disk vs. RAM without worrying much about the intermediate flash alternative.
I visited two Hadoop distribution vendors this trip, namely the ones who are my clients – Cloudera and MapR. I remembered to ask one of them, Cloudera, about typical Hadoop hardware, and got answers that sounded consistent with hardware trends Hortonworks told me about last August. The story is, more or less:
- The default assumption remains $20-30K/node, 2 sockets, 12 disks. (Edit: See lively price discussion in the comments below.)
- Most hardware vendors have standard/default Hadoop boxes by now, and in many cases customers just buy what’s on offer.
- The aforementioned disks sometimes get up to 4 terabytes now. (A rough capacity sketch follows this list.)
- 128GB is now the norm for RAM. 256GB is common. Higher amounts are seen, up to – in rare cases – 2-4 TB.
- Flash is of interest, but isn’t being demanded much yet. This could change when flash’s storage density matches disk’s.
- Flash interest is highest for Impala.
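To put that node spec in perspective, here’s a minimal capacity sketch. The 12 disks and 4TB-per-disk figures come from the list above, the 3x replication is HDFS’s default, and the 20-nodes-per-rack figure is just an illustrative assumption of mine.

```python
# Back-of-the-envelope HDFS capacity math for the node spec above:
# 12 disks/node at up to 4 TB/disk, with HDFS's default 3x replication.
# The 20-nodes-per-rack figure is an illustrative assumption, not from the post;
# real usable capacity is lower still once temp/shuffle space is set aside.

DISKS_PER_NODE = 12
TB_PER_DISK = 4
REPLICATION = 3          # HDFS default
NODES_PER_RACK = 20      # assumption for illustration

raw_per_node = DISKS_PER_NODE * TB_PER_DISK          # 48 TB raw
usable_per_node = raw_per_node / REPLICATION         # ~16 TB of unique data
usable_per_rack = usable_per_node * NODES_PER_RACK   # ~320 TB

print(f"Raw per node:    {raw_per_node} TB")
print(f"Usable per node: {usable_per_node:.0f} TB")
print(f"Usable per rack: {usable_per_rack:.0f} TB")
```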
Cloudera suggested that the larger amounts of RAM tend to be used when customers frame the need as putting certain analytic datasets entirely in RAM. This rings true to me; there’s lots of evidence that users think that way, and not just in analytic cases. This is probably one of the reasons that they often jump straight from disk to RAM without fully exploring the opportunities of flash.
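As a minimal sketch of that “does it fit entirely in RAM?” framing: the 128GB/node figure comes from the list above, while the fraction reserved for the OS and Hadoop services and the example dataset and node counts are placeholders of my own.

```python
# Sketch of the "put this analytic dataset entirely in RAM" sizing question.
# 128 GB/node is the norm cited above; the 25% reservation for OS/daemons
# and the example dataset/node counts are illustrative assumptions.

def fits_in_ram(dataset_tb: float, nodes: int, ram_gb_per_node: int = 128,
                reserved_fraction: float = 0.25) -> bool:
    """True if the dataset fits in the cluster's unreserved RAM."""
    usable_gb = nodes * ram_gb_per_node * (1 - reserved_fraction)
    return dataset_tb * 1024 <= usable_gb

print(fits_in_ram(dataset_tb=2, nodes=30))  # ~2 TB vs ~2.8 TB usable -> True
print(fits_in_ram(dataset_tb=5, nodes=30))  # doesn't fit -> False
```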
One last thing — the big cloud vendors are at least considering the use of their own non-Intel chip designs, which might be part of the reason for Intel’s large Hadoop investment.
Comments
[…] Cloudera and Intel will of course talk a lot about adapting to computer architecture opportunities and trends. I expect that part to go well, because Intel has strong relationships of that kind even with […]
$20 to $30k per node –> is that a typo?
No, it’s not. Enterprises buy much more expensive gear than Google/Facebook/et al.
I also think that’s a typo. Dell’s R720XD and HP’s DL380 servers can be had for $3,000 to $5,000 a crack without huge volumes. See http://www.informationweek.com/big-data/hardware-architectures/10-hadoop-hardware-leaders/d/d-id/1234772?image_number=1
For the R720XD, $3,000 gets you 1 CPU, 2x500GB of HD, and 2GB of memory.
2 CPUs, 128GB of memory, and 12x3TB disks move the non-discounted price to $20K. So I don’t think that’s a typo.
There are published prices and then there are street prices for enterprises buying across all data-center needs. With 12 3TB or 4TB drives you’re cracking five figures, but those filling racks and data centers get steep discounts that you don’t see online.
Just priced one with the “discount” Dell offered me – 2×8 cores, 64GB RAM, 12 3TB 7200 RPM SATA disks – $13k. I can’t imagine paying $20k to $30k for a Hadoop node. But then I don’t buy much HW on the retail market or for a low-volume purchaser.
I still think $20,000 is too high, but I just corrected/revised the upper end of the price range I quoted in my slide show to $15,000. This reflects the inclusion of 12 high-capacity drives. It appears the $2,500 to $5,000 range a couple of sources cited related to management/NameNodes with just two to six drives.
Proper price range for those dual socket 12-disk servers (depending on the CPU speed, disk and RAM sizes) is from ~$6,500 (with 12x2TB SAS drives and 128GB RAM and 2x10GbE) to ~$10,000 (upgraded with 12x4TB SAS drives and 256GB RAM and faster CPU and dual SSD for OS drive).
Above that you’ll be paying for brand name – not for speed or quality/reliability.
Thanks for the info! I don’t know how to reconcile everybody’s different figures at this time. To put it mildly, vanilla hardware is not one of my specialties …
Master/NameNodes and JobTracker nodes (if separate from the NameNode) don’t require 12 disks, so I kept my quoted range at $2,500 (high-volume white box) to $15,000 (12-disk, semi-loaded brand-name box).
One note — we’re talking RAM amounts that are greater than Mark used.
In my experience the middle of the road spec right now is:
– 2 sockets, 8 cores / socket
– 2 x 1GigE bonded
– 12 spindles, SATA, 2-3TB
– 128GB RAM
I don’t want to throw out a specific cost per node because I’ll surely get it wrong. Let’s say something like $7k – $14k depending on the vendor and the discounts of that particular customer.
The most popular up-spec option is RAM; it’s cheap to go to 256GB or 384GB. The second most popular would be to spring for 10GigE. That’s still a bit rich for my blood, but a sizable minority are opting for it because a) they’re getting good volume discounts from the networking vendors and b) there’s a sense of future-proofing (i.e., assuming that when they expand the cluster in a year or two, the port cost will be lower).
Charles — thanks!
Everybody — let’s go with his figures. 🙂
Actually, dual 10GbE (embedded 10GBase-T) only costs $100 extra on Supermicro motherboards.
Switch prices for 10GbE also went down recently.
@Igor – yes, the per-node cost of 10GbE is not bad; it’s the per-port cost in the top-of-rack switches that tends to slow its adoption. Customers that have great discounts with Cisco, Arista, etc. will more often opt for 10GbE.
Everyone should make paying for the fastest network they can afford a priority – even InfiniBand. The shuffle and sort phases can move the bottleneck to the network.
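To make that concrete, here’s a rough per-node bandwidth comparison for the “middle of the road” spec above; the ~100 MB/s of sequential throughput per SATA spindle is my own ballpark assumption.

```python
# Rough per-node bandwidth comparison for the middle-of-the-road spec above:
# 12 SATA spindles vs. 2x1GigE bonded vs. a 10GigE upgrade.
# The ~100 MB/s sequential throughput per spindle is a ballpark assumption.

DISKS = 12
MB_PER_S_PER_DISK = 100                      # assumed sequential throughput

disk_bw = DISKS * MB_PER_S_PER_DISK          # ~1200 MB/s aggregate disk I/O
bonded_1gige = 2 * 1000 / 8                  # 2x1GigE  ~ 250 MB/s
ten_gige = 10 * 1000 / 8                     # 10GigE   ~ 1250 MB/s

print(f"Disk aggregate: ~{disk_bw} MB/s")
print(f"2x1GigE bonded: ~{bonded_1gige:.0f} MB/s")  # shuffle can go network-bound
print(f"10GigE:         ~{ten_gige:.0f} MB/s")      # roughly matches the disks
```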
[…] in Moore’s Law continuing for at least 5 more years. (I think that’s a near-consensus; the 2020s, however, are another […]