On May 28, 2009, at 10:32 AM, Ian Soboroff wrote:

Brian Bockelman <[email protected]> writes:

Despite my trying, I've never been able to come even close to pegging
the CPUs on our NN.

I'd recommend going for the fastest dual-cores which are affordable --
latency is king.

Clue?

Surely the latencies in Hadoop that dominate are not cured with faster
processors, but with more RAM and faster disks?

I've followed your posts for a while, so I know you are very experienced
with this stuff... help me out here.

Actually, that's more of a gut feeling than an informed decision. Because the locking is rather coarse-grained, having many CPUs isn't going to win anything -- I'd rather have any CPU-related portions go as fast as possible. Under the highest load, I think we've been able to get up to 25% CPU utilization; thus, I'm guessing any CPU-related improvements will come from faster cores, not more cores.
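
To put rough numbers on that gut feeling, here is a minimal sketch using Amdahl's law. It is only a back-of-the-envelope model, not a measurement of the namenode: the function names are made up for illustration, and the 0.9 "fraction of work done while holding the coarse lock" is a hypothetical value, not something we have profiled.

```python
def amdahl_speedup(serial_fraction: float, cores: int) -> float:
    """Upper bound on speedup from `cores` CPUs when `serial_fraction`
    of the work is serialized (e.g. done under one coarse-grained lock)."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)


def speedup_table(serial_fraction: float, core_counts=(1, 2, 4, 8)) -> dict:
    """Map each core count to its Amdahl bound for the given serial fraction."""
    return {n: amdahl_speedup(serial_fraction, n) for n in core_counts}


if __name__ == "__main__":
    # 0.9 is an assumed, illustrative serial fraction (hypothetical).
    for n, bound in speedup_table(0.9).items():
        print(f"{n} cores: at most {bound:.2f}x")
```

Under that assumption, going from 1 to 8 cores buys about 10% at best, whereas a faster core speeds up the serialized portion as well.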

For my cluster, if I had a lot of money, I'd spend it on a hot-spare machine. Then, I'd spend it on upgrading the RAM, followed by disks, followed by CPU.

Then again, for the cluster in the original email, I'd save money on the namenode and buy more datanodes. We've got about 200 nodes and probably have a comparable NN.

Brian
