Re: hadoop1.2.1 speedup model

2013-09-07 Thread Vinod Kumar Vavilapalli
Clearly your input size isn't changing. And depending on how the blocks are distributed across the nodes, there could be DataNode/disk contention. A better way to model this is to scale the input data linearly as well: more nodes should process more data in the same amount of time.

Thanks,
+Vinod
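The weak-scaling check Vinod suggests can be sketched as follows. The runtimes and input sizes below are illustrative placeholders (not measurements from this thread): grow the input linearly with the node count and see how far the runtime drifts from flat.

```python
# Weak-scaling sketch: double the nodes AND double the data, then check
# whether runtime stays roughly constant. All numbers are hypothetical.

# (nodes, input_size_gb, runtime_s) for imagined runs
runs = [(1, 10, 600.0), (2, 20, 615.0), (4, 40, 640.0), (8, 80, 700.0)]

t1 = runs[0][2]  # single-node baseline runtime
for nodes, size_gb, t in runs:
    weak_eff = t1 / t  # 1.0 means perfect weak scaling
    print(f"{nodes} nodes, {size_gb} GB: runtime={t:.0f}s, "
          f"weak efficiency={weak_eff:.2f}")
```

If weak efficiency stays near 1.0 as you add nodes, the cluster is scaling well; a steady drop usually points to shuffle, scheduling, or disk-contention overheads growing with cluster size.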

Re: hadoop1.2.1 speedup model

2013-09-07 Thread 牛兆捷
But I still want to find the most efficient assignment while scaling both data and nodes as you said. For example, in my result, 2 nodes is the best, and 8 is better than 4. Why is the speedup sub-linear from 2 to 4 but super-linear from 4 to 8? I find it hard to model this result. Can you give me some hint about this?
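The sub-/super-linear pattern described above can be made concrete by comparing per-step speedup ratios rather than only the overall speedup. The runtimes below are hypothetical, chosen only to reproduce the shape of the reported result (best efficiency at 2, sub-linear 2→4, super-linear 4→8):

```python
# Hypothetical strong-scaling runtimes (seconds) per node count.
# Not real measurements; shaped to mimic the behavior in the thread.
runtimes = {1: 1000.0, 2: 480.0, 4: 300.0, 8: 130.0}

base = runtimes[1]
nodes = sorted(runtimes)
for n in nodes:
    speedup = base / runtimes[n]
    efficiency = speedup / n
    print(f"{n} nodes: speedup={speedup:.2f}, efficiency={efficiency:.2f}")

# Step ratio when doubling nodes: 2.0 = linear, <2 sub-linear, >2 super-linear.
for prev, cur in zip(nodes, nodes[1:]):
    ratio = runtimes[prev] / runtimes[cur]
    print(f"{prev}->{cur} nodes: step ratio={ratio:.2f}")
```

A step ratio above 2 when doubling nodes (super-linear speedup) often indicates a resource threshold being crossed, e.g. the working set fitting in aggregate memory or fewer map waves per node, while ratios below 2 reflect the usual coordination and shuffle overheads.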