Thanks, Bobby.
I will run the experiment more times.
Are there any more fine-grained profiling tools for each task? For example,
CPU utilization and disk and network I/O per task.
2013/9/9 Robert Evans
How many times did you run the experiment at each setting? What is the
standard deviation for each of these settings? It could be that you are
simply running into the error bounds of Hadoop. Hadoop is far from
consistent in its performance. For our benchmarking we typically will
run the test 5
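
A minimal sketch of the standard-deviation check asked about above, assuming you have recorded the wall-clock time of several repeated runs at each setting. The function names and the sample numbers are hypothetical illustrations, not measurements from this thread, and the "within k standard deviations" rule is only a rough heuristic, not a formal significance test.

```python
import statistics


def summarize_runs(runtimes_s):
    """Return (mean, sample standard deviation, coefficient of variation)."""
    mean = statistics.mean(runtimes_s)
    stdev = statistics.stdev(runtimes_s)  # sample std dev, needs >= 2 runs
    return mean, stdev, stdev / mean


def ranges_overlap(runs_a, runs_b, k=1.0):
    """True if the mean +/- k*stdev intervals of two settings overlap."""
    mean_a, sd_a, _ = summarize_runs(runs_a)
    mean_b, sd_b, _ = summarize_runs(runs_b)
    return abs(mean_a - mean_b) <= k * (sd_a + sd_b)


# Hypothetical wall-clock times in seconds (assumed values for illustration).
runs_4_nodes = [100.0, 110.0, 90.0, 105.0, 95.0]
runs_8_nodes = [98.0, 92.0, 108.0, 102.0, 100.0]

if __name__ == "__main__":
    for label, runs in (("4 nodes", runs_4_nodes), ("8 nodes", runs_8_nodes)):
        mean, sd, cv = summarize_runs(runs)
        print(f"{label}: mean={mean:.1f}s stdev={sd:.2f}s cv={cv:.1%}")
    if ranges_overlap(runs_4_nodes, runs_8_nodes):
        print("Difference is within run-to-run noise; collect more runs.")
```

If the intervals for two settings overlap, the apparent difference between them may just be run-to-run variation.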
But I still want to find the most efficient assignment and scale both data
and nodes as you said. For example, in my result, 2 is the best, and 8 is
better than 4.
Why is it sub-linear from 2 to 4 but super-linear from 4 to 8? I find it
hard to model this result. Can you give me some hint about thi
Clearly your input size isn't changing. And depending on how the data is
distributed on the nodes, there could be DataNode/disk contention.
A better way to model this is to scale the input data linearly as well. More
nodes should process more data in the same amount of time.
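
To make the two ways of measuring concrete, here is a rough sketch of the arithmetic, assuming you have one representative runtime per cluster size. With a fixed input, the ideal runtime shrinks in proportion to the node count; with input scaled linearly with nodes, the ideal runtime stays constant. The function names and all numbers are hypothetical, not results from this thread.

```python
def speedup(t_base, t_n):
    """Speedup of a run taking t_n seconds relative to a baseline run."""
    return t_base / t_n


def fixed_input_efficiency(n_base, t_base, n, t_n):
    """Fixed input size: 1.0 is linear, below is sub-linear, above is super-linear."""
    return speedup(t_base, t_n) / (n / n_base)


def scaled_input_efficiency(t_base, t_n):
    """Input grown in proportion to nodes: ideal runtime is unchanged, so 1.0 is ideal."""
    return t_base / t_n


if __name__ == "__main__":
    # Hypothetical runtimes in seconds for a fixed input (assumed values).
    print(fixed_input_efficiency(2, 400.0, 4, 250.0))  # below 1.0: sub-linear
    print(fixed_input_efficiency(4, 250.0, 8, 100.0))  # above 1.0: super-linear
    # Hypothetical runtimes when the input doubles along with the nodes.
    print(scaled_input_efficiency(400.0, 500.0))
```

Comparing each step against its own baseline (2 to 4, then 4 to 8) makes it easier to see which step departs from linear scaling.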
Thanks,
+Vinod
On
From 2 to 4, the performance increases sub-linearly; however, from 4 to 8 it
seems super-linear.
Is it caused by some disk contention bottleneck?
2013/9/6 牛兆捷
> Hi all:
>
> I varied the number of compute nodes in the cluster and got the speedup
> results in the attachment.
>
> In my mind, there are three typ