Hi Tsai,

Could you share more information about the machine you used and the
training parameters (runs, k, and iterations)? It can help solve your
issues. Thanks!
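
One thing that stands out in the log you pasted: each task is reported as
serialized to 38765155 bytes (roughly 37 MB). Below is a rough
back-of-the-envelope sketch of how much task data the driver would have to
serialize and ship before the executors can start. It assumes all 127 tasks
are about the same size; the log only shows three of them, so treat the
total as an estimate, not a measurement.

```python
# Figures are taken from the quoted spark-shell log and stage UI.
# Assumption: every task serializes to the same size as the three shown.
SERIALIZED_TASK_BYTES = 38765155  # "Serialized task 3.0:124 as 38765155 bytes"
NUM_TASKS = 127                   # tasks shown as running in the stage UI


def total_task_bytes(num_tasks: int, bytes_per_task: int) -> int:
    """Total bytes serialized if every task has the same size."""
    return num_tasks * bytes_per_task


total = total_task_bytes(NUM_TASKS, SERIALIZED_TASK_BYTES)
print(f"per task:  {SERIALIZED_TASK_BYTES / 2**20:.1f} MiB")
print(f"all tasks: {total / 2**30:.2f} GiB")
```

If that estimate holds, the size of the serialized tasks may be part of the
delay, so it would be worth checking alongside the parameters above.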

Best,
Xiangrui

On Sun, Mar 23, 2014 at 3:15 AM, Tsai Li Ming <mailingl...@ltsai.com> wrote:
> Hi,
>
> At the reduceByKey stage, it takes a few minutes before the tasks start
> working.
>
> I have set -Dspark.default.parallelism=127 (n-1 cores).
>
> CPU/Network/IO is idling across all nodes when this is happening.
>
> There is nothing notable in the master log file. From the spark-shell:
>
> 14/03/23 18:13:50 INFO TaskSetManager: Starting task 3.0:124 as TID 538 on 
> executor 2: XXX (PROCESS_LOCAL)
> 14/03/23 18:13:50 INFO TaskSetManager: Serialized task 3.0:124 as 38765155 
> bytes in 193 ms
> 14/03/23 18:13:50 INFO TaskSetManager: Starting task 3.0:125 as TID 539 on 
> executor 1: XXX (PROCESS_LOCAL)
> 14/03/23 18:13:50 INFO TaskSetManager: Serialized task 3.0:125 as 38765155 
> bytes in 96 ms
> 14/03/23 18:13:50 INFO TaskSetManager: Starting task 3.0:126 as TID 540 on 
> executor 0: XXX (PROCESS_LOCAL)
> 14/03/23 18:13:50 INFO TaskSetManager: Serialized task 3.0:126 as 38765155 
> bytes in 100 ms
>
> But it stops there for some significant time before any movement.
>
> In the stage detail of the UI, I can see that there are 127 tasks running, but
> the duration of each is at least a few minutes.
>
> I'm working off local storage (not hdfs) and the kmeans data is about 6.5GB 
> (50M rows).
>
> Is this normal behaviour?
>
> Thanks!
