Re: executor delay in Spark

2016-04-24 Thread Jeff Zhang
Maybe this is due to the config spark.scheduler.minRegisteredResourcesRatio; you can try setting it to 1 to see the behavior. // Submit tasks only after (registered resources / total expected resources) // is equal to at least this value, that is double between 0 and 1. var minRegisteredRatio = math.m
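
To illustrate the gating rule described in the quoted comment, here is a minimal sketch in plain Python. It is not Spark's code: the names clamp_ratio and ready_to_submit are hypothetical, and clamping the value to [0, 1] and treating a zero expected count as "ready" are assumptions of this sketch.

```python
def clamp_ratio(value: float) -> float:
    """Keep the configured ratio inside [0, 1] (assumption of this sketch)."""
    return max(0.0, min(1.0, value))


def ready_to_submit(registered: int, expected: int, min_ratio: float) -> bool:
    """Return True once registered / expected reaches the configured ratio."""
    if expected <= 0:
        # Nothing to wait for; this edge case is an assumption, not Spark behavior.
        return True
    return registered / expected >= clamp_ratio(min_ratio)


if __name__ == "__main__":
    # Compare a ratio of 0.8 with a ratio of 1.0 for 8 expected executors.
    for registered in (2, 6, 8):
        print(registered,
              ready_to_submit(registered, 8, 0.8),
              ready_to_submit(registered, 8, 1.0))
```

With a ratio of 1, tasks in this sketch are held back until every expected resource has registered, which is the behavior the suggestion above aims to test.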

Re: executor delay in Spark

2016-04-24 Thread Mike Hynes
Could you change numPartitions to {16, 32, 64} and run your program for each to see how many partitions are allocated to each worker? Let's see if you experience an all-or-nothing imbalance that way; if so, my guess is that something else is odd in your program logic or Spark runtime environment, but
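
The check suggested above can be sketched in plain Python once you have (partition index, host) pairs for a run. How you collect those pairs is up to you; one possibility, stated as an assumption, is to emit the host name from inside each partition of the RDD. The function names and the sample host names below are hypothetical.

```python
def partitions_per_worker(assignments, workers):
    """Count how many partitions landed on each worker, including idle ones."""
    counts = {worker: 0 for worker in workers}
    for _partition, host in assignments:
        counts[host] = counts.get(host, 0) + 1
    return counts


def is_all_or_nothing(counts):
    """True when at least one worker holds partitions and at least one holds none."""
    values = list(counts.values())
    return any(c == 0 for c in values) and any(c > 0 for c in values)


if __name__ == "__main__":
    workers = ["worker-1", "worker-2"]  # hypothetical host names
    for num_partitions in (16, 32, 64):
        # Made-up assignment in which every partition lands on one worker.
        skewed = [(i, "worker-1") for i in range(num_partitions)]
        counts = partitions_per_worker(skewed, workers)
        print(num_partitions, counts, is_all_or_nothing(counts))
```

Running the real job once per numPartitions value and feeding the observed pairs into these helpers gives one count table per run, which makes an all-or-nothing pattern easy to spot.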

Re: executor delay in Spark

2016-04-22 Thread Mike Hynes
Glad to hear that the problem was solvable! I have not seen delays of this type for later stages in jobs run by spark-submit, but I do not think it is impossible if your stage has no lineage dependence on other RDDs. I'm CC'ing the dev list to report that other users are observing load imbalance caused by