RE: [External] Re: [Spark RPC] Help how to debug sudden performance issue

2019-03-11 Thread Hough, Stephen C
From: Jörn Franke [mailto:jornfra...@gmail.com] Sent: Monday, March 11, 2019 4:28 PM To: Hough, Stephen C Cc: Hough, Stephen C; dev@spark.apache.org Subject: Re: [External] Re: [Spark RPC] Help how to debug sudden performance issue "Well it will be difficult to say anything without knowing func. It could be that …"

Re: [External] Re: [Spark RPC] Help how to debug sudden performance issue

2019-03-11 Thread Jörn Franke
> Sent: Monday, March 11, 2019 3:08 PM > To: Hough, Stephen C > Cc: dev@spark.apache.org > Subject: [External] Re: [Spark RPC] Help how to debug sudden performance issue > > Well it is a little bit difficult to say, because a lot of things are mixing up here. What function is calculated? Do …

RE: [External] Re: [Spark RPC] Help how to debug sudden performance issue

2019-03-11 Thread Hough, Stephen C
… the logs. From: Jörn Franke [mailto:jornfra...@gmail.com] Sent: Monday, March 11, 2019 3:08 PM To: Hough, Stephen C Cc: dev@spark.apache.org Subject: [External] Re: [Spark RPC] Help how to debug sudden performance issue "Well it is a little bit difficult to say, because a lot of things are mixing …"

Re: [Spark RPC] Help how to debug sudden performance issue

2019-03-11 Thread Jörn Franke
Well it is a little bit difficult to say, because a lot of things are mixing up here. What function is calculated? Does it need a lot of memory? Could it be that you run out of memory, some spillover happens, and you have a lot of IO to disk which is blocking? Related to that could be 1 executor …
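A sketch of the settings behind these diagnostic questions (not from the thread; all values below are illustrative assumptions). In Spark 2.x, per-stage spill can also be checked in the web UI's "Shuffle Spill (Memory)" and "Shuffle Spill (Disk)" columns.

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class MemoryConfSketch {
    public static void main(String[] args) {
        // Illustrative values only; the thread does not state the cluster's actual settings.
        SparkConf conf = new SparkConf()
                .setAppName("memory-conf-sketch")
                .set("spark.executor.memory", "32g")          // heap per executor; undersizing causes spill and GC pressure
                .set("spark.executor.cores", "40")            // mirrors the 40-core executors described below
                .set("spark.memory.fraction", "0.6")          // heap share for execution + storage (Spark 2.x default)
                .set("spark.memory.storageFraction", "0.5");  // portion of that share reserved for cached data
        JavaSparkContext sc = new JavaSparkContext(conf);
        sc.stop();
    }
}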

[Spark RPC] Help how to debug sudden performance issue

2019-03-10 Thread Hough, Stephen C
Spark Version: 2.0.2. I am running a cluster with 400 workers, where each worker has 1 executor configured with 40 cores, for a total capacity of 16,000 cores. I run about 10,000 jobs with 1.5M tasks, where each job is a simple spark.parallelize(list, list.size()).map(func).collectAsync(). The job …
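For reference, a minimal self-contained sketch of the job shape described above, assuming a Java driver; the input list and the mapped lambda are placeholders for the actual workload and func:

import java.util.ArrayList;
import java.util.List;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaFutureAction;
import org.apache.spark.api.java.JavaSparkContext;

public class AsyncJobSketch {
    public static void main(String[] args) throws Exception {
        SparkConf conf = new SparkConf().setAppName("async-job-sketch");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // Placeholder input; each real job carries its own list of work items.
        List<Integer> list = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            list.add(i);
        }

        // One partition per element, as described: parallelize(list, list.size()).
        JavaFutureAction<List<Integer>> future =
                sc.parallelize(list, list.size())
                        .map(x -> x * 2)   // placeholder for func
                        .collectAsync();   // non-blocking: returns a future immediately

        List<Integer> results = future.get();  // block only when the results are needed
        System.out.println("collected " + results.size() + " results");
        sc.stop();
    }
}

With list.size() partitions, every list element becomes its own task, which is consistent with the roughly 1.5M tasks spread across 10,000 jobs in the report.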