From: Jörn Franke [mailto:jornfra...@gmail.com]
Sent: Monday, March 11, 2019 4:28 PM
To: Hough, Stephen C
Cc: Hough, Stephen C; dev@spark.apache.org
Subject: Re: [External] Re: [Spark RPC] Help how to debug sudden performance issue
Well it will be difficult to say anything without knowing func. It could be that …
From: Jörn Franke [mailto:jornfra...@gmail.com]
Sent: Monday, March 11, 2019 3:08 PM
To: Hough, Stephen C
Cc: dev@spark.apache.org
Subject: [External] Re: [Spark RPC] Help how to debug sudden performance issue
Well it is a little bit difficult to say, because a lot of things are mixing up here. What function is calculated? Does it need a lot of memory? Could it be that you run out of memory, some spillover happens, and you have a lot of IO to disk which is blocking?
Related to that could be the 1 executor per worker with 40 cores …
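
(A minimal illustrative sketch, not from the original thread: if the single 40-core executor per worker is the suspect, standalone mode can split each worker into several smaller executors by capping spark.executor.cores. The app name and sizes below are assumptions, shown as Java driver setup.)

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;

    // Illustrative values only: cap executors at 8 cores so each 40-core
    // worker can host five smaller executors instead of one large one.
    SparkConf conf = new SparkConf()
        .setAppName("perf-debug")               // hypothetical app name
        .set("spark.executor.cores", "8")       // cores per executor (assumed)
        .set("spark.executor.memory", "16g");   // per-executor memory (assumed)
    JavaSparkContext sc = new JavaSparkContext(conf);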
Spark Version: 2.0.2
I am running a cluster with 400 workers where each worker has 1 executor
configured with 40 cores for a total capacity of 16,000 cores.
I run about 10,000 jobs with 1.5M tasks, where each job is a simple
spark.parallelize(list, list.size()).map(func).collectAsync(). The job …
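
(For reference, a minimal self-contained sketch of the job shape described above, assuming a Java driver; the input list and the doubling lambda are placeholders for the real data and func.)

    import java.util.List;
    import java.util.stream.Collectors;
    import java.util.stream.IntStream;
    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaFutureAction;
    import org.apache.spark.api.java.JavaSparkContext;

    public class AsyncJobSketch {
        public static void main(String[] args) throws Exception {
            JavaSparkContext sc = new JavaSparkContext(
                new SparkConf().setAppName("async-job-sketch")); // hypothetical name

            // Placeholder input; the report uses one partition per list element.
            List<Integer> list =
                IntStream.range(0, 150).boxed().collect(Collectors.toList());

            // One task per element: numSlices == list.size(), as in the report.
            JavaFutureAction<List<Integer>> future =
                sc.parallelize(list, list.size())
                  .map(x -> x * 2)   // stand-in for the real "func"
                  .collectAsync();   // returns immediately; the job runs asynchronously

            List<Integer> result = future.get(); // block until the collect completes
            System.out.println("collected " + result.size() + " results");
            sc.stop();
        }
    }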