Hi Pavel!
Sorry for the late reply. I have been doing some investigation over the last few
days with my colleague. Here is my thought: since Spark 1.2, Netty is used with
off-heap memory to reduce GC during shuffle and cache block transfer. In my case,
if I try to increase the memory overhead enough, I will get the Max dir
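For reference, the settings tied to that off-heap behaviour look roughly like this
(just a sketch: the property names are the standard ones from the Spark
configuration docs, the values are only illustrative, and
spark.shuffle.io.preferDirectBufs is included purely as the documented switch for
forcing Netty back to on-heap buffers, not something covered above):

    import org.apache.spark.SparkConf

    // Sketch only: settings related to the Netty off-heap buffers mentioned above.
    val conf = new SparkConf()
      // Per-executor off-heap overhead on YARN, in MB -- the value being increased.
      .set("spark.yarn.executor.memoryOverhead", "4096")
      // Documented switch to make Netty prefer on-heap buffers instead of direct
      // (off-heap) ones during shuffle and block transfer.
      .set("spark.shuffle.io.preferDirectBufs", "false")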
Hi Yang!
I don't know exactly why this happens, but I think either GC can't work fast
enough, or the size of the data plus the additional objects created during the
computation is too big for the executor.
I also found that this problem appears only if you do some data manipulations. You
can cache your data first, and after that, write in
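Something along these lines, as a rough sketch (Spark 2.x DataFrame API; the
helper name and the count() used to force the cache are my own illustration, not
taken from your job):

    import org.apache.spark.sql.DataFrame

    // Sketch of the cache-first idea: materialise the computed data while it is
    // still spread over many partitions, then do the single-partition write
    // from the cached copy.
    def writeOneFile(df: DataFrame, outPath: String): Unit = {
      df.cache()      // cache the result of the upstream manipulations...
      df.count()      // ...and force it to materialise with full parallelism
      df.coalesce(1)  // only the final write stage runs as a single task
        .write
        .parquet(outPath)
    }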
Also, do you know why this happens?
> On 20 Jan 2017, at 18:23, Pavel Plotnikov wrote:
>
> Hi Yang,
> I have faced the same problem on Mesos, and to circumvent this issue I
> usually increase the partition number. In the last step of your code you
> reduce the number of partitions to 1; try to set
Hi,
Thank you for your suggestion. As far as I know, if I set it to a bigger number,
I won't get the output as a single file, right? My task is designed to combine
all the small files from one day into one big Parquet file. Thanks again.
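For context, the last step in question is essentially of this shape (a simplified
sketch; the function name and the paths are placeholders, not the actual code):

    import org.apache.spark.sql.SparkSession

    // Simplified shape of the job: read one day's worth of small files and
    // rewrite them as a single Parquet file, hence the reduction to 1 partition.
    def compactOneDay(spark: SparkSession, inPath: String, outPath: String): Unit =
      spark.read.parquet(inPath)   // e.g. one day's directory of small files
        .coalesce(1)               // the "number of partitions to 1" step
        .write
        .parquet(outPath)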
Best,
> On 20 Jan 2017, at 18:23, Pavel Plotnikov wrote:
Hi Yang,
I have faced the same problem on Mesos, and to circumvent this issue I usually
increase the partition number. In the last step of your code you reduce the
number of partitions to 1; try to set a bigger value, maybe it will solve this
problem.
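Roughly what I mean, as a sketch (the value 64 is an arbitrary example, and note
that this produces several output files instead of one):

    import org.apache.spark.sql.DataFrame

    // Sketch of the suggestion: keep more than one partition at the final write.
    def writeWithMorePartitions(df: DataFrame, outPath: String): Unit =
      df.repartition(64)   // arbitrary example value; anything larger than 1
        .write
        .parquet(outPath)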
Cheers,
Pavel
On Fri, Jan 20, 2017 at 12:35 PM Yang Cao wrote:
Hi all,
I am running a Spark application in YARN-client mode with 6 executors (4 cores
each, executor memory = 6G, overhead = 4G; Spark version: 1.6.3 / 2.1.0). I find
that my executor memory keeps increasing until the executor gets killed by the
node manager, which gives out a message telling me to boost
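For reference, the setup described above corresponds roughly to a configuration
like this (a sketch only: standard property names, a placeholder app name, and
the master / client deploy mode are normally supplied via spark-submit):

    import org.apache.spark.SparkConf

    // Sketch of the setup described above: 6 executors, 4 cores and 6G of heap
    // each, plus 4G of off-heap overhead per executor on YARN.
    val conf = new SparkConf()
      .setAppName("my-app")                               // placeholder name
      .set("spark.executor.instances", "6")
      .set("spark.executor.cores", "4")
      .set("spark.executor.memory", "6g")
      .set("spark.yarn.executor.memoryOverhead", "4096")  // value is in MB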