Subject: Re: Re: Sort Shuffle performance issues about using AppendOnlyMap for
large data sets
Seeing similar issues here; did you find a solution? One option would be to
increase the number of partitions if you're doing lots of object creation.
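For what it's worth, a minimal sketch of that suggestion in spark-shell (where
sc and sqlContext are predefined; the count 400 is purely illustrative, not a
figure from this thread):

  // A toy pair RDD standing in for the real input.
  val pairs = sc.parallelize(Seq(("a", 1), ("b", 2), ("a", 3)))

  // Pass an explicit partition count so each reduce task's
  // AppendOnlyMap holds a smaller slice of the data.
  val counts = pairs.reduceByKey(_ + _, 400)

  // For Spark SQL's group by, the equivalent knob is the
  // shuffle partition setting:
  sqlContext.setConf("spark.sql.shuffle.partitions", "400")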
On Thu, Feb 12, 2015 at 7:26 PM, fightf...@163.com wrote:
> fightf...@163.com
>
>
> From: Patrick Wendell
> Date: 2015-02-12 16:12
> To: fightf...@163.com
> CC: user; dev
> Subject: Re: Re: Sort Shuffle performance issues about using
> AppendOnlyMap for large data sets
> The map will start with a capacity of 64, but will grow to accommodate
> new data. Are you using the groupBy operator in Spark or are you using
> Spark SQL's group by? This usually happens if you are grouping or
> aggregating in a way that doesn't sufficiently condense the data
> created from each input partition.
From: Patrick Wendell
Date: 2015-02-12 16:12
To: fightf...@163.com
CC: user; dev
Subject: Re: Re: Sort Shuffle performance issues about using AppendOnlyMap for
large data sets
The map will start with a capacity of 64, but will grow to accommodate
new data. Are you using the groupBy operator in Spark or are you using
Spark SQL's group by? This usually happens if you are grouping or
aggregating in a way that doesn't sufficiently condense the data
created from each input partition.
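To illustrate the distinction with a sketch (again assuming spark-shell and a
toy pair RDD; the names are illustrative):

  val pairs = sc.parallelize(Seq(("a", 1), ("b", 2), ("a", 3)))

  // groupByKey does no map-side combine: every record is shuffled,
  // and the receiving task's AppendOnlyMap must buffer all values
  // for its keys before any aggregation runs.
  val viaGroup = pairs.groupByKey().mapValues(_.sum)

  // reduceByKey combines values within each input partition before
  // the shuffle, so far less data reaches each reducer's map.
  val viaReduce = pairs.reduceByKey(_ + _)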
Hi,
We still have not found an adequate solution for this issue. Any analytical
rules of thumb or hints would be much appreciated.
Thanks,
Sun.
fightf...@163.com
From: fightf...@163.com
Date: 2015-02-09 11:56
To: user; dev
Subject: Re: Sort Shuffle performance issues about using AppendOnlyMap for
large data sets