Thank you all for your help.
The issue was caused by a few failed disks in the cluster. Right after they
had been replaced, everything worked well. Looking forward to moving to
Spark 3.0, which is able to handle corrupted shuffle blocks.
Cheers, Mike Pryakhin.
On Wed, 28 Aug 2019 at 03:44, Darshan Pa
You can also try setting "spark.io.compression.codec" to "snappy" to use a
different compression codec.
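For example, roughly like this (just a sketch; the app name is arbitrary, and
the same property can be passed via --conf on spark-submit instead of being
set in code):

  import org.apache.spark.sql.SparkSession

  // Sketch: override the shuffle/IO compression codec to snappy
  // (the default in Spark 2.x is lz4).
  val spark = SparkSession.builder()
    .appName("snappy-codec-test")
    .config("spark.io.compression.codec", "snappy")
    .getOrCreate()

  println(spark.conf.get("spark.io.compression.codec"))  // prints "snappy"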
On Fri, Aug 16, 2019 at 10:14 AM Vadim Semenov wrote:
This is what you're looking for:
Handle large corrupt shuffle blocks
https://issues.apache.org/jira/browse/SPARK-26089
So until 3.0 the only way I can think of is to reduce the size / split your
job into many smaller ones.
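(Just a sketch of what I mean by reducing the size: bumping the shuffle
partition count shrinks each individual shuffle block; 2000 below is an
arbitrary example value.)

  import org.apache.spark.sql.SparkSession

  val spark = SparkSession.builder().appName("smaller-shuffle-blocks").getOrCreate()

  // More shuffle partitions => smaller individual shuffle blocks for SQL/Dataset jobs.
  spark.conf.set("spark.sql.shuffle.partitions", "2000")

  // For RDD-based shuffles, pass an explicit partition count instead,
  // e.g. rdd.reduceByKey(_ + _, 2000)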
On Thu, Aug 15, 2019 at 4:47 PM Mikhail Pryakhin wrote:
Hello, Spark community!
I've been struggling with a job that constantly fails due to an inability to
uncompress some previously compressed blocks while shuffling data.
I use Spark 2.2.0 with all configuration settings left at their defaults (no
specific compression codec is specified). I've ascerta