We tried changing the compression codec from snappy to lz4. It did improve performance, but we are still wondering why the default options didn't work as claimed.
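For reference, a minimal sketch of the codec change described above, assuming the job is configured programmatically through SparkConf (the application name is hypothetical, not from the thread):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Sketch: switch the shuffle/IO compression codec from the Spark 1.x
// default ("snappy") to "lz4", as mentioned in the reply above.
val conf = new SparkConf()
  .setAppName("ShuffleCodecTest")           // hypothetical app name
  .set("spark.io.compression.codec", "lz4") // default in Spark 1.2 is "snappy"

val sc = new SparkContext(conf)
```

The same setting can equally be passed on the command line or in spark-defaults.conf; the SparkConf form is shown only for concreteness.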
From: Raghavendra Pandey <raghavendra.pan...@gmail.com>
Date: Friday, 6 February 2015 1:23 pm
To: Praveen Garg <praveen.g...@guavus.com>
Cc: "user@spark.apache.org" <user@spark.apache.org>
Subject: Re: Shuffle read/write issue in spark 1.2

Even I observed the same issue.

On Fri, Feb 6, 2015 at 12:19 AM, Praveen Garg <praveen.g...@guavus.com> wrote:

Hi,

While moving from Spark 1.1 to Spark 1.2, we are facing an issue where shuffle read/write has increased significantly. We also tried running the job after rolling back to the Spark 1.1 configuration, setting spark.shuffle.manager to hash and spark.shuffle.blockTransferService to nio. It did improve performance a bit, but it was still much worse than Spark 1.1. The scenario seems similar to a bug raised some time back: https://issues.apache.org/jira/browse/SPARK-5081. Has anyone come across a similar issue? Please tell us if any configuration change can help.

Regards,
Praveen
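For completeness, a minimal sketch of the configuration rollback Praveen describes, under the assumption that the job is set up via SparkConf (the application name is hypothetical). Spark 1.2 changed the default shuffle manager from "hash" to "sort" and the default block transfer service from "nio" to "netty", so these two settings restore the Spark 1.1 defaults:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Sketch: revert the Spark 1.2 shuffle defaults to their Spark 1.1 values.
val conf = new SparkConf()
  .setAppName("ShuffleRollbackTest")                // hypothetical app name
  .set("spark.shuffle.manager", "hash")             // Spark 1.2 default is "sort"
  .set("spark.shuffle.blockTransferService", "nio") // Spark 1.2 default is "netty"

val sc = new SparkContext(conf)
```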