Re: GZIP compression support for Spark internal data

2016-09-12 Thread Nasser Ebrahim
Thank you, Takeshi, for sharing the info. I agree with Patrick and you that there is no point in adding more codecs unless they show better performance results (at least with some workloads on some platforms). The performance of GZIP depends on how it is implemented on each platform. Will do s

Re: GZIP compression support for Spark internal data

2016-09-12 Thread Takeshi Yamamuro
Hi,

Have you seen https://issues.apache.org/jira/browse/SPARK-4633 ?

// maropu

On Mon, Sep 12, 2016 at 11:00 PM, Nasser Ebrahim wrote:
> Hi,
>
> Can we use GZIP compression for internal data such as RDD partitions,
> broadcast variables and shuffle outputs so that user will have more choice
>

GZIP compression support for Spark internal data

2016-09-12 Thread Nasser Ebrahim
Hi, Can we use GZIP compression for internal data such as RDD partitions, broadcast variables and shuffle outputs, so that users have more choice than the currently available LZ4, LZF and Snappy? Is there any specific reason we do not support the JDK's built-in compression? If not, shall I
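
For readers unfamiliar with the "JDK inbuilt compression" mentioned in the question, the following is a minimal sketch of a GZIP round trip using only `java.util.zip`. It does not touch Spark at all: the class name `GzipRoundTrip` and its two helper methods are hypothetical names chosen for illustration, and the sketch says nothing about how a codec would actually be wired into Spark's internals.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class, not part of Spark or of this thread.
final class GzipRoundTrip {

    // Compress a byte array with the JDK's built-in GZIP stream.
    static byte[] compress(byte[] raw) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // Closing the GZIP stream writes the trailer, so close before reading the sink.
        try (GZIPOutputStream gz = new GZIPOutputStream(sink)) {
            gz.write(raw);
        }
        return sink.toByteArray();
    }

    // Decompress a GZIP-encoded byte array back to the original bytes.
    static byte[] decompress(byte[] packed) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(packed))) {
            byte[] buffer = new byte[8192];
            int read;
            // read() returns -1 once the compressed stream is exhausted.
            while ((read = gz.read(buffer)) != -1) {
                sink.write(buffer, 0, read);
            }
        }
        return sink.toByteArray();
    }
}
```

Timing `compress` and `decompress` on representative data, alongside the existing codecs, is one way to check the point raised in the replies that a new codec is only worth adding if it performs better on some workloads and platforms.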