Re: GZIP compression support for Spark internal data

Nasser Ebrahim Mon, 12 Sep 2016 12:06:14 -0700

Thank you, Takeshi, for sharing the info. I agree with Patrick and you that there is no point in adding more codecs unless they show better performance results (at least with some workloads on some platforms). The performance of GZIP depends on its implementation on each platform. I will run some performance tests to see how it performs compared to the existing codecs in Spark.
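
A first step for such a test could look like the sketch below. It only exercises the JDK's built-in java.util.zip GZIP streams on a synthetic payload and prints sizes and timings. The class name GzipRoundTrip, the helper samplePayload and the payload shape are hypothetical and chosen for illustration. It does not touch Spark's codec interface, and the existing codecs (LZ4, LZF, Snappy) would have to be measured with their own libraries on the same data for a fair comparison.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class for illustration; not part of Spark.
class GzipRoundTrip {

    // Compress a byte array with the JDK's built-in GZIP stream.
    static byte[] compress(byte[] input) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(input);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The GZIP trailer is written when the stream is closed above.
        return buffer.toByteArray();
    }

    // Decompress GZIP data back into the original bytes.
    static byte[] decompress(byte[] compressed) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPInputStream gzip =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[8192];
            int read;
            while ((read = gzip.read(chunk)) != -1) {
                buffer.write(chunk, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Synthetic, repetitive payload (an assumption; real shuffle or
    // broadcast data may compress very differently).
    static byte[] samplePayload(int records) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < records; i++) {
            sb.append("key-").append(i)
              .append(",value-").append(i % 10).append('\n');
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] data = samplePayload(200_000);

        long t0 = System.nanoTime();
        byte[] packed = compress(data);
        long t1 = System.nanoTime();
        byte[] unpacked = decompress(packed);
        long t2 = System.nanoTime();

        System.out.println("original bytes:   " + data.length);
        System.out.println("compressed bytes: " + packed.length);
        System.out.println("round trip bytes: " + unpacked.length);
        System.out.println("compress ms:      " + (t1 - t0) / 1_000_000);
        System.out.println("decompress ms:    " + (t2 - t1) / 1_000_000);
    }
}
```

A single timed pass like this is only a rough indication. JIT warm-up, payload shape and the platform's zlib implementation all affect the numbers, so a proper comparison would repeat the runs and use representative Spark data.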


On 9/12/16 9:19 PM, Takeshi Yamamuro wrote:

Hi,

Have you seen https://issues.apache.org/jira/browse/SPARK-4633 ?

// maropu

On Mon, Sep 12, 2016 at 11:00 PM, Nasser Ebrahim <enas...@linux.vnet.ibm.com> wrote:


    Hi,

    Can we use GZIP compression for internal data such as RDD
    partitions, broadcast variables and shuffle outputs, so that users
    have more choice than the currently available LZ4, LZF and
    Snappy? Is there any specific reason we are not supporting the
    JDK's built-in compression? If not, shall I create a JIRA to get
    this implemented?

    Thank you,
    Nasser Ebrahim




--
---
Takeshi Yamamuro
