Re: [PR] [do not review] experimental support for lz4 compression (not working) [datafusion-comet]

via GitHub Wed, 18 Dec 2024 16:16:25 -0800


andygrove commented on PR #1181:
URL: 
https://github.com/apache/datafusion-comet/pull/1181#issuecomment-2552512228


   Here are my findings from hacking on this today.
   
   LZ4 provides two compression formats: `LZ4 Block Format` and `LZ4 Frame Format`.
   
   Spark uses the Java library https://github.com/lz4/lz4-java, specifically 
`LZ4BlockOutputStream`, which appears to write a streaming format specific to 
that library rather than the standard LZ4 Frame format, as noted in its Javadoc:
   
   ```
   /**
    * Streaming LZ4 (not compatible with the LZ4 Frame format).
    * This class compresses data into fixed-size blocks of compressed data.
    * This class uses its own format and is not compatible with the LZ4 Frame format.
    * For interoperability with other LZ4 tools, use {@link LZ4FrameOutputStream},
    * which is compatible with the LZ4 Frame format. This class remains for backward compatibility.
    * @see LZ4BlockInputStream
    * @see LZ4FrameOutputStream
    */
   public class LZ4BlockOutputStream extends FilterOutputStream {
   ```
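   
   One quick way to tell the two stream formats apart is to look at the leading bytes. The sketch below is not code from this PR. It assumes that LZ4 Frame data starts with the magic number `0x184D2204` stored little-endian, and that `LZ4BlockOutputStream` starts each block with the ASCII bytes `LZ4Block` (my reading of the lz4-java source, so worth verifying). The class name `Lz4FormatSniffer` and the method `detect` are hypothetical names used only for illustration.
   
   ```java
   import java.nio.charset.StandardCharsets;
   import java.util.Arrays;

   public class Lz4FormatSniffer {
       // LZ4 Frame magic number 0x184D2204, as it appears on the wire (little-endian).
       private static final byte[] FRAME_MAGIC = {0x04, 0x22, 0x4D, 0x18};

       // Assumption: lz4-java's LZ4BlockOutputStream prefixes each block with "LZ4Block".
       private static final byte[] BLOCK_STREAM_MAGIC =
           "LZ4Block".getBytes(StandardCharsets.US_ASCII);

       // Returns true if data begins with the given prefix bytes.
       static boolean startsWith(byte[] data, byte[] prefix) {
           return data.length >= prefix.length
               && Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
       }

       // Classifies a buffer by its leading bytes only; it does not validate the rest.
       public static String detect(byte[] data) {
           if (startsWith(data, FRAME_MAGIC)) {
               return "LZ4 Frame";
           }
           if (startsWith(data, BLOCK_STREAM_MAGIC)) {
               return "lz4-java block stream";
           }
           // A raw LZ4 Block Format buffer has no magic number, so it also lands here.
           return "unknown";
       }

       public static void main(String[] args) {
           byte[] frameHeader = {0x04, 0x22, 0x4D, 0x18, 0x64, 0x40};
           System.out.println(detect(frameHeader));
       }
   }
   ```
   
   Running `detect` on the first few bytes of a shuffle block written by Spark should show which reader is needed on the native side, assuming the magic values above are right.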
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


