I missed mentioning that we are exploring this in Spark 4.0, whether through a configuration change or explicit code changes throughout the codebase. We are keen to adopt whichever approach is recommended and future-proof.
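For context, a minimal sketch of the configuration-only approach we are evaluating, using the standard Spark compression properties (the values and application name below are illustrative, not our final settings):

```shell
# Sketch (illustrative): route Spark's internal compression through LZ4
# via standard properties, either in spark-defaults.conf or via --conf.
#
# spark.io.compression.codec covers shuffle output, spills, and broadcast
# variables; "lzf" is the other value we are considering. Parquet output
# has its own codec property (its default is snappy).
spark-submit \
  --conf spark.io.compression.codec=lz4 \
  --conf spark.rdd.compress=true \
  --conf spark.sql.parquet.compression.codec=lz4 \
  --class com.example.MyApp my-app.jar
```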
Any guidance, insights, or pointers to relevant documentation, JIRAs, or previous discussions on this topic would be immensely helpful.

Thanks,
Balaji

From: Balaji Sudharsanam V
Sent: 02 June 2025 21:02
To: dev@spark.apache.org
Cc: Dongjoon Hyun <dongj...@apache.org>; Steven Jones <s...@us.ibm.com>; NICHOLAS MARION <nmar...@us.ibm.com>; Vishal Kolki <vishal.ko...@ibm.com>; ANTO JOHN <antoj...@in.ibm.com>
Subject: Inquiry: Best Practices for Replacing Snappy with LZ4/LZF Compression Across Spark Codebase (including test cases)

Dear Spark Developer Community,

I hope this email finds you well. My name is Balaji, and I am a Software Engineer working with Apache Spark on IBM Z Systems (z/OS). We are exploring a scenario where we would like to move away from the Snappy compression library within our Spark applications and use either LZ4 or LZF compression exclusively. This includes ensuring that all data persistence, shuffle operations, and internal data representations consistently use the chosen alternative (LZ4 or LZF), including in the test cases.

Balaji Sudharsanam V
Product Owner, IBM Z Platform for Apache Spark
India Systems Development Lab Bangalore, EGL D Block 6th Floor
Mobile: +91 9600778246
Mail: balaji.sudharsa...@ibm.com