Re: [DISCUSS] SPIP: Faster queries in local laptop mode for Apache Spark

Cheng Pan Wed, 06 May 2026 19:59:11 -0700

+1. I left a comment in the doc about the Hadoop client improvement, 
which should also benefit running Spark on a laptop.


Thanks,
Cheng Pan



> On May 6, 2026, at 15:01, John Zhuge <[email protected]> wrote:
> 
> +1 worthwhile to lower Spark small-data overhead
> 
> On Mon, May 4, 2026 at 11:47 PM Ángel Álvarez Pascua <[email protected]> 
> wrote:
>> Love it. Please count on me if any help is needed.
>> 
>> On Tue, May 5, 2026 at 7:31, DB Tsai <[email protected]> wrote:
>>> Thanks Daniel and Liang-Chi for driving this. This is an exciting proposal 
>>> that can significantly speed up local experimentation and development on 
>>> laptops. It also helps make Spark a great fit for both big-data workloads 
>>> and small-data exploratory workflows.
>>> 
>>> DB Tsai  |  https://www.dbtsai.com/  |  PGP 0x9FB9FAA3
>>> 
>>> On Monday, May 4th, 2026 at 3:39 PM, Daniel Tenedorio 
>>> <[email protected]> wrote:
>>>> Hi Spark community,
>>>> 
>>>> We’d like to propose a new SPIP to improve the experience of running 
>>>> Apache Spark on laptops.
>>>> 
>>>> SPIP doc:
>>>> 
>>>> https://docs.google.com/document/d/1Nphejrf_vh4YRECn0JPgKClqxDS_lB6wufZFJQxyY98/edit?tab=t.0#heading=h.hj76akdx5ul
>>>> 
>>>> Summary:
>>>> 
>>>> Spark’s execution model is optimized for distributed workloads, but this 
>>>> introduces noticeable overhead for small datasets (e.g., <100MB), where 
>>>> even simple queries can take multiple seconds. This makes Spark less 
>>>> suitable for interactive and exploratory use cases on laptops, and often 
>>>> pushes users toward alternative single-node tools.
>>>> 
>>>> This proposal aims to reduce that overhead in local mode, improving 
>>>> latency for small queries and making Spark more usable as an entry point 
>>>> for new users and iterative workflows.
>>>> 
>>>> We’d appreciate your review and feedback.
>>>> 
>>>> Thanks,
>>>> Daniel Tenedorio and Liang-Chi Hsieh
>>>> 
>>> 
> 
> 
> 
> --
> John Zhuge
