Hi Dongjoon,

This is a big decision but not a big project. We just need to update the
release scripts to produce the additional Spark distribution. If people are
positive about this, I can start implementing the script changes now and
merge them after this proposal has been voted on and approved.

Thanks,
Wenchen

On Tue, Feb 4, 2025 at 4:10 PM Dongjoon Hyun <dongjoon.h...@gmail.com>
wrote:

> Hi, Wenchen.
>
> I'm wondering whether this implies any delay to the existing QA and RC1
> schedule.
>
> If so, why don't we schedule this new alternative proposal properly for
> Spark 4.1?
>
> Best regards,
> Dongjoon
>
> On Mon, Feb 3, 2025 at 23:31 Wenchen Fan <cloud0...@gmail.com> wrote:
>
>> Hi all,
>>
>> There is partial consensus that Spark Connect is crucial to the future
>> stability of Spark APIs, for both end users and developers.
>> At the same time, a couple of PMC members raised concerns about making
>> Spark Connect the default in the upcoming Spark 4.0 release. I’m proposing
>> an alternative approach here: publish an additional Spark distribution with
>> Spark Connect enabled by default. This approach will help promote the
>> adoption of Spark Connect among new users while allowing us to gather
>> valuable feedback. Such a distribution can also pave the way for Spark
>> Connect clients in other languages, such as Rust, Go, or Scala 3.
>>
>> Here are the details of the proposal:
>>
>>    - Spark 4.0 will include three PyPI packages:
>>       - pyspark: The classic package.
>>       - pyspark-client: The thin Spark Connect Python client. Note: in
>>       the Spark 4.0 preview releases, we published the pyspark-connect
>>       package for the thin client; we will need to rename it in the
>>       official 4.0 release.
>>       - pyspark-connect: The full PySpark package with Spark Connect
>>       enabled by default.
>>    - An additional tarball will be added to the Spark 4.0 download page
>>    with updated scripts (spark-submit, spark-shell, etc.) to enable Spark
>>    Connect by default.
>>    - A new Docker image will be provided with Spark Connect enabled by
>>    default.
>>
>> By taking this approach, we can make Spark Connect more visible and
>> accessible to users, which is more effective than simply asking them to
>> configure it manually.
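
As a concrete sketch of the proposed PyPI layout (package names are taken
from the proposal itself and are still subject to the rename noted for the
thin client):

```shell
# Classic Spark: the JVM driver runs in the same process as the Python client.
pip install pyspark

# Thin Spark Connect client only: no bundled JVM/Spark distribution,
# connects to a remote Spark Connect server.
pip install pyspark-client

# Full Spark distribution with Spark Connect enabled by default.
pip install pyspark-connect
```

Under this scheme, only the third package would change the default behavior
of entry points like pyspark; classic users could keep installing pyspark
unchanged.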
>>
>> Looking forward to hearing your thoughts!
>>
>> Thanks,
>> Wenchen
>>
>
