This is a great idea and would be a nice quality-of-life improvement. +1 (non-binding)
On Tue, Jul 2, 2024 at 4:56 AM Hyukjin Kwon <gurwls...@apache.org> wrote:
>
> > while leaving the connect jvm client in a separate folder looks weird
>
> I plan to actually put it at the top level together but I feel like this
> has to be done with SPIP so I am moving internal server side first
> orthogonally
>
> On Tue, 2 Jul 2024 at 17:54, Cheng Pan <pan3...@gmail.com> wrote:
>
>> Thanks for raising this discussion, I think putting the connect folder on
>> the top level is a good idea to promote Spark Connect, while leaving the
>> connect jvm client in a separate folder looks weird. I suppose there is no
>> contract to leave all optional modules under `connector`? e.g.
>> `resource-managers/kubernetes/{docker,integration-tests}`, `hadoop-cloud`.
>> What about moving the whole `connect` folder to the top level?
>>
>> Thanks,
>> Cheng Pan
>>
>>
>> On Jul 2, 2024, at 08:19, Hyukjin Kwon <gurwls...@apache.org> wrote:
>>
>> Hi all,
>>
>> I would like to discuss moving Spark Connect server to builtin package.
>> Right now, users have to specify --packages when they run the Spark Connect
>> server script, for example:
>>
>> ./sbin/start-connect-server.sh --jars `ls
>> connector/connect/server/target/**/spark-connect*SNAPSHOT.jar`
>>
>> or
>>
>> ./sbin/start-connect-server.sh --packages
>> org.apache.spark:spark-connect_2.12:3.5.1
>>
>> which is a little bit odd that sbin scripts should provide jars to start.
>>
>> Moving it to the builtin package is pretty straightforward because most of
>> the jars are shaded, and the impact would be minimal. I have a prototype here
>> apache/spark#47157 <https://github.com/apache/spark/pull/47157>. This
>> also simplifies the Python local running logic a lot.
>>
>> The user-facing API layer, Spark Connect Client, stays external, but I would
>> like the internal/admin server layer, Spark Connect Server, implementation
>> to be built into Spark.
>>
>> Please let me know if you have thoughts on this!
>>
>>
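For reference, a rough sketch of how startup would compare if the server module becomes builtin (assuming the prototype in https://github.com/apache/spark/pull/47157 lands roughly as described; the exact behavior may differ):

    # today: the sbin script needs the connect server jars supplied explicitly
    ./sbin/start-connect-server.sh --packages org.apache.spark:spark-connect_2.12:3.5.1

    # with the server built in: no --jars / --packages should be required
    ./sbin/start-connect-server.sh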