> while leaving the connect jvm client in a separate folder looks weird

I plan to eventually put it at the top level together, but I feel that change
needs an SPIP, so I am moving the internal server side first, orthogonally
to this.
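
As an aside on the `--jars` invocation quoted below: it globs for the shaded
server jar under the build output directory. A minimal Python sketch of that
lookup (a hypothetical helper for illustration, not Spark's actual code):

```python
import glob
import os


def find_connect_jar(root="connector/connect/server/target"):
    """Locate the shaded Spark Connect server jar under `root`, mirroring
    the manual `ls .../spark-connect*SNAPSHOT.jar` step from the quoted
    command (hypothetical helper, not Spark's actual code)."""
    pattern = os.path.join(root, "**", "spark-connect*SNAPSHOT.jar")
    matches = glob.glob(pattern, recursive=True)
    return matches[0] if matches else None
```

With the server built in, this kind of lookup (and the equivalent logic in
the Python local-mode startup path) would no longer be needed.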

On Tue, 2 Jul 2024 at 17:54, Cheng Pan <pan3...@gmail.com> wrote:

> Thanks for raising this discussion, I think putting the connect folder on
> the top level is a good idea to promote Spark Connect, while leaving the
> connect jvm client in a separate folder looks weird. I suppose there is no
> contract to leave all optional modules under `connector`? e.g.
> `resource-managers/kubernetes/{docker,integration-tests}`, `hadoop-cloud`.
> What about moving the whole `connect` folder to the top level?
>
> Thanks,
> Cheng Pan
>
>
> On Jul 2, 2024, at 08:19, Hyukjin Kwon <gurwls...@apache.org> wrote:
>
> Hi all,
>
> I would like to discuss moving the Spark Connect server into the built-in
> package. Right now, users have to specify `--jars` or `--packages` when they
> run the Spark Connect server script, for example:
>
> ./sbin/start-connect-server.sh --jars `ls 
> connector/connect/server/target/**/spark-connect*SNAPSHOT.jar`
>
> or
>
> ./sbin/start-connect-server.sh --packages 
> org.apache.spark:spark-connect_2.12:3.5.1
>
> which is a little odd: an sbin script should not require users to supply
> jars just to start.
>
> Moving it into the built-in package is pretty straightforward because most
> of the jars are shaded, so the impact would be minimal. I have a prototype at
> apache/spark#47157 <https://github.com/apache/spark/pull/47157>. This
> also simplifies the Python local running logic a lot.
>
> The user-facing API layer, the Spark Connect client, stays external, but I
> would like the internal/admin server layer, the Spark Connect server
> implementation, to be built into Spark.
>
> Please let me know if you have thoughts on this!
>