Well, I see Nimrod has a valid point here. One can hide "Spark Connect vs
classic Spark SQL (HiveServer2/Beeline/Thrift)" from users by putting a thin
abstraction in front.
Once that abstraction layer is there, the handover becomes transparent:
your app talks to a Spark Connect endpoint, and the Spark driver runs in
the cluster. This works with both the DataFrame API and SQL.
In essence, you design a small library (or service) that exposes one API to
users and chooses the backend under the bonnet.
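As a rough sketch of what I mean (names like BackendConfig and make_endpoint
are purely illustrative, not from any existing library), the facade only has
to decide which connection string to hand out; the default ports below are
the standard Spark Connect gRPC port (15002) and HiveServer2 Thrift port
(10000):

```python
from dataclasses import dataclass


@dataclass
class BackendConfig:
    """Illustrative config for the thin abstraction layer."""
    host: str
    use_connect: bool = True
    connect_port: int = 15002   # default Spark Connect gRPC port
    thrift_port: int = 10000    # default HiveServer2/Thrift port


def make_endpoint(cfg: BackendConfig) -> str:
    """Return the connection string for whichever backend is configured.

    Callers only ever see this one function; whether the query runs
    through Spark Connect or the classic Thrift server is decided here.
    """
    if cfg.use_connect:
        # Spark Connect URL form, as accepted by
        # SparkSession.builder.remote(...)
        return f"sc://{cfg.host}:{cfg.connect_port}"
    # Classic path: a Hive JDBC URL for HiveServer2/Beeline clients.
    return f"jdbc:hive2://{cfg.host}:{cfg.thrift_port}/default"
```

The user-facing API stays the same either way; only the endpoint (and the
client library used to talk to it) changes behind the scenes.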

HTH

Dr Mich Talebzadeh,
Architect | Data Science | Financial Crime | Forensic Analysis | GDPR

   view my Linkedin profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>

On Tue, 23 Sept 2025 at 21:40, Nimrod Ofek <[email protected]> wrote:

> Hi,
>
> That's the thing - I don't expect the users to know if they are connecting
> to Spark or Spark connect.
> That means I would expect the driver to support both Spark Connect - and
> the current Hive/ Beeline /Thrift server.
> Maybe even some regular Spark API (even a simple "jar" that just runs the
> query and returns the results somehow, haven't really thought about it in
> depth).
>
> The point I'm trying to make is that the user that wants to run a SQL
> command using JDBC - doesn't care if it's Spark or Spark connect or
> whatever - it's like a Database for the user...
>
> Regards,
> Nimrod
>
>
> On Mon, Sep 22, 2025 at 5:24 PM Cheng Pan <[email protected]> wrote:
>
>> Hi Nimrod,
>>
>> I'm not sure I get your question. Maybe the name should be 'JDBC
>> Driver for Spark Connect Server'?
>>
>> From the user's perspective, they simply use a JDBC driver to connect
>> to the Connect Server to run SQL
>> and retrieve the results, without having to worry about whether Spark
>> is running in classic or Connect mode.
>>
>> Thanks,
>> Cheng Pan
>>
>> On Mon, Sep 22, 2025 at 10:17 PM Nimrod Ofek <[email protected]>
>> wrote:
>> >
>> > I'll raise an issue with this- I don't think the user that uses jdbc to
>> Spark should know if he is working with Spark connect or regular Spark....
>> > The jdbc driver should know how to work with connect with fallback
>> maybe, but the user doesn't care if he is getting Spark connect or not...
>> >
>> > Regards,
>> > Nimrod
>> >
>> > On Mon, 22 Sept 2025, 16:04, 杨杰 <[email protected]> wrote:
>> >>
>> >> Hi Spark devs,
>> >>
>> >> I would like to start a vote on the SPIP: JDBC Driver for Spark Connect
>> >>
>> >> Discussion thread:
>> >> https://lists.apache.org/thread/rx5pqh01c86slpqv9161hqwgm5lwxxzq
>> >> SPIP:
>> >>
>> https://docs.google.com/document/d/1Ahk4C16o1Jj1TbLg5ylzgHjvu2Ic2zTrcMuvLjqSoAQ/edit?tab=t.0#heading=h.1gf0bimgty0t
>> >> JIRA: https://issues.apache.org/jira/browse/SPARK-53484
>> >>
>> >> Please vote on the SPIP for the next 72 hours:
>> >>
>> >> [ ] +1: Accept the proposal as an official SPIP
>> >> [ ] +0
>> >> [ ] -1: I don’t think this is a good idea because
>>
>
