I would switch to +0 if the Connect default applied only to apps without
any user-provided jars / non-JVM apps.

Twitter: https://twitter.com/holdenkarau
Fight Health Insurance: https://www.fighthealthinsurance.com/
Books (Learning Spark, High Performance Spark, etc.):
https://amzn.to/2MaRAG9
YouTube Live Streams: https://www.youtube.com/user/holdenkarau
Pronouns: she/her


On Thu, Nov 28, 2024 at 6:11 PM Holden Karau <holden.ka...@gmail.com> wrote:

> Given there is no plan to support RDDs, I’ll update to -0.9
>
>
> Twitter: https://twitter.com/holdenkarau
> Fight Health Insurance: https://www.fighthealthinsurance.com/
> Books (Learning Spark, High Performance Spark, etc.):
> https://amzn.to/2MaRAG9
> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
> Pronouns: she/her
>
>
> On Thu, Nov 28, 2024 at 6:00 PM Herman van Hovell <her...@databricks.com>
> wrote:
>
>> Hi Holden and Mridul,
>>
>> Just to be clear: what API parity are you expecting here? We have parity
>> for everything that is exposed in org.apache.spark.sql. Connect does not
>> support RDDs, SparkContext, etc., and there are currently no plans to
>> support them. We are considering adding a compatibility layer, but that
>> will be limited in scope. From running Connect in production for the last
>> year, we have seen that most users can migrate their workloads without any
>> problems.
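
A minimal PySpark sketch of the parity boundary described above: the
DataFrame calls are standard SQL-level API that behaves the same in both
modes, while the commented-out lines show the kind of driver-side
RDD/SparkContext access that Connect does not expose.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # Pure DataFrame/SQL code: identical under classic and Connect mode.
    df = spark.range(10).selectExpr("id", "id * 2 AS doubled")
    df.show()

    # Driver-side RDD / SparkContext access: not available on a Connect
    # session; this is the kind of code that would need the classic fallback.
    # rdd = df.rdd
    # sc = spark.sparkContext
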
>>
>> I do want to call out that this proposal is mostly aimed at how new users
>> will interact with Spark. Existing users, when they migrate their
>> application to Spark 4, only have to set a conf if it turns out their
>> application does not work. This should be a minor inconvenience compared
>> to the headaches that a new Scala version or other library upgrades can
>> cause.
>>
>> Since this is a breaking change, I do think this should be done in a
>> major version.
>>
>> At the risk of repeating the SPIP: using Connect as the default brings
>> a lot to the table (e.g. simplicity, easier upgrades, extensibility,
>> etc.). I'd urge you to factor this into your decision making as well.
>>
>> Happy Thanksgiving!
>>
>> Cheers,
>> Herman
>>
>> On Thu, Nov 28, 2024 at 8:43 PM Mridul Muralidharan <mri...@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>>   I agree with Holden; I am leaning -1 on the proposal as well.
>>> Unlike the removal of deprecated features, which we align on a major
>>> version boundary, changing the default is something we can do in a minor
>>> version as well - once there is API parity.
>>>
>>> Irrespective of the major/minor version in which we make the switch,
>>> there could be user impact; minimizing this impact would be greatly
>>> appreciated by our users.
>>>
>>> Regards,
>>> Mridul
>>>
>>>
>>>
>>> On Wed, Nov 27, 2024 at 8:31 PM Holden Karau <holden.ka...@gmail.com>
>>> wrote:
>>>
>>>> -0.5: I don’t think this is a good idea for JVM apps until we have API
>>>> parity. (Binding, but to be clear, not a veto.)
>>>>
>>>> Twitter: https://twitter.com/holdenkarau
>>>> Fight Health Insurance: https://www.fighthealthinsurance.com/
>>>> Books (Learning Spark, High Performance Spark, etc.):
>>>> https://amzn.to/2MaRAG9
>>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>> Pronouns: she/her
>>>>
>>>>
>>>> On Wed, Nov 27, 2024 at 6:27 PM Xinrong Meng <xinr...@apache.org>
>>>> wrote:
>>>>
>>>>> +1
>>>>>
>>>>> Thank you Herman!
>>>>>
>>>>> On Thu, Nov 28, 2024 at 3:37 AM Dongjoon Hyun <dongjoon.h...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> +1
>>>>>>
>>>>>> On Wed, Nov 27, 2024 at 09:16 Denny Lee <denny.g....@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> +1 (non-binding)
>>>>>>>
>>>>>>> On Wed, Nov 27, 2024 at 3:07 AM Martin Grund
>>>>>>> <mar...@databricks.com.invalid> wrote:
>>>>>>>
>>>>>>>> As part of the discussion on this topic, I would love to highlight
>>>>>>>> the work that the community is currently doing to support SparkML,
>>>>>>>> which is traditionally very RDD-heavy, natively in Spark Connect.
>>>>>>>> Bobby's awesome work shows that, over time, we can extend the
>>>>>>>> features of Spark Connect and support workloads that we previously
>>>>>>>> thought could not be supported easily.
>>>>>>>>
>>>>>>>> https://github.com/apache/spark/pull/48791
>>>>>>>>
>>>>>>>> Martin
>>>>>>>>
>>>>>>>> On Wed, Nov 27, 2024 at 11:42 AM Yang,Jie(INF)
>>>>>>>> <yangji...@baidu.com.invalid> wrote:
>>>>>>>>
>>>>>>>>> +1
>>>>>>>>> -------- Original Message --------
>>>>>>>>> From: Hyukjin Kwon<gurwls...@apache.org>
>>>>>>>>> Date: 2024-11-27 08:04:06
>>>>>>>>> Subject: [External Mail] Re: Spark Connect the default API in Spark 4.0
>>>>>>>>> To: Bjørn Jørgensen<bjornjorgen...@gmail.com>;
>>>>>>>>> Cc: Herman van Hovell<her...@databricks.com.invalid>; Spark dev
>>>>>>>>> list<dev@spark.apache.org>;
>>>>>>>>> +1
>>>>>>>>>
>>>>>>>>> On Mon, 25 Nov 2024 at 23:33, Bjørn Jørgensen <
>>>>>>>>> bjornjorgen...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> +1
>>>>>>>>>>
>>>>>>>>>> On Mon, 25 Nov 2024 at 14:48, Herman van Hovell
>>>>>>>>>> <her...@databricks.com.invalid> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi All,
>>>>>>>>>>>
>>>>>>>>>>> I would like to start a discussion on "Spark Connect the default
>>>>>>>>>>> API in Spark 4.0".
>>>>>>>>>>>
>>>>>>>>>>> The rationale for this change is that Spark Connect brings a lot
>>>>>>>>>>> of improvements with respect to simplicity, stability, isolation,
>>>>>>>>>>> upgradability, and extensibility (all detailed in the SPIP). In a
>>>>>>>>>>> nutshell: we want to introduce a flag, spark.api.mode, that allows
>>>>>>>>>>> a user to choose between classic and connect mode, with connect as
>>>>>>>>>>> the default. A user can easily fall back to classic by setting
>>>>>>>>>>> spark.api.mode to classic.
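
A minimal sketch of the proposed fallback, assuming the flag can be set on
the session builder (it could equally be passed at submit time); the flag
name and its classic/connect values come from the summary above, while the
builder placement is an assumption for illustration.

    from pyspark.sql import SparkSession

    # Opt back into classic mode via the proposed flag; under the proposal
    # the default would be "connect".
    spark = (
        SparkSession.builder
        .config("spark.api.mode", "classic")
        .getOrCreate()
    )
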
>>>>>>>>>>>
>>>>>>>>>>> SPIP:
>>>>>>>>>>> https://docs.google.com/document/d/1C0kuQEliG78HujVwdnSk0wjNwHEDdwo2o8aVq7kbhTo/edit?tab=t.0#heading=h.r2c3xrbiklu3
>>>>>>>>>>> JIRA: https://issues.apache.org/jira/browse/SPARK-50411
>>>>>>>>>>>
>>>>>>>>>>> I am looking forward to your feedback!
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>> Herman
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Bjørn Jørgensen
>>>>>>>>>> Vestre Aspehaug 4, 6010 Ålesund
>>>>>>>>>> Norge
>>>>>>>>>>
>>>>>>>>>> +47 480 94 297
>>>>>>>>>>
>>>>>>>>>
