I would switch to +0 if the connect default applied only to apps without any user-provided jars / non-JVM apps.
Twitter: https://twitter.com/holdenkarau
Fight Health Insurance: https://www.fighthealthinsurance.com/
Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9
YouTube Live Streams: https://www.youtube.com/user/holdenkarau
Pronouns: she/her


On Thu, Nov 28, 2024 at 6:11 PM Holden Karau <holden.ka...@gmail.com> wrote:

> Given there is no plan to support RDDs I’ll update to -0.9
>
> Twitter: https://twitter.com/holdenkarau
> Fight Health Insurance: https://www.fighthealthinsurance.com/
> Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9
> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
> Pronouns: she/her
>
>
> On Thu, Nov 28, 2024 at 6:00 PM Herman van Hovell <her...@databricks.com> wrote:
>
>> Hi Holden and Mridul,
>>
>> Just to be clear, what API parity are you expecting here? We have parity
>> for everything that is exposed in org.apache.spark.sql. Connect does not
>> support RDDs, SparkContext, etc., and there are currently no plans to
>> support them. We are considering adding a compatibility layer, but that
>> will be limited in scope. From running Connect in production for the last
>> year, we see that most users can migrate their workloads without any
>> problems.
>>
>> I do want to call out that this proposal is mostly aimed at how new users
>> will interact with Spark. Existing users who migrate their application to
>> Spark 4 have to set a conf only if it turns out their application is not
>> working. This should be a minor inconvenience compared to the headaches
>> that a new Scala version or other library upgrades can cause.
>>
>> Since this is a breaking change, I do think this should be done in a
>> major version.
>>
>> At the risk of repeating the SPIP: using Connect as the default brings a
>> lot to the table (e.g. simplicity, easier upgrades, extensibility, etc.),
>> so I'd urge you to also factor this into your decision making.
>>
>> Happy Thanksgiving!
>>
>> Cheers,
>> Herman
>>
>> On Thu, Nov 28, 2024 at 8:43 PM Mridul Muralidharan <mri...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I agree with Holden, I am leaning -1 on the proposal as well.
>>> Unlike removal of deprecated features, which we align on a major version
>>> boundary, changing the default is something we can do in a minor version
>>> as well - once there is API parity.
>>>
>>> Irrespective of which major/minor version we make the switch in, there
>>> could be user impact; minimizing this impact would be greatly appreciated
>>> by our users.
>>>
>>> Regards,
>>> Mridul
>>>
>>>
>>> On Wed, Nov 27, 2024 at 8:31 PM Holden Karau <holden.ka...@gmail.com> wrote:
>>>
>>>> -0.5: I don’t think this is a good idea for JVM apps until we have API
>>>> parity. (Binding, but to be clear not a veto.)
>>>>
>>>> Twitter: https://twitter.com/holdenkarau
>>>> Fight Health Insurance: https://www.fighthealthinsurance.com/
>>>> Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9
>>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>> Pronouns: she/her
>>>>
>>>>
>>>> On Wed, Nov 27, 2024 at 6:27 PM Xinrong Meng <xinr...@apache.org> wrote:
>>>>
>>>>> +1
>>>>>
>>>>> Thank you Herman!
>>>>>
>>>>> On Thu, Nov 28, 2024 at 3:37 AM Dongjoon Hyun <dongjoon.h...@gmail.com> wrote:
>>>>>
>>>>>> +1
>>>>>>
>>>>>> On Wed, Nov 27, 2024 at 09:16 Denny Lee <denny.g....@gmail.com> wrote:
>>>>>>
>>>>>>> +1 (non-binding)
>>>>>>>
>>>>>>> On Wed, Nov 27, 2024 at 3:07 AM Martin Grund
>>>>>>> <mar...@databricks.com.invalid> wrote:
>>>>>>>
>>>>>>>> As part of the discussion on this topic, I would love to highlight
>>>>>>>> the work that the community is currently doing to support SparkML,
>>>>>>>> which is traditionally very RDD-heavy, natively in Spark Connect.
>>>>>>>> Bobby's awesome work shows that, over time, we can extend the
>>>>>>>> features of Spark Connect and support workloads that we previously
>>>>>>>> thought could not be supported easily.
>>>>>>>>
>>>>>>>> https://github.com/apache/spark/pull/48791
>>>>>>>>
>>>>>>>> Martin
>>>>>>>>
>>>>>>>> On Wed, Nov 27, 2024 at 11:42 AM Yang,Jie(INF)
>>>>>>>> <yangji...@baidu.com.invalid> wrote:
>>>>>>>>
>>>>>>>>> +1
>>>>>>>>>
>>>>>>>>> -------- Original Message --------
>>>>>>>>> From: Hyukjin Kwon <gurwls...@apache.org>
>>>>>>>>> Date: 2024-11-27 08:04:06
>>>>>>>>> Subject: [External Mail] Re: Spark Connect the default API in Spark 4.0
>>>>>>>>> To: Bjørn Jørgensen <bjornjorgen...@gmail.com>
>>>>>>>>> Cc: Herman van Hovell <her...@databricks.com.invalid>; Spark dev list <dev@spark.apache.org>
>>>>>>>>>
>>>>>>>>> +1
>>>>>>>>>
>>>>>>>>> On Mon, 25 Nov 2024 at 23:33, Bjørn Jørgensen
>>>>>>>>> <bjornjorgen...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> +1
>>>>>>>>>>
>>>>>>>>>> On Mon, 25 Nov 2024 at 14:48, Herman van Hovell
>>>>>>>>>> <her...@databricks.com.invalid> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi All,
>>>>>>>>>>>
>>>>>>>>>>> I would like to start a discussion on "Spark Connect the default
>>>>>>>>>>> API in Spark 4.0".
>>>>>>>>>>>
>>>>>>>>>>> The rationale for this change is that Spark Connect brings a lot
>>>>>>>>>>> of improvements with respect to simplicity, stability, isolation,
>>>>>>>>>>> upgradability, and extensibility (all detailed in the SPIP). In a
>>>>>>>>>>> nutshell: we want to introduce a flag, spark.api.mode, that allows
>>>>>>>>>>> a user to choose between classic or connect mode, the default
>>>>>>>>>>> being connect. A user can easily fall back to classic by setting
>>>>>>>>>>> spark.api.mode to classic.
>>>>>>>>>>>
>>>>>>>>>>> SPIP: https://docs.google.com/document/d/1C0kuQEliG78HujVwdnSk0wjNwHEDdwo2o8aVq7kbhTo/edit?tab=t.0#heading=h.r2c3xrbiklu3
>>>>>>>>>>> JIRA: https://issues.apache.org/jira/browse/SPARK-50411
>>>>>>>>>>>
>>>>>>>>>>> I am looking forward to your feedback!
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>> Herman
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Bjørn Jørgensen
>>>>>>>>>> Vestre Aspehaug 4, 6010 Ålesund
>>>>>>>>>> Norge
>>>>>>>>>>
>>>>>>>>>> +47 480 94 297
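
For readers skimming the thread, a minimal sketch of the fallback Herman describes above, assuming spark.api.mode is accepted like any other Spark conf at session startup; the flag name comes from the SPIP, but the exact plumbing (for example, whether it must instead be passed at submit time) is defined there, not here:

    // Hypothetical sketch: force the classic (non-Connect) API by setting
    // spark.api.mode, assuming the flag is read like any other Spark conf
    // when the session is created. Under the proposal, "connect" would be
    // the default and "classic" the opt-out.
    import org.apache.spark.sql.SparkSession

    object ClassicModeFallback {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("classic-mode-fallback")
          .config("spark.api.mode", "classic")
          .getOrCreate()

        spark.range(10).show() // DataFrame code is unchanged in either mode
        spark.stop()
      }
    }

The same conf could presumably also be passed at launch time, e.g. spark-submit --conf spark.api.mode=classic, which is likely the less invasive option for existing applications that turn out not to work under Connect.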