Given there is no plan to support RDDs, I’ll update my vote to -0.9.

Twitter: https://twitter.com/holdenkarau
Fight Health Insurance: https://www.fighthealthinsurance.com/
Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9
YouTube Live Streams: https://www.youtube.com/user/holdenkarau
Pronouns: she/her

On Thu, Nov 28, 2024 at 6:00 PM Herman van Hovell <her...@databricks.com> wrote:

> Hi Holden and Mridul,
>
> Just to be clear: what API parity are you expecting here? We have parity
> for everything that is exposed in org.apache.spark.sql. Connect does not
> support RDDs, SparkContext, etc. There are currently no plans to
> support these. We are considering adding a compatibility layer, but that
> will be limited in scope. From running Connect in production for the last
> year, we see that most users can migrate their workloads without any
> problems.
>
> I do want to call out that this proposal is mostly aimed at how new users
> will interact with Spark. Existing users, when they migrate their
> application to Spark 4, have to set a conf if it turns out their
> application is not working. This should be a minor inconvenience compared
> to the headaches that a new Scala version or other library upgrades can
> cause.
>
> Since this is a breaking change, I do think this should be done in a major
> version.
>
> At the risk of repeating the SPIP, using Connect as the default brings a
> lot to the table (e.g. simplicity, easier upgrades, extensibility, etc.).
> I'd urge you to also factor this into your decision making.
>
> Happy Thanksgiving!
>
> Cheers,
> Herman
>
> On Thu, Nov 28, 2024 at 8:43 PM Mridul Muralidharan <mri...@gmail.com>
> wrote:
>
>> Hi,
>>
>> I agree with Holden; I am leaning -1 on the proposal as well.
>> Unlike removal of deprecated features, which we align on a major version
>> boundary, changing the default is something we can do in a minor version
>> as well, once there is API parity.
>>
>> Irrespective of which major/minor version we make the switch in, there
>> could be user impact; minimizing this impact would be greatly appreciated
>> by our users.
>>
>> Regards,
>> Mridul
>>
>>
>>
>> On Wed, Nov 27, 2024 at 8:31 PM Holden Karau <holden.ka...@gmail.com>
>> wrote:
>>
>>> -0.5: I don’t think this is a good idea for JVM apps until we have API
>>> parity. (Binding, but to be clear, not a veto.)
>>>
>>> On Wed, Nov 27, 2024 at 6:27 PM Xinrong Meng <xinr...@apache.org> wrote:
>>>
>>>> +1
>>>>
>>>> Thank you Herman!
>>>>
>>>> On Thu, Nov 28, 2024 at 3:37 AM Dongjoon Hyun <dongjoon.h...@gmail.com>
>>>> wrote:
>>>>
>>>>> +1
>>>>>
>>>>> On Wed, Nov 27, 2024 at 09:16 Denny Lee <denny.g....@gmail.com> wrote:
>>>>>
>>>>>> +1 (non-binding)
>>>>>>
>>>>>> On Wed, Nov 27, 2024 at 3:07 AM Martin Grund
>>>>>> <mar...@databricks.com.invalid> wrote:
>>>>>>
>>>>>>> As part of the discussion on this topic, I would love to highlight
>>>>>>> the work that the community is currently doing to support SparkML,
>>>>>>> which is traditionally very RDD-heavy, natively in Spark Connect.
>>>>>>> Bobby's awesome work shows that, over time, we can extend the
>>>>>>> features of Spark Connect and support workloads that we previously
>>>>>>> thought could not be supported easily.
>>>>>>>
>>>>>>> https://github.com/apache/spark/pull/48791
>>>>>>>
>>>>>>> Martin
>>>>>>>
>>>>>>> On Wed, Nov 27, 2024 at 11:42 AM Yang,Jie(INF)
>>>>>>> <yangji...@baidu.com.invalid> wrote:
>>>>>>>
>>>>>>>> +1
>>>>>>>> -------- Original message --------
>>>>>>>> From: Hyukjin Kwon <gurwls...@apache.org>
>>>>>>>> Time: 2024-11-27 08:04:06
>>>>>>>> Subject: [External email] Re: Spark Connect the default API in Spark 4.0
>>>>>>>> To: Bjørn Jørgensen <bjornjorgen...@gmail.com>;
>>>>>>>> Cc: Herman van Hovell <her...@databricks.com.invalid>; Spark dev
>>>>>>>> list <dev@spark.apache.org>;
>>>>>>>> +1
>>>>>>>>
>>>>>>>> On Mon, 25 Nov 2024 at 23:33, Bjørn Jørgensen <
>>>>>>>> bjornjorgen...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> +1
>>>>>>>>>
>>>>>>>>> On Mon, 25 Nov 2024 at 14:48, Herman van Hovell
>>>>>>>>> <her...@databricks.com.invalid> wrote:
>>>>>>>>>
>>>>>>>>>> Hi All,
>>>>>>>>>>
>>>>>>>>>> I would like to start a discussion on "Spark Connect the default
>>>>>>>>>> API in Spark 4.0".
>>>>>>>>>>
>>>>>>>>>> The rationale for this change is that Spark Connect brings a lot
>>>>>>>>>> of improvements with respect to simplicity, stability, isolation,
>>>>>>>>>> upgradability, and extensibility (all detailed in the SPIP). In a
>>>>>>>>>> nutshell: we want to introduce a flag, spark.api.mode, that allows
>>>>>>>>>> a user to choose between classic or connect mode, the default
>>>>>>>>>> being connect. A user can easily fall back to classic by setting
>>>>>>>>>> spark.api.mode to classic.
>>>>>>>>>>
>>>>>>>>>> SPIP:
>>>>>>>>>> https://docs.google.com/document/d/1C0kuQEliG78HujVwdnSk0wjNwHEDdwo2o8aVq7kbhTo/edit?tab=t.0#heading=h.r2c3xrbiklu3
>>>>>>>>>> JIRA: https://issues.apache.org/jira/browse/SPARK-50411
>>>>>>>>>>
>>>>>>>>>> I am looking forward to your feedback!
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> Herman
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Bjørn Jørgensen
>>>>>>>>> Vestre Aspehaug 4, 6010 Ålesund
>>>>>>>>> Norge
>>>>>>>>>
>>>>>>>>> +47 480 94 297
>>>>>>>>>
>>>>>>>>
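
For anyone following the thread, the fallback described in the original proposal (setting spark.api.mode to classic) can be sketched roughly as below. This is only an illustration under stated assumptions: the flag name and its two values come from the proposal text, the helper names api_mode_conf and spark_submit_args are hypothetical, my_app.py is a placeholder application, and passing the setting through spark-submit's --conf option is assumed to be how a user would supply it. The sketch only builds the command line; it does not start Spark.

```python
import shlex

API_MODE_KEY = "spark.api.mode"       # flag name as given in the proposal
VALID_MODES = ("classic", "connect")  # the two modes the proposal names


def api_mode_conf(mode):
    """Hypothetical helper: return a one-entry conf mapping for the mode."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode {mode!r}; expected one of {VALID_MODES}")
    return {API_MODE_KEY: mode}


def spark_submit_args(app, conf):
    """Hypothetical helper: build a spark-submit argument list.

    Each conf entry becomes a `--conf key=value` pair; the application
    path goes last.
    """
    args = ["spark-submit"]
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    args.append(app)
    return args


if __name__ == "__main__":
    # my_app.py is a placeholder, not a file referenced in the thread.
    cmd = spark_submit_args("my_app.py", api_mode_conf("classic"))
    print(shlex.join(cmd))
```

Under the proposal, an existing application that turns out not to work in connect mode would need only this one extra conf entry, which is the "minor inconvenience" Herman refers to above.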