Given that there is no plan to support RDDs, I’ll update my vote to -0.9.

Twitter: https://twitter.com/holdenkarau
Fight Health Insurance: https://www.fighthealthinsurance.com/
<https://www.fighthealthinsurance.com/?q=hk_email>
Books (Learning Spark, High Performance Spark, etc.):
https://amzn.to/2MaRAG9
YouTube Live Streams: https://www.youtube.com/user/holdenkarau
Pronouns: she/her


On Thu, Nov 28, 2024 at 6:00 PM Herman van Hovell <her...@databricks.com>
wrote:

> Hi Holden and Mridul,
>
> Just to be clear: what API parity are you expecting here? We have parity
> for everything that is exposed in org.apache.spark.sql. Connect does not
> support RDDs, SparkContext, etc., and there are currently no plans to
> support them. We are considering adding a compatibility layer, but that
> will be limited in scope. From running Connect in production for the last
> year, we see that most users can migrate their workloads without any
> problems.
>
> I do want to call out that this proposal is mostly aimed at how new users
> will interact with Spark. Existing users who migrate their application to
> Spark 4 only have to set a conf if it turns out their application does not
> work. This should be a minor inconvenience compared to the headaches that
> a new Scala version or other library upgrades can cause.
>
> Since this is a breaking change, I do think this should be done in a major
> version.
>
> At the risk of repeating the SPIP, using Connect as the default brings a
> lot to the table (e.g. simplicity, easier upgrades, extensibility, etc.).
> I'd urge you to also factor this into your decision making.
>
> Happy Thanksgiving!
>
> Cheers,
> Herman
>
> On Thu, Nov 28, 2024 at 8:43 PM Mridul Muralidharan <mri...@gmail.com>
> wrote:
>
>> Hi,
>>
>>   I agree with Holden; I am leaning -1 on the proposal as well.
>> Unlike removal of deprecated features, which we align on a major version
>> boundary, changing the default is something we can do in a minor version as
>> well, once there is API parity.
>>
>> Irrespective of which major/minor version we make the switch in, there
>> could be user impact; minimizing this impact would be greatly appreciated
>> by our users.
>>
>> Regards,
>> Mridul
>>
>>
>>
>> On Wed, Nov 27, 2024 at 8:31 PM Holden Karau <holden.ka...@gmail.com>
>> wrote:
>>
>>> -0.5: I don’t think this is a good idea for JVM apps until we have API
>>> parity. (Binding, but to be clear, not a veto.)
>>>
>>>
>>>
>>> On Wed, Nov 27, 2024 at 6:27 PM Xinrong Meng <xinr...@apache.org> wrote:
>>>
>>>> +1
>>>>
>>>> Thank you Herman!
>>>>
>>>> On Thu, Nov 28, 2024 at 3:37 AM Dongjoon Hyun <dongjoon.h...@gmail.com>
>>>> wrote:
>>>>
>>>>> +1
>>>>>
>>>>> On Wed, Nov 27, 2024 at 09:16 Denny Lee <denny.g....@gmail.com> wrote:
>>>>>
>>>>>> +1 (non-binding)
>>>>>>
>>>>>> On Wed, Nov 27, 2024 at 3:07 AM Martin Grund
>>>>>> <mar...@databricks.com.invalid> wrote:
>>>>>>
>>>>>>> As part of the discussion on this topic, I would love to highlight
>>>>>>> the work that the community is currently doing to support SparkML,
>>>>>>> which is traditionally very RDD-heavy, natively in Spark Connect.
>>>>>>> Bobby's awesome work shows that, over time, we can extend the
>>>>>>> features of Spark Connect and support workloads that we previously
>>>>>>> thought could not be supported easily.
>>>>>>>
>>>>>>> https://github.com/apache/spark/pull/48791
>>>>>>>
>>>>>>> Martin
>>>>>>>
>>>>>>> On Wed, Nov 27, 2024 at 11:42 AM Yang,Jie(INF)
>>>>>>> <yangji...@baidu.com.invalid> wrote:
>>>>>>>
>>>>>>>> +1
>>>>>>>> -------- Original Message --------
>>>>>>>> From: Hyukjin Kwon<gurwls...@apache.org>
>>>>>>>> Date: 2024-11-27 08:04:06
>>>>>>>> Subject: [External Email] Re: Spark Connect the default API in Spark 4.0
>>>>>>>> To: Bjørn Jørgensen<bjornjorgen...@gmail.com>;
>>>>>>>> Cc: Herman van Hovell<her...@databricks.com.invalid>; Spark dev
>>>>>>>> list<dev@spark.apache.org>;
>>>>>>>> +1
>>>>>>>>
>>>>>>>> On Mon, 25 Nov 2024 at 23:33, Bjørn Jørgensen <
>>>>>>>> bjornjorgen...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> +1
>>>>>>>>>
>>>>>>>>> On Mon, 25 Nov 2024 at 14:48, Herman van Hovell
>>>>>>>>> <her...@databricks.com.invalid> wrote:
>>>>>>>>>
>>>>>>>>>> Hi All,
>>>>>>>>>>
>>>>>>>>>> I would like to start a discussion on "Spark Connect the default
>>>>>>>>>> API in Spark 4.0".
>>>>>>>>>>
>>>>>>>>>> The rationale for this change is that Spark Connect brings a lot
>>>>>>>>>> of improvements with respect to simplicity, stability, isolation,
>>>>>>>>>> upgradability, and extensibility (all detailed in the SPIP). In a
>>>>>>>>>> nutshell: we want to introduce a flag, spark.api.mode, that allows
>>>>>>>>>> a user to choose between classic and connect mode, the default
>>>>>>>>>> being connect. A user can easily fall back to classic by setting
>>>>>>>>>> spark.api.mode to classic.
>>>>>>>>>>
>>>>>>>>>> SPIP:
>>>>>>>>>> https://docs.google.com/document/d/1C0kuQEliG78HujVwdnSk0wjNwHEDdwo2o8aVq7kbhTo/edit?tab=t.0#heading=h.r2c3xrbiklu3
>>>>>>>>>> JIRA: https://issues.apache.org/jira/browse/SPARK-50411
>>>>>>>>>>
>>>>>>>>>> I am looking forward to your feedback!
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> Herman
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Bjørn Jørgensen
>>>>>>>>> Vestre Aspehaug 4, 6010 Ålesund
>>>>>>>>> Norge
>>>>>>>>>
>>>>>>>>> +47 480 94 297
>>>>>>>>>
>>>>>>>>
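
For reference, the fallback described in the original proposal (setting
spark.api.mode to classic) could be expressed at submit time roughly as
follows. This is a minimal sketch, not something taken from the SPIP: the
helper names api_mode_conf and spark_submit_args and the application file
my_app.py are hypothetical, and it assumes only that spark.api.mode accepts
the two values named in the thread and that it can be passed through
spark-submit's generic --conf key=value option.

```python
import shlex

# The two modes named in the proposal; anything else is rejected here.
VALID_API_MODES = ("classic", "connect")


def api_mode_conf(mode):
    """Return a conf dict selecting the given API mode (hypothetical helper)."""
    normalized = mode.strip().lower()
    if normalized not in VALID_API_MODES:
        raise ValueError(
            f"spark.api.mode must be one of {VALID_API_MODES}, got {mode!r}"
        )
    return {"spark.api.mode": normalized}


def spark_submit_args(app, conf):
    """Build a spark-submit argument list with one --conf per setting."""
    args = ["spark-submit"]
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    args.append(app)
    return args


if __name__ == "__main__":
    # my_app.py is a placeholder application name.
    cmd = spark_submit_args("my_app.py", api_mode_conf("classic"))
    print(shlex.join(cmd))
    # prints: spark-submit --conf spark.api.mode=classic my_app.py
```

Validating the value before building the command keeps a typo in the mode
name from being passed silently to the cluster.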
