Sure, let's leave it as it is. No big deal.

Dr Mich Talebzadeh,
Architect | Data Science | Financial Crime | Forensic Analysis | GDPR

   view my LinkedIn profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>

On Tue, 4 Mar 2025 at 23:29, Jungtaek Lim <kabhwan.opensou...@gmail.com>
wrote:

> Thanks for catching this (unfortunately this is already checked in, so
> changing the PR description won't be reflected in the commit).
>
> But I'd say let's focus on the discussion and not nitpick.
> If we are really concerned, I can modify the commit message when we decide
> to push this to Spark 4.0.0, to at least correct it there, but
> this sounds to me like a whole different topic.
>
> On Wed, Mar 5, 2025 at 7:02 AM Mich Talebzadeh <mich.talebza...@gmail.com>
> wrote:
>
>> Thanks,
>>
>> I read the PySpark pull request. I suggest this wording for the section
>> "Why are the changes needed?":
>>
>> As Spark Connect is becoming the default *API* in Spark 4.0, we need to
>> add Connect support for TWS in Python.
>>
>> Why:
>>
>> Saying "As Spark Connect is becoming the default *API* in Spark
>> 4.0" reflects more accurately that Spark Connect is an interface for
>> interacting with Spark, not a replacement for the entire system.
>>
>> HTH
>> ..
>>
>> On Tue, 4 Mar 2025 at 20:35, Jungtaek Lim <kabhwan.opensou...@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> Here are the PRs for which we are seeking consensus to get into 4.0.
>>>
>>> PySpark: https://github.com/apache/spark/pull/49560
>>> Scala: https://github.com/apache/spark/pull/49488
>>>
>>> Thanks,
>>> Jungtaek Lim (HeartSaVioR)
>>>
>>>
>>> On Tue, Mar 4, 2025 at 11:06 PM Mich Talebzadeh <
>>> mich.talebza...@gmail.com> wrote:
>>>
>>>> Thanks.
>>>>
>>>> Can you point to a link or any further documentation please?
>>>>
>>>> On Tue, 4 Mar 2025 at 13:22, Herman van Hovell
>>>> <her...@databricks.com.invalid> wrote:
>>>>
>>>>> +1
>>>>>
>>>>> On Tue, Mar 4, 2025 at 2:07 AM Anish Shrigondekar
>>>>> <anish.shrigonde...@databricks.com.invalid> wrote:
>>>>>
>>>>>> +1 - Would be great to get this into the Spark 4.0 release.
>>>>>>
>>>>>> Thanks,
>>>>>> Anish
>>>>>>
>>>>>> On Mon, Mar 3, 2025 at 9:35 PM Jungtaek Lim <
>>>>>> kabhwan.opensou...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi dev,
>>>>>>>
>>>>>>> We are going to introduce a new API named `transformWithState` for
>>>>>>> streaming queries, which allows users to perform more complex stateful
>>>>>>> operations in a user function, with much simpler code compared to
>>>>>>> `flatMapGroupsWithState` (and `applyInPandasWithState`).
>>>>>>>
>>>>>>> The target version has been Spark 4.0.0, and we track this project as
>>>>>>> a major one for Spark 4. We have pushed most planned features into
>>>>>>> Spark 4.0.0, except Spark Connect support.
>>>>>>>
>>>>>>> The PRs for Spark Connect support have been merged into the Spark 4.1
>>>>>>> branch, but I'd like to hear opinions on whether we can also bring
>>>>>>> Spark Connect support into Spark 4.0.0.
>>>>>>>
>>>>>>> I understand this arrives a bit late, but since the API is backed by
>>>>>>> a huge effort and I foresee it replacing the usage of
>>>>>>> flatMapGroupsWithState and applyInPandasWithState sooner rather than
>>>>>>> later, I'd like to make sure we don't make users wait another 6+
>>>>>>> months to use it in Spark Connect.
>>>>>>>
>>>>>>> Would love to hear your thoughts.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Jungtaek Lim (HeartSaVioR)
>>>>>>>
>>>>>>
