Sure, let's leave it as it is. No big deal.

Dr Mich Talebzadeh,
Architect | Data Science | Financial Crime | Forensic Analysis | GDPR
view my LinkedIn profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>

On Tue, 4 Mar 2025 at 23:29, Jungtaek Lim <kabhwan.opensou...@gmail.com> wrote:

> Thanks for catching this (unfortunately this is already checked in, so
> changing the PR description would not change the commit message).
>
> But I'd say, let's focus on the discussion and let's not try to nitpick.
> If we are really concerned, I can modify the commit message when we decide
> to push this to Spark 4.0.0, to at least correct it in Spark 4.0.0, but
> this sounds to me like a whole different topic.
>
> On Wed, Mar 5, 2025 at 7:02 AM Mich Talebzadeh <mich.talebza...@gmail.com>
> wrote:
>
>> Thanks,
>>
>> I read the PySpark pull request. I suggest this:
>>
>> Why are the changes needed?
>>
>> As Spark Connect is becoming the default *API* in Spark 4.0, we need to
>> add Connect support for TWS in Python.
>>
>> Why:
>>
>> Saying "As Spark Connect is becoming *the default API* in Spark
>> 4.0" reflects more accurately that Spark Connect is an interface for
>> interacting with Spark, not a replacement for the entire system.
>>
>> HTH
>>
>> On Tue, 4 Mar 2025 at 20:35, Jungtaek Lim <kabhwan.opensou...@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> Here are the PRs we are seeking consensus on to get into 4.0.
>>>
>>> PySpark: https://github.com/apache/spark/pull/49560
>>> Scala: https://github.com/apache/spark/pull/49488
>>>
>>> Thanks,
>>> Jungtaek Lim (HeartSaVioR)
>>>
>>> On Tue, Mar 4, 2025 at 11:06 PM Mich Talebzadeh <
>>> mich.talebza...@gmail.com> wrote:
>>>
>>>> Thanks.
>>>>
>>>> Can you point to a link or any further documentation please?
>>>>
>>>> On Tue, 4 Mar 2025 at 13:22, Herman van Hovell
>>>> <her...@databricks.com.invalid> wrote:
>>>>
>>>>> +1
>>>>>
>>>>> On Tue, Mar 4, 2025 at 2:07 AM Anish Shrigondekar
>>>>> <anish.shrigonde...@databricks.com.invalid> wrote:
>>>>>
>>>>>> +1 - Would be great to get this into the Spark 4.0 release.
>>>>>>
>>>>>> Thanks,
>>>>>> Anish
>>>>>>
>>>>>> On Mon, Mar 3, 2025 at 9:35 PM Jungtaek Lim <
>>>>>> kabhwan.opensou...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi dev,
>>>>>>>
>>>>>>> We are going to introduce a new API named `transformWithState` for
>>>>>>> streaming queries, which allows users to perform more complex stateful
>>>>>>> operations in a user function, with much simpler code compared to
>>>>>>> `flatMapGroupsWithState` (and `applyInPandasWithState`).
>>>>>>>
>>>>>>> The target version has been Spark 4.0.0 and we track this project as
>>>>>>> a major one for Spark 4. We pushed most planned features into Spark
>>>>>>> 4.0.0, except Spark Connect support.
>>>>>>>
>>>>>>> The PRs for Spark Connect support are merged into the Spark 4.1 branch,
>>>>>>> but I'd like to hear opinions on whether we can introduce Spark Connect
>>>>>>> support in Spark 4.0.0.
>>>>>>>
>>>>>>> I understand this arrives a bit late, but since the API is backed by a
>>>>>>> huge effort and I foresee this new API replacing the usage of
>>>>>>> `flatMapGroupsWithState` and `applyInPandasWithState` soon, I'd like to
>>>>>>> make sure we don't make users wait another 6+ months to use this in
>>>>>>> Spark Connect.
>>>>>>>
>>>>>>> Would love to hear your thoughts.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Jungtaek Lim (HeartSaVioR)