+1

On Wed, Apr 9, 2025 at 9:37 AM Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
> +1
>
> Dr Mich Talebzadeh,
> Architect | Data Science | Financial Crime | Forensic Analysis | GDPR
>
> view my LinkedIn profile
> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>
> On Wed, 9 Apr 2025 at 08:07, Peter Toth <peter.t...@gmail.com> wrote:
>
>> +1
>>
>> On Wed, Apr 9, 2025 at 8:51 AM Cheng Pan <pan3...@gmail.com> wrote:
>>
>>> +1 (non-binding)
>>>
>>> Glad to see Spark SQL extended to streaming use cases.
>>>
>>> Thanks,
>>> Cheng Pan
>>>
>>> On Apr 9, 2025, at 14:43, Anton Okolnychyi <aokolnyc...@gmail.com> wrote:
>>>
>>> +1
>>>
>>> On Tue, Apr 8, 2025 at 23:36, Jacky Lee <qcsd2...@gmail.com> wrote:
>>>
>>>> +1 I'm delighted that it will be open-sourced, enabling greater
>>>> integration with Iceberg/Delta to unlock more value.
>>>>
>>>> On Wed, Apr 9, 2025 at 10:47, Jungtaek Lim <kabhwan.opensou...@gmail.com> wrote:
>>>> >
>>>> > +1 looking forward to seeing this make progress!
>>>> >
>>>> > On Wed, Apr 9, 2025 at 11:32 AM Yang Jie <yangji...@apache.org> wrote:
>>>> >>
>>>> >> +1
>>>> >>
>>>> >> On 2025/04/09 01:07:57 Hyukjin Kwon wrote:
>>>> >> > +1
>>>> >> >
>>>> >> > I am actually pretty excited to have this. Happy to see this being proposed.
>>>> >> >
>>>> >> > On Wed, 9 Apr 2025 at 01:55, Chao Sun <sunc...@apache.org> wrote:
>>>> >> >
>>>> >> > > +1. Super excited about this effort!
>>>> >> > >
>>>> >> > > On Tue, Apr 8, 2025 at 9:47 AM huaxin gao <huaxin.ga...@gmail.com> wrote:
>>>> >> > >
>>>> >> > >> +1 I support this SPIP because it simplifies data pipeline management and
>>>> >> > >> enhances error detection.
>>>> >> > >>
>>>> >> > >> On Tue, Apr 8, 2025 at 9:33 AM Dilip Biswal <dkbis...@gmail.com> wrote:
>>>> >> > >>
>>>> >> > >>> Excited to see this heading toward open source; materialized views and
>>>> >> > >>> other features will bring a lot of value.
>>>> >> > >>> +1 (non-binding)
>>>> >> > >>>
>>>> >> > >>> On Mon, Apr 7, 2025 at 10:37 AM Sandy Ryza <sa...@apache.org> wrote:
>>>> >> > >>>
>>>> >> > >>>> Hi Khalid – the CLI in the current proposal will need to be built on
>>>> >> > >>>> top of internal APIs for constructing and launching pipeline executions.
>>>> >> > >>>> We'll have the option to expose these in the future.
>>>> >> > >>>>
>>>> >> > >>>> It would be worthwhile to understand the use cases in more depth before
>>>> >> > >>>> exposing these, because APIs are one-way doors and can be costly to
>>>> >> > >>>> maintain.
>>>> >> > >>>>
>>>> >> > >>>> On Sat, Apr 5, 2025 at 11:59 PM Khalid Mammadov <khalidmammad...@gmail.com> wrote:
>>>> >> > >>>>
>>>> >> > >>>>> Looks great!
>>>> >> > >>>>> QQ: will users be able to run this pipeline from normal code? I.e., can I
>>>> >> > >>>>> trigger a pipeline from *driver* code based on some condition, or
>>>> >> > >>>>> must it be executed via a separate shell command?
>>>> >> > >>>>> As background, Databricks imposes a similar limitation: you cannot
>>>> >> > >>>>> run normal Spark code and DLT on the same cluster for some reason,
>>>> >> > >>>>> which forces you to use two clusters, increasing cost and latency.
>>>> >> > >>>>>
>>>> >> > >>>>> On Sat, 5 Apr 2025 at 23:03, Sandy Ryza <sa...@apache.org> wrote:
>>>> >> > >>>>>
>>>> >> > >>>>>> Hi all – starting a discussion thread for a SPIP that I've been
>>>> >> > >>>>>> working on with Chao Sun, Kent Yao, Yuming Wang, and Jie Yang: [JIRA
>>>> >> > >>>>>> <https://issues.apache.org/jira/browse/SPARK-51727>] [Doc
>>>> >> > >>>>>> <https://docs.google.com/document/d/1PsSTngFuRVEOvUGzp_25CQL1yfzFHFr02XdMfQ7jOM4/edit?tab=t.0>].
>>>> >> > >>>>>>
>>>> >> > >>>>>> The SPIP proposes extending Spark's lazy, declarative execution model
>>>> >> > >>>>>> beyond single queries, to pipelines that keep multiple datasets up to date.
>>>> >> > >>>>>> It introduces the ability to compose multiple transformations into a single
>>>> >> > >>>>>> declarative dataflow graph.
>>>> >> > >>>>>>
>>>> >> > >>>>>> Declarative pipelines aim to simplify the development and management
>>>> >> > >>>>>> of data pipelines, by removing the need for manual orchestration of
>>>> >> > >>>>>> dependencies and making it possible to catch many errors before any
>>>> >> > >>>>>> execution steps are launched.
>>>> >> > >>>>>>
>>>> >> > >>>>>> Declarative pipelines can include both batch and streaming
>>>> >> > >>>>>> computations, leveraging Structured Streaming for stream processing and new
>>>> >> > >>>>>> materialized view syntax for batch processing. Tight integration with Spark
>>>> >> > >>>>>> SQL's analyzer enables deeper analysis and earlier error detection than is
>>>> >> > >>>>>> achievable with more generic frameworks.
>>>> >> > >>>>>>
>>>> >> > >>>>>> Let us know what you think!
>>>> >>
>>>> >> ---------------------------------------------------------------------
>>>> >> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org