+1 (non-binding) Glad to see Spark SQL extended to streaming use cases.
Thanks,
Cheng Pan

> On Apr 9, 2025, at 14:43, Anton Okolnychyi <aokolnyc...@gmail.com> wrote:
>
> +1
>
> On Tue, Apr 8, 2025 at 23:36, Jacky Lee <qcsd2...@gmail.com> wrote:
>> +1 I'm delighted that it will be open-sourced, enabling greater
>> integration with Iceberg/Delta to unlock more value.
>>
>> On Wed, Apr 9, 2025 at 10:47, Jungtaek Lim <kabhwan.opensou...@gmail.com> wrote:
>> >
>> > +1 looking forward to seeing this make progress!
>> >
>> > On Wed, Apr 9, 2025 at 11:32 AM Yang Jie <yangji...@apache.org> wrote:
>> >>
>> >> +1
>> >>
>> >> On 2025/04/09 01:07:57 Hyukjin Kwon wrote:
>> >> > +1
>> >> >
>> >> > I am actually pretty excited to have this. Happy to see this being
>> >> > proposed.
>> >> >
>> >> > On Wed, 9 Apr 2025 at 01:55, Chao Sun <sunc...@apache.org> wrote:
>> >> >
>> >> > > +1. Super excited about this effort!
>> >> > >
>> >> > > On Tue, Apr 8, 2025 at 9:47 AM huaxin gao <huaxin.ga...@gmail.com> wrote:
>> >> > >
>> >> > >> +1 I support this SPIP because it simplifies data pipeline
>> >> > >> management and enhances error detection.
>> >> > >>
>> >> > >> On Tue, Apr 8, 2025 at 9:33 AM Dilip Biswal <dkbis...@gmail.com> wrote:
>> >> > >>
>> >> > >>> Excited to see this heading toward open source - materialized views
>> >> > >>> and other features will bring a lot of value.
>> >> > >>> +1 (non-binding)
>> >> > >>>
>> >> > >>> On Mon, Apr 7, 2025 at 10:37 AM Sandy Ryza <sa...@apache.org> wrote:
>> >> > >>>
>> >> > >>>> Hi Khalid – the CLI in the current proposal will need to be built on
>> >> > >>>> top of internal APIs for constructing and launching pipeline executions.
>> >> > >>>> We'll have the option to expose these in the future.
>> >> > >>>>
>> >> > >>>> It would be worthwhile to understand the use cases in more depth before
>> >> > >>>> exposing these, because APIs are one-way doors and can be costly to
>> >> > >>>> maintain.
>> >> > >>>>
>> >> > >>>> On Sat, Apr 5, 2025 at 11:59 PM Khalid Mammadov <khalidmammad...@gmail.com> wrote:
>> >> > >>>>
>> >> > >>>>> Looks great!
>> >> > >>>>> QQ: will users be able to run this pipeline from normal code? I.e., can I
>> >> > >>>>> trigger a pipeline from *driver* code based on some condition, etc., or
>> >> > >>>>> must it be executed via a separate shell command?
>> >> > >>>>> As background, Databricks imposes a similar limitation whereby you
>> >> > >>>>> cannot run normal Spark code and DLT on the same cluster for some reason,
>> >> > >>>>> which forces you to use two clusters, increasing cost and latency.
>> >> > >>>>>
>> >> > >>>>> On Sat, 5 Apr 2025 at 23:03, Sandy Ryza <sa...@apache.org> wrote:
>> >> > >>>>>
>> >> > >>>>>> Hi all – starting a discussion thread for a SPIP that I've been
>> >> > >>>>>> working on with Chao Sun, Kent Yao, Yuming Wang, and Jie Yang:
>> >> > >>>>>> [JIRA <https://issues.apache.org/jira/browse/SPARK-51727>]
>> >> > >>>>>> [Doc <https://docs.google.com/document/d/1PsSTngFuRVEOvUGzp_25CQL1yfzFHFr02XdMfQ7jOM4/edit?tab=t.0>].
>> >> > >>>>>>
>> >> > >>>>>> The SPIP proposes extending Spark's lazy, declarative execution model
>> >> > >>>>>> beyond single queries, to pipelines that keep multiple datasets up to date.
>> >> > >>>>>> It introduces the ability to compose multiple transformations into a single
>> >> > >>>>>> declarative dataflow graph.
>> >> > >>>>>>
>> >> > >>>>>> Declarative pipelines aim to simplify the development and management
>> >> > >>>>>> of data pipelines, by removing the need for manual orchestration of
>> >> > >>>>>> dependencies and making it possible to catch many errors before any
>> >> > >>>>>> execution steps are launched.
>> >> > >>>>>>
>> >> > >>>>>> Declarative pipelines can include both batch and streaming
>> >> > >>>>>> computations, leveraging Structured Streaming for stream processing and new
>> >> > >>>>>> materialized view syntax for batch processing. Tight integration with Spark
>> >> > >>>>>> SQL's analyzer enables deeper analysis and earlier error detection than is
>> >> > >>>>>> achievable with more generic frameworks.
>> >> > >>>>>>
>> >> > >>>>>> Let us know what you think!
>> >> > >>>>>>
>> >> > >>
>> >>
>> >> ---------------------------------------------------------------------
>> >> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>> >>