+1

On Wed, Apr 9, 2025 at 12:57 PM Denny Lee <denny.g....@gmail.com> wrote:
> +1 (non-binding)
>
> On Tue, Apr 8, 2025 at 9:53 PM Yuming Wang <yumw...@apache.org> wrote:
>
>> +1
>>
>> On Wed, Apr 9, 2025 at 10:47 AM Jungtaek Lim <kabhwan.opensou...@gmail.com> wrote:
>>
>>> +1 looking forward to seeing this make progress!
>>>
>>> On Wed, Apr 9, 2025 at 11:32 AM Yang Jie <yangji...@apache.org> wrote:
>>>
>>>> +1
>>>>
>>>> On 2025/04/09 01:07:57 Hyukjin Kwon wrote:
>>>> > +1
>>>> >
>>>> > I am actually pretty excited to have this. Happy to see this being
>>>> > proposed.
>>>> >
>>>> > On Wed, 9 Apr 2025 at 01:55, Chao Sun <sunc...@apache.org> wrote:
>>>> >
>>>> > > +1. Super excited about this effort!
>>>> > >
>>>> > > On Tue, Apr 8, 2025 at 9:47 AM huaxin gao <huaxin.ga...@gmail.com> wrote:
>>>> > >
>>>> > >> +1 I support this SPIP because it simplifies data pipeline
>>>> > >> management and enhances error detection.
>>>> > >>
>>>> > >> On Tue, Apr 8, 2025 at 9:33 AM Dilip Biswal <dkbis...@gmail.com> wrote:
>>>> > >>
>>>> > >>> Excited to see this heading toward open source; materialized
>>>> > >>> views and other features will bring a lot of value.
>>>> > >>> +1 (non-binding)
>>>> > >>>
>>>> > >>> On Mon, Apr 7, 2025 at 10:37 AM Sandy Ryza <sa...@apache.org> wrote:
>>>> > >>>
>>>> > >>>> Hi Khalid – the CLI in the current proposal will need to be
>>>> > >>>> built on top of internal APIs for constructing and launching
>>>> > >>>> pipeline executions. We'll have the option to expose these in
>>>> > >>>> the future.
>>>> > >>>>
>>>> > >>>> It would be worthwhile to understand the use cases in more
>>>> > >>>> depth before exposing these, because APIs are one-way doors
>>>> > >>>> and can be costly to maintain.
>>>> > >>>>
>>>> > >>>> On Sat, Apr 5, 2025 at 11:59 PM Khalid Mammadov <khalidmammad...@gmail.com> wrote:
>>>> > >>>>
>>>> > >>>>> Looks great!
>>>> > >>>>> QQ: will users be able to run this pipeline from normal
>>>> > >>>>> code? I.e., can I trigger a pipeline from *driver* code
>>>> > >>>>> based on some condition etc., or must it be executed via a
>>>> > >>>>> separate shell command?
>>>> > >>>>> As background, Databricks imposes a similar limitation,
>>>> > >>>>> whereby you cannot run normal Spark code and DLT on the same
>>>> > >>>>> cluster for some reason, which forces you to use two
>>>> > >>>>> clusters, increasing cost and latency.
>>>> > >>>>>
>>>> > >>>>> On Sat, 5 Apr 2025 at 23:03, Sandy Ryza <sa...@apache.org> wrote:
>>>> > >>>>>
>>>> > >>>>>> Hi all – starting a discussion thread for a SPIP that I've
>>>> > >>>>>> been working on with Chao Sun, Kent Yao, Yuming Wang, and
>>>> > >>>>>> Jie Yang:
>>>> > >>>>>> [JIRA <https://issues.apache.org/jira/browse/SPARK-51727>]
>>>> > >>>>>> [Doc <https://docs.google.com/document/d/1PsSTngFuRVEOvUGzp_25CQL1yfzFHFr02XdMfQ7jOM4/edit?tab=t.0>].
>>>> > >>>>>>
>>>> > >>>>>> The SPIP proposes extending Spark's lazy, declarative
>>>> > >>>>>> execution model beyond single queries, to pipelines that
>>>> > >>>>>> keep multiple datasets up to date. It introduces the
>>>> > >>>>>> ability to compose multiple transformations into a single
>>>> > >>>>>> declarative dataflow graph.
>>>> > >>>>>>
>>>> > >>>>>> Declarative pipelines aim to simplify the development and
>>>> > >>>>>> management of data pipelines, by removing the need for
>>>> > >>>>>> manual orchestration of dependencies and making it possible
>>>> > >>>>>> to catch many errors before any execution steps are
>>>> > >>>>>> launched.
>>>> > >>>>>>
>>>> > >>>>>> Declarative pipelines can include both batch and streaming
>>>> > >>>>>> computations, leveraging Structured Streaming for stream
>>>> > >>>>>> processing and new materialized view syntax for batch
>>>> > >>>>>> processing. Tight integration with Spark SQL's analyzer
>>>> > >>>>>> enables deeper analysis and earlier error detection than is
>>>> > >>>>>> achievable with more generic frameworks.
>>>> > >>>>>>
>>>> > >>>>>> Let us know what you think!
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org