+1 Kent Yao
Sem <ssinche...@apache.org> wrote on Wed, Apr 9, 2025 at 14:08:
> +1 (non-binding)
>
> On April 9, 2025 7:29:40 AM GMT+02:00, Rishab Joshi <rishab99...@gmail.com> wrote:
>
>> +1 Exciting.
>> Rishab Joshi
>>
>> On Tue, Apr 8, 2025, 10:04 PM Ruifeng Zheng <ruife...@apache.org> wrote:
>>
>>> +1
>>>
>>> On Wed, Apr 9, 2025 at 12:57 PM Denny Lee <denny.g....@gmail.com> wrote:
>>>
>>>> +1 (non-binding)
>>>>
>>>> On Tue, Apr 8, 2025 at 9:53 PM Yuming Wang <yumw...@apache.org> wrote:
>>>>
>>>>> +1
>>>>>
>>>>> On Wed, Apr 9, 2025 at 10:47 AM Jungtaek Lim <kabhwan.opensou...@gmail.com> wrote:
>>>>>
>>>>>> +1 looking forward to seeing this make progress!
>>>>>>
>>>>>> On Wed, Apr 9, 2025 at 11:32 AM Yang Jie <yangji...@apache.org> wrote:
>>>>>>
>>>>>>> +1
>>>>>>>
>>>>>>> On 2025/04/09 01:07:57 Hyukjin Kwon wrote:
>>>>>>> > +1
>>>>>>> >
>>>>>>> > I am actually pretty excited to have this. Happy to see this being proposed.
>>>>>>> >
>>>>>>> > On Wed, 9 Apr 2025 at 01:55, Chao Sun <sunc...@apache.org> wrote:
>>>>>>> >
>>>>>>> > > +1. Super excited about this effort!
>>>>>>> > >
>>>>>>> > > On Tue, Apr 8, 2025 at 9:47 AM huaxin gao <huaxin.ga...@gmail.com> wrote:
>>>>>>> > >
>>>>>>> > >> +1 I support this SPIP because it simplifies data pipeline management and enhances error detection.
>>>>>>> > >>
>>>>>>> > >> On Tue, Apr 8, 2025 at 9:33 AM Dilip Biswal <dkbis...@gmail.com> wrote:
>>>>>>> > >>
>>>>>>> > >>> Excited to see this heading toward open source: materialized views and other features will bring a lot of value.
>>>>>>> > >>> +1 (non-binding)
>>>>>>> > >>>
>>>>>>> > >>> On Mon, Apr 7, 2025 at 10:37 AM Sandy Ryza <sa...@apache.org> wrote:
>>>>>>> > >>>
>>>>>>> > >>>> Hi Khalid – the CLI in the current proposal will need to be built on top of internal APIs for constructing and launching pipeline executions.
>>>>>>> > >>>> We'll have the option to expose these in the future.
>>>>>>> > >>>>
>>>>>>> > >>>> It would be worthwhile to understand the use cases in more depth before exposing these, because APIs are one-way doors and can be costly to maintain.
>>>>>>> > >>>>
>>>>>>> > >>>> On Sat, Apr 5, 2025 at 11:59 PM Khalid Mammadov <khalidmammad...@gmail.com> wrote:
>>>>>>> > >>>>
>>>>>>> > >>>>> Looks great!
>>>>>>> > >>>>> QQ: will users be able to run this pipeline from normal code? I.e., can I trigger a pipeline from *driver* code based on some condition etc., or must it be executed via a separate shell command?
>>>>>>> > >>>>> As background, Databricks imposes a similar limitation: for some reason you cannot run normal Spark code and DLT on the same cluster, which forces the use of two clusters and increases cost and latency.
>>>>>>> > >>>>>
>>>>>>> > >>>>> On Sat, 5 Apr 2025 at 23:03, Sandy Ryza <sa...@apache.org> wrote:
>>>>>>> > >>>>>
>>>>>>> > >>>>>> Hi all – starting a discussion thread for a SPIP that I've been working on with Chao Sun, Kent Yao, Yuming Wang, and Jie Yang: [JIRA <https://issues.apache.org/jira/browse/SPARK-51727>] [Doc <https://docs.google.com/document/d/1PsSTngFuRVEOvUGzp_25CQL1yfzFHFr02XdMfQ7jOM4/edit?tab=t.0>].
>>>>>>> > >>>>>>
>>>>>>> > >>>>>> The SPIP proposes extending Spark's lazy, declarative execution model beyond single queries, to pipelines that keep multiple datasets up to date. It introduces the ability to compose multiple transformations into a single declarative dataflow graph.
>>>>>>> > >>>>>>
>>>>>>> > >>>>>> Declarative pipelines aim to simplify the development and management of data pipelines, by removing the need for manual orchestration of dependencies and making it possible to catch many errors before any execution steps are launched.
>>>>>>> > >>>>>>
>>>>>>> > >>>>>> Declarative pipelines can include both batch and streaming computations, leveraging Structured Streaming for stream processing and new materialized view syntax for batch processing. Tight integration with Spark SQL's analyzer enables deeper analysis and earlier error detection than is achievable with more generic frameworks.
>>>>>>> > >>>>>>
>>>>>>> > >>>>>> Let us know what you think!