[ https://issues.apache.org/jira/browse/SPARK-51727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sandy Ryza updated SPARK-51727:
-------------------------------
    Description:
The SPIP proposes a new abstraction that combines multiple transformations into a single declarative dataflow graph, to simplify the development and management of data pipelines.

The approach extends Spark's lazy, declarative execution model beyond single queries, to pipelines that keep multiple datasets up to date. This reduces cognitive overhead and the need for manual orchestration of dependencies.

Declarative pipelines can include both batch and streaming computations, leveraging Spark Streaming for stream processing and new materialized view syntax for batch processing.

SPIP doc: [https://docs.google.com/document/d/1PsSTngFuRVEOvUGzp_25CQL1yfzFHFr02XdMfQ7jOM4/edit?tab=t.0]

  was:
The SPIP proposes a new abstraction that combines multiple transformations into a single declarative dataflow graph, to simplify the development and management of data pipelines.

The approach extends Spark's lazy, declarative execution model beyond single queries, to pipelines that keep multiple datasets up to date. This reduces cognitive overhead and manual orchestration of dependencies.

Declarative pipelines can include both batch and streaming computations, leveraging Spark Streaming for stream processing and new materialized view syntax for batch processing.

SPIP doc: https://docs.google.com/document/d/1PsSTngFuRVEOvUGzp_25CQL1yfzFHFr02XdMfQ7jOM4/edit?tab=t.0

> SPIP: Declarative Pipelines
> ---------------------------
>
>                 Key: SPARK-51727
>                 URL: https://issues.apache.org/jira/browse/SPARK-51727
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 4.1.0
>            Reporter: Sandy Ryza
>            Priority: Major
>
> The SPIP proposes a new abstraction that combines multiple transformations
> into a single declarative dataflow graph, to simplify the development and
> management of data pipelines.
>
> The approach extends Spark's lazy, declarative execution model beyond single
> queries, to pipelines that keep multiple datasets up to date. This reduces
> cognitive overhead and the need for manual orchestration of dependencies.
>
> Declarative pipelines can include both batch and streaming computations,
> leveraging Spark Streaming for stream processing and new materialized view
> syntax for batch processing.
>
> SPIP doc:
> [https://docs.google.com/document/d/1PsSTngFuRVEOvUGzp_25CQL1yfzFHFr02XdMfQ7jOM4/edit?tab=t.0]

--
This message was sent by Atlassian Jira (v8.20.10#820010)