[jira] [Updated] (SPARK-51727) SPIP: Declarative Pipelines

Sandy Ryza (Jira) Sat, 05 Apr 2025 11:58:04 -0700


     [ https://issues.apache.org/jira/browse/SPARK-51727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Sandy Ryza updated SPARK-51727:
-------------------------------
    Description: 
The SPIP proposes a new abstraction that combines multiple transformations into 
a single declarative dataflow graph, to simplify the development and management 
of data pipelines. 
 
The approach extends Spark's lazy, declarative execution model beyond single 
queries, to pipelines that keep multiple datasets up to date. This reduces 
cognitive overhead and the need for manual orchestration of dependencies.
 
Declarative pipelines can include both batch and streaming computations, 
leveraging Spark Streaming for stream processing and new materialized view 
syntax for batch processing.
 
SPIP doc: 
[https://docs.google.com/document/d/1PsSTngFuRVEOvUGzp_25CQL1yfzFHFr02XdMfQ7jOM4/edit?tab=t.0]

  was:
The SPIP proposes a new abstraction that combines multiple transformations into 
a single declarative dataflow graph, to simplify the development and management 
of data pipelines. 
 
The approach extends Spark's lazy, declarative execution model beyond single 
queries, to pipelines that keep multiple datasets up to date. This reduces 
cognitive overhead and manual orchestration of dependencies.
 
Declarative pipelines can include both batch and streaming computations, 
leveraging Spark Streaming for stream processing and new materialized view 
syntax for batch processing.
 
SPIP doc: 
https://docs.google.com/document/d/1PsSTngFuRVEOvUGzp_25CQL1yfzFHFr02XdMfQ7jOM4/edit?tab=t.0


> SPIP: Declarative Pipelines
> ---------------------------
>
>                 Key: SPARK-51727
>                 URL: https://issues.apache.org/jira/browse/SPARK-51727
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 4.1.0
>            Reporter: Sandy Ryza
>            Priority: Major
>
> The SPIP proposes a new abstraction that combines multiple transformations 
> into a single declarative dataflow graph, to simplify the development and 
> management of data pipelines. 
>  
> The approach extends Spark's lazy, declarative execution model beyond single 
> queries, to pipelines that keep multiple datasets up to date. This reduces 
> cognitive overhead and the need for manual orchestration of dependencies.
>  
> Declarative pipelines can include both batch and streaming computations, 
> leveraging Spark Streaming for stream processing and new materialized view 
> syntax for batch processing.
>  
> SPIP doc: 
> [https://docs.google.com/document/d/1PsSTngFuRVEOvUGzp_25CQL1yfzFHFr02XdMfQ7jOM4/edit?tab=t.0]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

