[jira] [Updated] (FLINK-37227) Support Reuse Multiple Table Sinks in Planner

xiangyu feng (Jira) Sun, 26 Jan 2025 00:37:09 -0800


     [ https://issues.apache.org/jira/browse/FLINK-37227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


xiangyu feng updated FLINK-37227:
---------------------------------
    Description: 
When users partially update a downstream table from multiple source tables, 
they usually write multiple `insert into ...` statements in SQL, which 
generates multiple datastreams in the running job. For downstream storage such 
as data lakes, multiple writers ingesting data at the same time cause 
concurrency issues. To avoid this, users may have to manually union all the 
datastreams in Flink SQL, which is hard to use and maintain.

It would be better if the Flink planner could reuse sink nodes across multiple 
table sinks. This would be a great usability improvement for users of 
partial-update features with data lake storage such as Paimon.
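
For illustration, the two patterns described above might look like the 
following in Flink SQL. This is only a sketch: the table and column names 
(`wide_sink`, `orders_src`, `payments_src`, `status`) are hypothetical, and it 
assumes a Paimon catalog is in use and that both source tables already exist.

```sql
-- Hypothetical target table; assumes the current catalog is a Paimon catalog.
CREATE TABLE wide_sink (
  order_id     BIGINT,
  order_status STRING,
  pay_status   STRING,
  PRIMARY KEY (order_id) NOT ENFORCED
) WITH (
  'merge-engine' = 'partial-update'
);

-- Pattern 1: one INSERT per source table. As described in this issue,
-- each statement currently gets its own sink, so the job ends up with
-- multiple writers on wide_sink.
EXECUTE STATEMENT SET
BEGIN
  INSERT INTO wide_sink (order_id, order_status)
    SELECT order_id, status FROM orders_src;
  INSERT INTO wide_sink (order_id, pay_status)
    SELECT order_id, status FROM payments_src;
END;

-- Pattern 2: the manual workaround. Every branch is padded with NULLs to
-- the full sink schema and unioned, so only one sink writes to wide_sink.
INSERT INTO wide_sink
SELECT order_id, status, CAST(NULL AS STRING) FROM orders_src
UNION ALL
SELECT order_id, CAST(NULL AS STRING), status FROM payments_src;
```

With sink reuse in the planner, the intent would be for Pattern 1 to behave 
like Pattern 2 without the manual NULL padding.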

  was:
When users partially update a downstream table from multiple source tables, 
they have to submit a job with multiple datastreams. For downstream storage 
such as data lakes, multiple writers ingesting data at the same time cause 
concurrency issues. In this case, users may have to manually union all the 
datastreams in Flink SQL, which is hard to use.

It would be better if the Flink planner could reuse sink nodes across multiple 
datastreams. This would be a great usability improvement for users of 
partial-update features.


> Support Reuse Multiple Table Sinks in Planner
> ---------------------------------------------
>
>                 Key: FLINK-37227
>                 URL: https://issues.apache.org/jira/browse/FLINK-37227
>             Project: Flink
>          Issue Type: Improvement
>          Components: Table SQL / Planner
>            Reporter: xiangyu feng
>            Priority: Major
>
> When users partially update a downstream table from multiple source tables, 
> they usually write multiple `insert into ...` statements in SQL, which 
> generates multiple datastreams in the running job. For downstream storage 
> such as data lakes, multiple writers ingesting data at the same time cause 
> concurrency issues. To avoid this, users may have to manually union all the 
> datastreams in Flink SQL, which is hard to use and maintain.
> It would be better if the Flink planner could reuse sink nodes across 
> multiple table sinks. This would be a great usability improvement for users 
> of partial-update features with data lake storage such as Paimon.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
